Programmers' Tool Chain


Reduce the complexity of programming multicore

Offload™ for PlayStation®3 | Offload™ for Cell Broadband Engine™ | Offload™ for Embedded | Custom C and C++ Compilers | Custom Shader Language Compiler

www.codeplay.com

It’s a risk to underestimate the complexity of programming multicore applications

Software developers are now presented with a rapidly-growing range of different multi-core processors. The common feature of many of these processors is that they are difficult and error-prone to program with existing tools, give very unpredictable performance, and rely on incompatible, complex programming models.

Codeplay develop compilers and programming tools with one primary goal: to make it easy for programmers to achieve big performance boosts with multi-core processors, without needing bigger, specially-trained, expensive development teams to get there.

Introducing Codeplay

Based in Edinburgh, Scotland, Codeplay Software Limited was founded by veteran games developer Andrew Richards in 2002 with funding from Jez San (the founder of Argonaut Games and ARC International). Codeplay introduced their first product, VectorC, a highly optimizing compiler for x86 PC and PlayStation®2, in 2003. In 2004 Codeplay further developed their business by offering services to processor developers to provide them with compilers and programming tools for their new and unique architectures, using VectorC’s highly retargetable compiler technology. Realising the need for new multicore tools, Codeplay started the development of the company’s latest product, the Offload™ C++ Multicore Programming Platform. In October 2009 Offload™: Community Edition was released as a free-to-use tool for PlayStation®3 programmers.

Experience and Expertise

Codeplay have developed compilers and software optimization technology since 1999. We have provided our technology to companies such as Qualcomm, Ageia, Movidius and further undisclosed customers.
We have built a large test and QA infrastructure allowing rapid testing of our products on a wide range of source code, operating systems and processors.

We are Licensed Sony Middleware providers.

Expertise in compiler design: C/C++/shader front-end, optimizations, efficient code generation, enabling general purpose C/C++ programming on specialized GPU-like processors.

Expertise on compiler-friendly processor architectures.

Products and Services at a Glance

Take your C++ programs to multicore. Offload lets you easily offload code to additional processor cores. It’s non-disruptive and simple, allowing non-specialist programmers to get the most out of their code.

Offload™ for PlayStation®3

Easily offload the most complex games code on one or more SPUs. Offload for PlayStation®3 helps you make incredible games for PlayStation®3, without the high development costs and long hours.

Offload™ for Cell Broadband Engine™

Offload provides an effective method for offloading the parts of your code you want to execute on the SPEs of the Cell Broadband Engine™. Unlock the power of the Cell processor and maximize the performance of your applications.

Offload™ for Embedded

The Offload programming model can bring easy-to-use C++ multi-core programming capabilities to the world of embedded processors.

Multicore Services

Codeplay’s expert programmers can help you port your software to multi-core. We’ve been doing this for years.

VectorC Retargetable Compiler Technology

Need a compiler for your multi-core processor? Codeplay build optimizing C/C++ and shader language compilers for unusual and specialized architectures using our VectorC compiler engine.

VectorC Shader Language Compiler

Retargetable Shader Language compilers, for mobile and embedded GPUs that require low-memory footprints.
Offload™

Codeplay’s Offload™ is a powerful tool suite combined with a simple programming model to easily and quickly offload large sections of a complex software application onto different processor cores on a multicore CPU. Offload™ lets the compiler do the hard and tedious work of offloading the code so that the programmer can concentrate on the application.

Offload requires very little modification to the original source code. It offloads a section of code from a normal CPU to an accelerator processor.

    int ppu_function () {
      int x; // x is now in shared memory
      __offload {
        int y; // y is now in local memory
        y = f (&x, &y);
        /* 'f' is duplicated and called with
           a shared-memory-pointer and
           a local-memory-pointer */
      }
    }

Offload™ is based on Codeplay’s Sieve™ System, an award-winning system for taking existing C++ software and transforming it so that it can be compiled with multiple different C compilers and distributed across multiple homogeneous or heterogeneous processor cores.

At the heart of Offload™ lies Codeplay’s patented Call-Graph Duplication technology. The Offload™ tool automatically duplicates functions in the Call-Graph that are executed on different types of processors, and intelligently adapts each function so that the data storage and movement is handled correctly for the targeted processor. This removes the need to write the same function differently for the features of each particular processor core, saving the programmer a lot of time and reducing the number of problems that are likely to arise.

Offload™ enables an incremental and non-disruptive migration of code to multicore. Just port your application in small, manageable steps and instantly see the result. This makes Offload™ ideal for porting large legacy codebases. Your application stays written in standard C++ and can be compiled by standard C++ compilers.
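The Call-Graph Duplication idea described above can be illustrated, very roughly, in standard C++. The sketch below is not how the Offload™ tool is implemented: Offload™ performs the duplication automatically inside the compiler, whereas here templates stand in for it. The names `SharedMem`, `LocalMem`, `SpacePtr`, `load`, `shared_reads` and `offloaded_section` are all hypothetical, and the "shared memory" read is an ordinary load with a counter where a real accelerator would perform a data transfer.

```cpp
#include <cassert>

// Hypothetical tags for the two memory spaces named in the brochure example.
struct SharedMem {};
struct LocalMem {};

// A pointer that carries its memory space in its type (illustration only).
template <typename T, typename Space>
struct SpacePtr {
    T* raw;
};

// Counts reads that would need a transfer from shared memory on real hardware.
int shared_reads = 0;

// Reading through a shared-memory pointer: modelled as a counted plain load.
template <typename T>
T load(SpacePtr<T, SharedMem> p) {
    assert(p.raw != nullptr);
    ++shared_reads;
    return *p.raw;
}

// Reading through a local-memory pointer: a direct load.
template <typename T>
T load(SpacePtr<T, LocalMem> p) {
    assert(p.raw != nullptr);
    return *p.raw;
}

// 'f' is written once. The compiler generates one copy per combination of
// memory spaces it is called with, which mimics the duplication step.
template <typename SpaceA, typename SpaceB>
int f(SpacePtr<int, SpaceA> a, SpacePtr<int, SpaceB> b) {
    return load(a) + load(b);
}

// Stand-in for the body of an offloaded block: x is "shared", y is "local".
int offloaded_section(int& x) {
    int y = 5;
    SpacePtr<int, SharedMem> px{&x};
    SpacePtr<int, LocalMem> py{&y};
    y = f(px, py);  // instantiates f<SharedMem, LocalMem>
    return y;
}
```

Calling `offloaded_section` with `x` set to 37 returns 42 and increments `shared_reads` once, because only the first argument of `f` goes through the shared-memory path. The point of the sketch is the design choice: the body of `f` is written a single time, and the per-memory-space differences live in the small `load` overloads rather than in hand-written variants of `f`.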
With Offload™ your applications can be fully multi-core capable without the disruption, extra development time and inflated development costs you may otherwise encounter. For many applications, simply by offloading a section of code from the host processor to an accelerator processor using Codeplay Offload™, a user can gain a 2.5x performance boost for very little work.

[Figure: Offload™ Debugger]

Offload™ for PlayStation®3

Offload™ reduces the time, cost and stress of writing a top quality game for PlayStation®3.

Codeplay developed Offload™ for PlayStation®3 with input from some of the world’s leading games developers. Building on our early research with the Sieve™ System, we progressively implemented functionality to provide what games developers want and can use effectively in the real development of a PlayStation®3 game.

[Figure: Offload™ accelerating a cloth simulation]

Offload™: Community edition for Licensed PlayStation®3 Developers is available now. Visit http://offload.codeplay.com to register and try Offload™ today – no charge!

[Figure: How Offload™ fits into the Toolchain. Components shown under "Programmers’ Tool Chain": Offload C++ Source, C/C++ Source, Offload Tool, GCC/SNC/MSVC, Offload C++ Runtime Libraries, Linker, Debugger (e.g. ProDG, Visual Studio)]

Offload™ for Cell Broadband Engine™ Powered Devices

The Cell Broadband Engine™ has found its way into many high-performance computing devices and is expected to appear in many other devices in the future, including consumer electronics. Offload™ can help you unlock the full power of the Cell Broadband Engine™ on whatever device it may be powering, without needing to increase your budget.

Offload™ is available for Cell Broadband Engine™ powered devices running under Linux, 2009.
To find out more about Linux Offload™ for the Cell Broadband Engine™ and track its availability, visit http://offload.codeplay.com

Offload™ for Embedded

The Offload™ programming model is suitable for all heterogeneous and homogeneous multicore architectures. From powerful super-computing processors to low-cost, low-power mobile chipsets, Offload™ can help programmers easily write software applications for multi-core architectures.

• Achieve full multi-core performance without any extra software development costs
• Easily port and re-use existing software and libraries
• Have the software for your device ready in time for launch
• Ensure that 3rd party software developers have the best SDK for writing high standard applications

If you are designing or already have a multi-core chip and want to ensure it has an extensive library of software applications ready for launch, then contact Codeplay to discuss how we can adapt Offload™ and provide an Offload™ SDK suited perfectly to your chip or device.

Multicore services

Codeplay’s multi-core programming experts are available to assist you in porting your games or application to multicore. Our programmers are thoroughly experienced in porting and optimizing software to multicore with Offload™, from large open-source projects to commercial video games. Codeplay can provide you with external or on-site development assistance to suit a range of budgets.

Call us on +44 131 466 0503 or email us at [email protected] to find out what we can do for you.

[Figure: Offload™ in Microsoft Visual Studio 8]

[Figure: Offload™ in Eclipse CDT]

VectorC Retargetable Compiler Technology

Codeplay’s VectorC compiler platform is a highly configurable C/C++ compiler engine that can bring full (or subset) C/C++ support to your new processor architecture. VectorC is especially well suited to mobile media processors, low-power chips and high-performance accelerator boards.
VectorC technology is the result of more than 10 years of research, development and customer feedback in commercial C/C++, shader and OpenCL projects. Over the years we have adapted our optimizing compiler technology, tools and their integrations for a variety of very different