Factors Which Influence in Many Core Processors

ABSTRACT: Multi-core and many-core microprocessors such as RISC-V designs are valued for being friendlier to work with than earlier chips. This has created strong demand for them in parallel computing, mainly for fluid simulation of the Earth's atmosphere, among other applications. This research reviews the work of various authors to determine the factors that influence high-performance systems. It focuses on two factors that affect data in shared-memory computer systems (multi-core and many-core architectures): topology and memory consistency.

[Figure: Factors which influence the performance. Labels include algorithms, topology, operating system, architecture, taxonomy, programming model, programming languages, heterogeneous and homogeneous manycore, memory technology, and cache memory.]

Software-Defined Error-Correcting Codes
Errors in memory often result in system-level crashes. Current error-correction techniques are costly and are oblivious to the underlying data stored in memory. SDECC pushes beyond current error-correction capabilities by combining three layers:
• System-level fault tolerance
• Error-correcting codes
• Side-information about data and instructions in memory → RISC-V!

WASP-SC
Austin Harris, Rohith Prakash
The University of Texas at Austin, SPARK Lab
• Goal: defend against utilization side-channels
• E.g. shared memory controllers, hardware accelerators
• Normalization (e.g. partitioning, worst-case) infeasible
• Solution: shape victim's utilization to be statistically indistinguishable across different inputs
• Optimally minimizes slowdown within provably configurable privacy bounds
• Modify Rocket to have cores sharing SHA3 accelerator
• Send commands through queue with our traffic shaping defense

V603 RV64GCP RISC-V MCU
[Block diagram: System in a Package (SIP) with ports P0 to P20. Recoverable labels: crystal and RC oscillators, PLL, clock, reset, regulator; 64b RISC-V CPU x 2 (asymmetric, lockstep, tightly coupled); crypto and SIMD units; RAM and ROM with ECC; OTP; DMA (24CH); ICE/debugger interface (TDI, TDO, TCK, TMS, TRST_L, DBREQ_L, DBRDY); FPGA interface; Physical Memory Protection (PMP); Platform Level Interrupt Controller (PLIC); trace info; security logic; flash and external quad SPI serial flash interface (SCLK, CS#, IO0, IO1, WP#, HOLD#; 80-104MHz; uniform sector); peripheral interconnect; I2C (Fast: 400kbps, Fast+: 1Mbps, HS: 3.4Mbps); CAN with FD extension; 12bit ADC; PWM; GPIO; supplies VDD = 3.3V, VSS = 0V, VDDA = 1.1V~3.3V, VSSA = 0V, VDDM/VSSM (1.8V / 3.3V).]

Sub-microsecond Adaptive Voltage Scaling in a 28nm RISC-V SoC
Demo: Running user-mode programs in Linux on RISC-V silicon to demonstrate integrated power management
[Block diagram: voltage and clock generation, integrated power management, SRAM BIST, and measurement blocks around a core (1.07 mm2) containing a Rocket processor and vector accelerator. Recoverable labels: Z-scale PMU with 8KB scratchpad; power management algorithm (compiled from C/C++) loaded into scratchpad memory and executed there; 48 switched-capacitor DC-DC unit cells with DC-DC controller and toggle FSM; adaptive clock generator; back-bias generator (NWELL, PWELL); 1GHz reference; 1.0V and 1.8V supplies; 16KB scalar instruction cache, 32KB shared data cache, and 8KB vector instruction cache (custom 8T SRAM macros); 16KB vector RF using eight custom 8T SRAM macros; functional units (64-bit Int. Mul., SP/DP FMA); async FIFOs and level shifters between domains; uncore to/from off-chip FPGA FSB and DRAM; digital IO pads to wire-bonded chip-on-board.]
Ben Keller • 5th RISC-V Workshop • November 29, 2016

RV128: The Path to Embedded Exascale
[Figure courtesy of Kogge et al., 2008]

UNNI: An open source core for easy transition from ARM Cortex-M0 to RISC-V
Mikael Korpi, OKiM Technologies
• Von Neumann architecture with 2-stage pipeline
• Optimized for ASIC implementation
• Written in SystemVerilog
• Low latency interrupt handling with tail-chaining and pre-emption

Full-Featured RISC-V Debug Solution
Tim Newsome
• It works in silicon!
• Download directly to flash
• Implementations: SiFive CorePlex in silicon, IQ-Analog NanoRisc5 on FPGA, Rocket Chip implementation
• gdb and OpenOCD code
• Black box testsuite
• More Information: Debug list at https://riscv.org/mailing-lists/
Demo Setup:
• Open Source 32x16 LED Display
[Diagram: laptop running gdb and OpenOCD, connected over USB to an FT2232HL chip, then over JTAG to a SiFive board with a SiFive E300U Coreplex (RV32ACFIM) driving GPIO.]

Syntacore RISC-V cores demos
Alexander Redkin
5th RISC-V workshop, Nov 29-30 2016
www.syntacore.com

Syntacore introduction
IP company
1. Develops and licenses energy-efficient programmable cores
• With RISC-V ISA
2. Full service to specialize these for the customer needs
• Workload analysis/characterization
• Workload-specific customization, with tools/compiler support
• IP hardening at the required library node
• SoC integration and SW migration support

Baseline SCRx cores
SCRx: a family of state-of-the-art RISC-V compatible synthesizable processor cores
• SCR1: RV32IC[EM]
• SCR3: RV32IMC[E] <= demo
• SCR4: RV32IMCF[D]
• SCR5: RV[32|64]IM[A]CFD <= demo
Stable, configurable, available for evaluation
Baseline: every core can be extended/customized

ASIP Designer - Automating ASIP Design
Architecture Definition, Optimization, and Implementation
• ASIP Designer creates full SDK
– Compiler-in-the-loop optimization
• Process starts with pre-existing example models
– RISC-V starting point
• ASIP Designer generates synthesizable RTL
– Performance/Power/Area
• Analysis seeds refinement/optimization
– ASIP model is refined
– SDK is automatically adapted
– All elements stay in-sync
[Flow diagram: user-defined algorithms (C) and a user-defined architecture (nML processor model, here a RISC-V model) pass through three stages: 1 SDK Generation (optimizing C compiler, assembler, linker, instruction set simulator, debugger, profiler), 2 Architectural Optimization and Software Development, 3 HW Generation (RTL generator, synthesizable VHDL/Verilog RTL, RTL simulator, RTL synthesizer), targeting ASIC or FPGA.]
© 2016 Synopsys, Inc.

SHAVE: Software/Hardware Assurance Verified End-to-End
• Create practical, end-to-end assurance cases for mission critical software/hardware systems that run on COTS hardware.
• Case study: implement a crypto extension to RISC-V (like AES-NI), build a thin firmware layer and small application on top, and create an assurance case.
Sponsored by the Air Force Research Laboratory (AFRL) and developed with funding from the Defense Advanced Research Projects Agency (DARPA) under contract number FA8650-16-C-7665. Any views, opinions, findings, conclusions and/or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Air Force, the Department of Defense or the U.S. Government.

LibreChainEDA
Open Source IC Design Tool Flow Scripts
• Complete FOSS from concept to FPGA
• 100% Open Source EDA Tools
• IP-XACT based tool flows
• Design for Reuse best practices
• One stop shopping for installation
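
The abstract names memory consistency as one of the two factors that affect data in shared-memory systems. As a minimal sketch that is not taken from the source, the Python below enumerates every interleaving of the classic store-buffering litmus test under sequential consistency. The model choice, the thread programs, the variable names (x, y, r0, r1), and the helper functions are all assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical two-thread litmus test (store buffering):
#   Thread 0: x = 1; r0 = y        Thread 1: y = 1; r1 = x
# Each step is (operation, shared variable, value-to-store or register name).
THREADS = [
    [("store", "x", 1), ("load", "y", "r0")],
    [("store", "y", 1), ("load", "x", "r1")],
]


def interleavings(threads):
    """Yield every global order of thread ids that keeps each thread's program order."""
    slots = [tid for tid, ops in enumerate(threads) for _ in ops]
    # A sequence of thread ids fixes which thread steps next; each thread's
    # own operations are always consumed in program order by run().
    yield from set(permutations(slots))


def run(order, threads):
    """Execute one interleaving against a single shared memory and return (r0, r1)."""
    memory = {"x": 0, "y": 0}
    regs = {}
    pc = [0] * len(threads)  # next operation index per thread
    for tid in order:
        op, var, arg = threads[tid][pc[tid]]
        pc[tid] += 1
        if op == "store":
            memory[var] = arg
        else:  # load into the named register
            regs[arg] = memory[var]
    return (regs["r0"], regs["r1"])


def sc_outcomes(threads):
    """Collect every (r0, r1) result reachable under sequential consistency."""
    return {run(order, threads) for order in interleavings(threads)}


if __name__ == "__main__":
    print(sorted(sc_outcomes(THREADS)))  # prints [(0, 1), (1, 0), (1, 1)]
```

Under sequential consistency the outcome r0 = 0, r1 = 0 never appears. Weaker models that let a store wait in a store buffer while a later load executes, such as x86-TSO or RISC-V's RVWMO without fences, can allow it, which is why the consistency model matters to shared-memory programs.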

Increasing Memory Miss Tolerance for SIMD Cores

Tousimojarad, Ashkan (2016) GPRM: a High Performance Programming Framework for Manycore Processors. PhD Thesis

Multi-Core Processors and Systems: State-Of-The-Art and Study of Performance Increase

Understanding and Guiding the Computing Resource Management in a Runtime Stacking Context

Consolidating High-Integrity, High-Performance, and Cyber-Security Functions on a Manycore Processor

Parallel Processing with the MPPA Manycore Processor

A Scalable Manycore Simulator for the Epiphany Architecture

Mapping of Large Task Network on Manycore Architecture Karl-Eduard Berger

Power and Energy Characterization of an Open Source 25-Core Manycore Processor

MALT Is a Truly Manycore Processor

Enhancing Programmability in NoC-Based Lightweight Manycore Processors with a Portable MPI Library

Parallel Programming Model for the Epiphany Many-Core Coprocessor Using Threaded MPI James A