Jakob Engblom, PhD, Product Management Engineer, Simics team, Intel, Stockholm, Sweden 2019-05-22
Computer Architecture - Uppsala - 2019-05-22 4 My Background
Jakob Engblom
. Computer Science (Datavetenskap), Uppsala: D92
. PhD, Computer Systems, Uppsala
. Product Management Engineer, Intel System Simulation team, Sweden
– Previously at IAR, Virtutech, Wind River
. Intel Software Evangelist – Simulation
. https://software.intel.com/en-us/meet-the-developers/evangelists/team/jakob-engblom
. http://engbloms.se/jakob.html
What Does Intel Do?
. Laptop and desktop: Processors, Chipsets
. Server: Intel® Xeon® Phi™, Intel® Xeon®, Processors, Chipsets, Accelerators
. Storage: SSD, 3D XPoint™, Intel® Optane™
. Connectivity: Ethernet, WiFi, Bluetooth, GNSS, 5G
. FPGA: SoC-FPGA, FPGA, FPGA-CPU
. AI etc.: Movidius, Nervana, MobilEye, Xeon, FPGAs
. IoT: Processors, Gateways, Connectivity
. Software: Development tools, Compilers, Simulation solutions, Linux & Windows drivers, UEFI & BIOS
What I do: Product Management
[Diagram: Product Management at the center, connected to Market communications (“PR”), Engineering, Customer, Support, and Sales]
What’s In a Processor? What’s in a ”Computer”?
(Main) Processor cores
. Run user-visible OS and applications
Main memory - RAM
Graphics and display
Audio and media processing
. Camera, microphone, speakers, image processing, ...
Storage – ”Disk”
. SATA, NVMe, M.2, SCSI, PCIe, ...
Input and output
. Local devices: USB, Thunderbolt, Serial, Bluetooth, ...
. Remote: Ethernet, WiFi, ...
Once Upon a Time...
The PROCESSOR was the essential part of a system
The goodness of the machine was measured in terms like
. Megahertz
. Instructions per cycle
. Cache size
The supporting chips did some basic stuff to make the processor do its job...
A better computer meant a better processor (mostly)
2009: Intel® Core™ i7 Processor: Still a Processor
Intel® Core™ i7-960 Processor (2009)
. The processor chip is a processor with minimal connections to the rest of the system
. Cores + cache
. Memory controller – just moved on-chip!
. Intel QuickPath Interconnect (QPI) – link to the rest of the system
Source: http://hexus.net/tech/reviews/cpu/16187-intel-core-i7-x58-chipset-systems-go-fsb-invited/?page=3
2009: Intel® X58 Express Chipset
IOH (I/O Hub)
. QPI to processor (on the previous slide)
. Graphics card interface
. Connection to ICH10
ICH10 (I/O Controller Hub)
. DMI link to the IOH
. Main IO chip
. SATA, Audio, USB, PCIe, Ethernet
2018: Intel® Core™ i9 Processor: Lots of Other Stuff
Intel® Core™ i9-9900K Processor (2018):
. High-end eight-core desktop processor
On the chip:
. Graphics + media block bigger than four processor cores
– 3D graphics, display, video decode, ...
. “System Agent” - Memory controller & IO, about the size of 2 processor cores
. L3 cache (2MB per core) not very large
Source: https://en.wikichip.org/wiki/intel/core_i9/i9-9900k
2017: Intel® Z370 Chipset
Compared to Core i7-960, 7th and 8th gen processors have added:
. Integrated PCI Express (PCIe), version 3
. Integrated GPU, multiple displays, video decoding hardware, …
. Secure boot and other security functions (not shown)
. DMI 3.0 connection has about 160x the bandwidth of the QPI from 2009
Intel® Z370 chipset is a single chip (PCH, Platform Controller Hub):
. 24 Additional PCIe version 3 lanes
. USB 3 and USB 2, 14 ports total
. Storage connections: SATA, eSATA, RAID, PCIe/NVMe, Intel® Optane™ Memory, eMMC, SDXC, …
. Advanced sound processing, including onboard DSP
. Management Engine (ME)
. Programming guide is 1700+ pages long!
Additional functions added on PCIe
. Wireless module: Wifi, Bluetooth, GNSS, …
Pure CPU chip: IO and Cache Dwarf the Cores
Intel® Core™ i7-6950X Processor (2016):
. Memory controller as big as 5 cores!
– Four (4) channels of DDR4 2400!
. ”Queue, Uncore, and I/O” bigger than the 25MB L3 cache
Source: http://www.anandtech.com/show/10337/the-intel-broadwell-e-review-core-i7-6950x-6900k-6850k-and-6800k-tested-up-to-10-cores
Innovation: The Instruction Set Itself
Instruction set architectures are evolving at a quick pace
. Better instructions = many times faster computations on specific tasks
Main trends:
. Vector compute = more math per cycle
. Virtualization = faster, more efficient, more capable virtual machines
. Cryptography = crypto on CPU not on accelerator
. Security = better SW-SW protection
Recent Intel examples:
. Intel® Core™ i7-4xxx Processor:
– AVX2 vector processing
– BMI1, BMI2 bit-manipulation instructions
– https://software.intel.com/en-us/blogs/2017/01/09/resetting-the-lowest-n-set-bits
– SMAP - Supervisor Mode Access Prevention
– TSX – Transactional memory
. Intel® Core™ i7-5xxx Processor
– RDSEED – Hardware random-number seed
. Intel® Core™ i7-6xxx Processor
– SGX – Software Guard Extensions
Note that these instructions are virtually useless unless there are also supporting software libraries, SDKs, and compilers. People will not use them on their own without help.
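As a small illustration of what a bit-manipulation instruction buys, here is a sketch in Python of the operation behind the "resetting the lowest n set bits" link on this slide. The BMI1 instruction BLSR computes `x & (x - 1)`, which clears the lowest set bit in one step. The function names are made up for this example, and the code is not taken from the linked post.

```python
def reset_lowest_set_bit(x: int) -> int:
    """Clear the lowest set bit of x; this is the value BMI1 BLSR computes."""
    return x & (x - 1)


def reset_lowest_n_set_bits(x: int, n: int) -> int:
    """Clear the lowest n set bits of a non-negative integer, one per step."""
    for _ in range(n):
        if x == 0:
            break  # no set bits left to clear
        x = reset_lowest_set_bit(x)
    return x


if __name__ == "__main__":
    # Set bits at positions 1, 2, 4, 5, 7; clearing the lowest three leaves 5 and 7.
    print(bin(reset_lowest_n_set_bits(0b10110110, 3)))
```

With hardware support each step is a single instruction; without it, the compiler has to emit the subtract and the AND separately.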
Example of the effect of ISA: AVX
[Benchmark charts: one without AVX, one with AVX and AVX512]
Source: https://www.anandtech.com/show/13400/intel-9th-gen-core-i9-9900k-i7-9700k-i5-9600k-review
Example: Instruction Set Flags - CPUID
Software should check feature availability before executing new instructions
. CPUID instruction is crucial!
. Software adapts dynamically to the machine it is running on
See feature flags & instructions using Intel® Extreme Tuning Utility
. https://downloadcenter.intel.com/download/24075/Intel-Extreme-Tuning-Utility-Intel-XTU-
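The check-before-use pattern can be sketched as follows. This is a minimal example that assumes a Linux host, where the kernel exposes the CPUID-derived feature flags as a `flags` line in `/proc/cpuinfo`; the `pick_kernel` function and its return values are hypothetical names for this illustration.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()  # no flags line found (for example, not an x86 Linux host)


def pick_kernel(flags: set) -> str:
    """Choose the widest vector code path the CPU reports (hypothetical names)."""
    if "avx512f" in flags:
        return "avx512"
    if "avx2" in flags:
        return "avx2"
    return "scalar"


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; falls back to scalar elsewhere
    text = path.read_text() if path.exists() else ""
    print(pick_kernel(parse_cpu_flags(text)))
```

The point of the design is the fallback: the same binary runs everywhere and only takes the fast path when the machine reports the feature.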
Innovation Area: Networking
Ethernet speeds keep increasing
. 10GbE on Base-T: 2006
. 100GbE: 2010
. 40GbE & 25GbE
WiFi speeds keep increasing
Cellular 4G/LTE/5G speeds keep increasing
Network interfaces add intelligence
. Packet processing offload
. Integrated switches
. Virtualization – one physical interface appears as multiple virtual interfaces directly connected to virtual machines
– Intel® VT-d for VM access to hardware
– PCIe SR-IOV for multiple virtual devices in a single physical device
= more bytes going in and out than ever before!
Innovation Area: Connectors
USB Type C
. Multiple interfaces in one connector
– USB
– Thunderbolt
– HDMI
– Displayport
– Power
More flexible computer design
Small ports = thinner machines
More user friendly
m.2 connector
. Multiple interfaces in one connector
– PCIe
– SATA
– USB
– I2C, Serial, PCM, ...
More compact SSDs
Add-in other functions like modems
Summary
Innovation in computing today is really in the platform capabilities
Transistor budget being used to:
. Integrate previously separate functions onto processor die
– Memory controller, GPU, IO, …
. Add new functions to the platform without increasing number of chips
. Add new instructions to resolve software bottlenecks
Buying a better machine:
. Faster disk and interface:
– Once it was spinning disk on IDE
– Then, SSD on SATA
– Now, SSD on M.2 PCIe NVMe
. More and faster external IO
– USB 3, USB 3.1 Type C, Thunderbolt, …
. Higher display resolution, multiple displays, high dynamic range (HDR), …
. Better network connectivity
– WiFi standards, cellular standards, Bluetooth, Bluetooth Low Energy (BLE), …
What makes a system Tick? Answer: Firmware
Inside the processor, PCH, and other chips are many small programmable cores
. Any semi-complicated subsystem has a programmable core inside
The software running on these cores is called firmware
. It is not hardware
. But it is not as soft as software
. Firmware – long-standing name for close-to-hardware software
http://www.ganssle.com/book.htm Disclosure: I wrote a chapter in the book
Example: SSD
Intel® Solid-State Drive Toolbox . Looking at the Intel® 600p M.2. PCIe NVMe drive in one of my PCs . Note the ”firmware revision” – There is a processor (or several) in there! – Updates are available to download and install
Example: Keyboard with 32-bit Processor
The tech specs for the Corsair* Gaming K95 keyboard: . ”32-bit ARM* Processor” . ”Display Controller” Conclusion: . If it is ”smart” or capable of acting independently, it has firmware in it
http://www.corsair.com/en-us/corsair-gaming-k95-rgb-mechanical-gaming-keyboard-cherry-mx-red
*Other names and brands may be claimed as the property of others
Example: Functionality Upgrades via Firmware
Sony* Playstation* 4 (PS4) HDMI controller upgraded from HDMI 1.4 to 2.0 to support HDR - using a firmware update! Hardware had the bit-pushing ability needed, but not the protocol and copy-protection bits
http://arstechnica.com/gaming/2016/09/whats-up-with-ps4s-surprise-firmware-update-is-4k-around-the-corner/
”Russian Dolls”
The operating system and user see a device on the PCIe bus, with memory-mapped IO just like all other devices
[Diagram: applications and operating system running on two main cores with memory; the system also contains timer, crypto, USB, PCIe, serial, and disk units, plus an advanced device exposing programming registers]
”Russian Dolls”
Inside the device, you have a complete computer system, often with serial ports for debug access, and maybe running a complete OS!
[Diagram: the same system as on the previous slide, with the advanced device expanded to show what sits behind its programming registers: hidden cores, memory, timer, serial port, and IO, running firmware and a small OS]
What Types of Processors are we Talking About?
Firmware processor cores cover a broad range and are among the most diverse ecosystems of cores around
Many driving factors for core choice:
. Size of core
. Size of code
. Speed of processing
. Legacy of the subsystem
. Programmability only rarely top of the list of concerns for hardware designers
Classic embedded cores:
. i8051, H8, …
Standard embedded cores:
. ARM*, LEON* (SPARC*), MIPS*, ARC*, …
Digital signal processing cores:
. Ceva*, Tensilica*, ...
Full Intel® Architecture cores
Specialized custom cores
. Networking engines, pattern matching engines, ...
*Other names and brands may be claimed as the property of others
How Firm is Firmware?
Originally, firmware was very firm
. Apollo Guidance Computer used hand-woven core memory to store programs
– Software freeze four months before launch to allow it to be manually wired
. ”Mask ROMs” were added on top of the microcontrollers of yore
. ROM chips that could not be changed were common up until the 1990s
Firmware today is mostly stored in changeable memory
. FLASH & EEPROM (Electrically Erasable Programmable ROM)
. Inside computers, FW is often loaded by the main processor
Firmware tends to change less often, since it is more or less part of the hardware and changes carry risk
Firmware Loading & Location
Fixed inside the device
. ROM (maybe), FLASH
. Often just a bootloader
Dynamically loaded
. UEFI, BIOS, OS bootloader starts and loads firmware onto devices in early boot
. Device driver loads firmware during OS boot
. Stored in main SoC on-chip or off-chip boot flash, or in OS disk file system
[Diagram: main system with applications, operating system, main cores, memory, and boot flash; advanced device with hidden cores, firmware RAM, boot ROM, local flash, timer, serial, and IO behind its programming registers]
Your Computer – A Distributed System
Simplified diagram! Most often each USB endpoint contains a very small embedded processor too!
[Diagram: a PC with OS and drivers on the main core, and firmware (FW) in the keyboard, audio unit, graphics unit, Thunderbolt controllers, USB, and LAN; connected to an Ethernet port, USB port, audio jack, and a display that has its own display processor and firmware in front of the actual display]
Security?
If a subsystem with firmware has a channel to the outside, it is part of the system security perimeter
Example: Using WiFi chip firmware to take over phones
. https://googleprojectzero.blogspot.se/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html
. ARM* Cortex-R4* processor
. Gets firmware code from main processor
. Code not written securely
. All memory RWX – no MMU defense
“The first blog post will focus on exploring the Wi-Fi SoC itself; we’ll discover and exploit vulnerabilities which will allow us to remotely gain code execution on the chip. In the second blog post, we’ll further elevate our privileges from the SoC into the operating system’s kernel. Chaining the two together, we’ll demonstrate full device takeover by Wi-Fi proximity alone, requiring no user interaction.”
*Other names and brands may be claimed as the property of others
Summary
Firmware is everywhere!
Software powers modern electronics in a very deep sense
A ”processor” is not ”a processor” – it is a heterogeneous semi-autonomous collective of many processors
Most of these processors are not exposed to end users or operating systems – they look and work like fixed-function hardware
Power Management
What do we Want?
More performance ... With lower power consumption ... Giving off less heat = no fan ... With longer battery life ... Weighing less
NOT all that easy to do!
Power Efficiency Gains come from Many Sources
Manufacturing process Circuit design Computer architecture
System optimization Power management
Where Does the Power Go?
Total power: $P = P_{dynamic} + P_{leakage}$
. Dynamic power during actual switching
. Leakage power from just being powered-on
Dynamic power: $P_{dynamic} = C V^2 f$
. Basic capacitance
. × Voltage squared ($V$ affects $f_{max}$)
. × Frequency
Note:
. Since increased frequency needs higher voltage, as a rule of thumb we have: $P_{dynamic} \sim f^2$
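To make the formula concrete, the sketch below evaluates the dynamic power term for a few operating points. The capacitance, voltage, and frequency values are made-up numbers chosen only to give round results; they do not describe any real chip.

```python
def dynamic_power(capacitance: float, voltage: float, frequency: float) -> float:
    """Dynamic power P = C * V^2 * f, in watts when the inputs are in F, V, Hz."""
    return capacitance * voltage ** 2 * frequency


def total_power(dynamic: float, leakage: float) -> float:
    """Total power is the sum of the dynamic and leakage components."""
    return dynamic + leakage


if __name__ == "__main__":
    c = 1e-9   # assumed switched capacitance in farads (illustrative only)
    f = 3e9    # 3 GHz clock
    for v in (1.0, 0.8):
        print(f"V = {v} V: dynamic power = {dynamic_power(c, v, f):.2f} W")
```

Because voltage enters squared, lowering it from 1.0 V to 0.8 V at the same frequency cuts dynamic power to 64% of its previous value.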
Better Silicon
Process technology, circuit design, Moore’s law
. Transistors that use less power individually
. Lower drive voltages
– A processor used to run on 5V, then 3.3V, now down to < 1V
– Interesting side-effect:
– With a 95W power consumption, we have to feed 100A+
– Approximately half of all “pins” on a package are for power distribution
. Lower leakage power
All things equal, the same design on a better process = lower power or higher frequency at the same power
. Over time, the same “nanometer” process is tuned and improved
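The 100A+ figure follows directly from I = P / V. A quick check, assuming a supply voltage of 0.95V as one example of "below 1V" (the slide does not give an exact value):

```python
def supply_current(power_watts: float, voltage_volts: float) -> float:
    """Current in amperes needed to deliver a given power at a given voltage."""
    if voltage_volts <= 0:
        raise ValueError("voltage must be positive")
    return power_watts / voltage_volts


if __name__ == "__main__":
    for volts in (5.0, 3.3, 0.95):  # 0.95 V is an assumed example value
        print(f"95 W at {volts} V needs {supply_current(95, volts):.1f} A")
```

At 5V the same 95W would need only 19A, which is why lower drive voltages push so many package pins into power delivery.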
Better Architecture
Allow the system to avoid waste
. Clock gating – shut off clock
– Removes dynamic power
. Power gating – shut off power
– Removes static power (leakage)
. Gating is applied to ever smaller parts of chip
Power states:
. Settings for frequency, voltage, and on/off
. Units set to lowest possible state to save power
. Increasing number of units and number of steps
Many slow cores, a few fast cores, or a mix?
Processor pipeline design – trade performance vs power
. More work per clock cycle = lower clock = lower power
. Trade top-end performance for lower power
– 2x performance means far more than 2x power
Use accelerators
. Specialized accelerators use less power to do the same computation than a more general processor
Cache hierarchy
. Cache hit = lower power than memory access
System Optimization
Overall system design, such as:
. Selection of component parameters
. Display resolution and size
. Battery size
. Processor choice
. Memory choice
– LPDDR (Low-Power DDR) vs regular DDR
. Slow down wireless functions to save power
. Offloading functions to specialized accelerators
. Cooling efficiency
Power Management Software
Given that we have done our best in architecture & silicon...
Probably the biggest lever we have today to improve power/performance is the power management software
Essentially, a control feedback loop implemented in hardware, firmware, and software – driving power states and gating
[Diagram: feedback loop. Inputs: current operating goals, temperature sensors, power sensors. Power management-relevant software: firmware, drivers, BIOS, operating system, applications. Outputs: power states (on/off), clock frequency settings, voltage regulation]
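One iteration of such a feedback loop can be sketched in a few lines. This is a deliberately simplified illustration, not how any real power management unit works; the `Limits` fields, the thresholds, and the 100 MHz step size are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    max_temp_c: float     # thermal limit (assumed value supplied by the caller)
    max_power_w: float    # package power limit
    min_freq_mhz: int     # lowest allowed clock setting
    max_freq_mhz: int     # highest allowed clock setting
    step_mhz: int = 100   # size of one adjustment (assumed)


def next_frequency(freq_mhz: int, temp_c: float, power_w: float,
                   busy: bool, lim: Limits) -> int:
    """Read the sensors once and choose the next clock frequency setting."""
    if temp_c > lim.max_temp_c or power_w > lim.max_power_w:
        freq_mhz -= lim.step_mhz   # over a limit: throttle
    elif busy:
        freq_mhz += lim.step_mhz   # headroom and demand: speed up
    else:
        freq_mhz -= lim.step_mhz   # idle: save power
    # Never leave the range the hardware supports.
    return max(lim.min_freq_mhz, min(lim.max_freq_mhz, freq_mhz))


if __name__ == "__main__":
    lim = Limits(max_temp_c=90, max_power_w=35, min_freq_mhz=800, max_freq_mhz=4000)
    print(next_frequency(3000, temp_c=95, power_w=20, busy=True, lim=lim))
```

Calling `next_frequency` repeatedly with fresh sensor readings gives the loop: sensors feed the decision, and the decision changes what the sensors read next.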
Power Management Firmware and Software Tasks
Optimize performance
. Profile current load
. Determine best way to set controls
. Balance power draw vs user experience
Avoid disaster
. Throttle to avoid drawing too much power from the platform
– Each chip has a design limit
. Throttle to avoid overheating the chip
Sleep & wake-up
. Put system into deeper sleep
. Power off and on units in the correct order, wait until operation is stable
Hardware Control Points & Sensors
Hardware continuously adds more control points to reduce waste:
. Per-core voltage and clock-frequency adjustments (used to be per chip)
. More power states in more devices
. Faster changes to power states (off->on, clock & voltage scaling)
– Note that going to a low power state is not free: it takes time to power or clock back up to full speed, so operations take more time to complete
Sensors multiply across the chips and system
. Power levels
. Thermal levels – very important to avoid cooking the chip!
All of which come together in a power management unit (or units)
Layered Optimization and Goal Setting
OS will ask power management hardware to go to certain states based on its idea of the current load
. ACPI states: ”active”, ”sleeping”, etc., for processor, devices, and global
. Applications can give hints to the OS about what they want from power control
Power controller firmware
. Will make quick adjustments based on the state
. Responsible for sequencing sleep, nap, hibernate states
[Diagram: layers, top to bottom: application, operating system, driver, main processor, firmware, power management unit, hardware]
Note: ACPI Power States
ACPI (Advanced Configuration and Power Interface) defines sets of states
. Basic OS & Driver interface to power management
Applies to different system parts and levels
. Package/Chip
. Core
. Devices
. Links (such as PCIe)
[Screen capture from my living room gaming PC, using MSI* Command Center, 2017-03-05, with these annotations:]
. This CPU name string comes from CPUID and the chip directly!
. Sensor: Current CPU temperature
. Actuator: Core frequencies vary with the load
. Control example: fan speed vs temperature: higher temp = rev up the fan to compensate
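The fan control example is typically expressed as a fan curve: a few (temperature, fan duty) points with linear interpolation in between. The sketch below uses invented curve points purely for illustration; they are not read from the screenshot or from any vendor tool.

```python
# (temperature in °C, fan duty in %) points, sorted by temperature.
# These values are hypothetical and only serve the example.
FAN_CURVE = [(40.0, 20.0), (60.0, 40.0), (80.0, 100.0)]


def fan_duty(temp_c: float, curve=FAN_CURVE) -> float:
    """Map a temperature reading to a fan duty cycle by linear interpolation."""
    if temp_c <= curve[0][0]:
        return curve[0][1]              # below the curve: minimum speed
    for (t0, d0), (t1, d1) in zip(curve, curve[1:]):
        if temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)
    return curve[-1][1]                 # above the curve: full speed


if __name__ == "__main__":
    for temp in (30, 50, 70, 90):
        print(f"{temp} °C -> fan at {fan_duty(temp):.0f}%")
```

Higher temperature gives a higher duty cycle, which in turn lowers the temperature: sensor, controller, and actuator closing the loop.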
Example: Situation Changes Quickly
More cores activated and the clock frequency goes up = higher package temperature
Screen capture from my living room gaming PC, using Intel® Extreme Tuning Utility (XTU), 2017-03-27
Example: Avoiding Disaster
My old Sony* Android mobile phone Playing some YouTube* videos This happens when (I guess): . Screen is on . WiFi pulling in data . Processor & accelerators decompressing video streams at high resolution . Is overall a bit more than the package was designed to handle...
Power Management: Max is not Sum of all Max
Fictional example for illustration
Total chip power allowed = 35W
. Dictated by heat sink, power supply, and market segmentation
Total max power = 51W
. Throttle one part of the chip to allow others to run at full speed
Power management needs to keep the power inside allowed bounds
[Diagram: hypothetical System-on-Chip, rather simplified, with a 35W limit. Maximum power per unit: two processor cores at 5W each, two vector units at 3W each, graphics unit 15W, IO 5W; the remaining 5W and 10W belong to the L2 cache and the memory controller]
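The budget arithmetic on this slide can be checked with a short sketch. The per-unit maxima are the numbers from the fictional diagram; which of the 5W and 10W values belongs to the L2 cache and which to the memory controller is my reading of the figure, and the priority-based `allocate` policy is an assumption for illustration, not the method the slide describes.

```python
# Maximum power per unit in watts, from the slide's fictional chip.
# Assumption: L2 cache = 5 W and memory controller = 10 W.
MAX_POWER_W = {
    "core0": 5, "core1": 5, "vector0": 3, "vector1": 3,
    "l2_cache": 5, "memory_controller": 10, "graphics": 15, "io": 5,
}
BUDGET_W = 35


def allocate(priority, max_power=MAX_POWER_W, budget=BUDGET_W):
    """Grant power to units in priority order until the budget is used up."""
    remaining = budget
    grant = {unit: 0 for unit in max_power}  # unlisted units stay off
    for unit in priority:
        grant[unit] = min(max_power[unit], remaining)
        remaining -= grant[unit]
    return grant


if __name__ == "__main__":
    print("sum of all max:", sum(MAX_POWER_W.values()), "W")
    print(allocate(["graphics", "memory_controller", "core0", "l2_cache", "io"]))
```

The sum of all maxima is 51W against a 35W allowance, so whatever policy is used, some units must be throttled or switched off for others to run at full speed.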
Power Management: Set According to Workload
Compute-focus:
. Power up cores, memory, and vectors
. Throttle graphics to make room
. Turn off IO, we assume we run from memory
[Diagram: hypothetical System-on-Chip, rather simplified, with a 35W limit. Power per unit: two processor cores at 5W each, two vector units at 3W each, graphics unit 5W, IO 0W; 4W and 10W for the L2 cache and the memory controller]
Power Management: Set According to Workload
Gaming:
. Graphic processing most important
. Run one core at full speed – latency of processor work is important
. Forbid the use of vector units – assume that is all on the graphics unit
. A bit of IO needed for sound and chat
. Memory controller also needs power
[Diagram: hypothetical System-on-Chip, rather simplified, with a 35W limit. Power per unit: one processor core at 5W and the other at 0W, both vector units at 0W, graphics unit 15W, IO 1W; 4W and 10W for the L2 cache and the memory controller]
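Both workload settings from these two slides can be written down as profiles and checked against the 35W limit. The numbers are taken from the fictional diagrams; assigning 4W to the L2 cache and 10W to the memory controller is my reading of the figures, and the profile and function names are invented for this sketch.

```python
BUDGET_W = 35

# Power per unit in watts for each workload (fictional chip from the slides).
# Assumption: L2 cache = 4 W and memory controller = 10 W in both profiles.
PROFILES = {
    "compute": {"core0": 5, "core1": 5, "vector0": 3, "vector1": 3,
                "l2_cache": 4, "memory_controller": 10, "graphics": 5, "io": 0},
    "gaming": {"core0": 5, "core1": 0, "vector0": 0, "vector1": 0,
               "l2_cache": 4, "memory_controller": 10, "graphics": 15, "io": 1},
}


def within_budget(profile: dict, budget: int = BUDGET_W) -> bool:
    """True if the profile's total power fits inside the chip power limit."""
    return sum(profile.values()) <= budget


if __name__ == "__main__":
    for name, profile in PROFILES.items():
        total = sum(profile.values())
        print(f"{name}: {total} W, within budget: {within_budget(profile)}")
```

Both profiles add up to exactly 35W: the same budget, spent on different units depending on the workload.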
“Turbo” Processor Speed and Multicore
Processor speeds typically defined:
. Base frequency
. Max/turbo frequency
When high performance is needed:
. Use only a few cores = clock higher
. Use many cores = clock lower
. Using heavy units like AVX = lower
. Example graph for Intel® Xeon® Platinum 8180 28-core processor: https://www.anandtech.com/show/11544/intel-skylake-ep-vs-amd-epyc-7000-cpu-battle-of-the-decade/8
Summary
Power saving comes from silicon improvement, architecture improvements, system optimization, and power management
Chips are full of sensors and actuators used by power management
Power management is a nested dynamic feedback loop
Broken power management can literally fry a chip
Wind River Simics® – System-Level Virtual Platform
Hardware: A Hard Development Platform?
Hardware is Hard When it is in...
Not yet available Flaky prototype stage Not available anymore
Hardware is Hard When it is...
Inconveniently large & complex Dangerous to play with Inaccessible & expensive
Solution: Simulate the Hardware = Virtual Platform
Apps
OS
HW
Wind River® Simics®
About Wind River Simics®
Wind River Simics® History
Development started in 1991
. Spin-off from research project
Virtutech company founded in 1998
. Sun & Ericsson first customers
Acquired by Intel in 2010
. Sales and marketing put into Wind River
. Large internal use & development at Intel
Core development team still in Stockholm
. With local development teams around the world, doing integration and modeling
Major milestones
. 2.0: Heterogeneous systems
. 3.0: Reverse execution & debug, 2005
. 3.2: Intel VT-X acceleration
. 4.0: Multi-threaded (coarse), 2008
. 4.2: Distribution, 2009
. 4.4: Eclipse GUI, 2010
. 4.6: TCF Debugger, 2012
. 5: Multicore multithreading, 2015
. 6: More threading & integration, 2018
How it Works
Full system virtual platform
. Virtual/simulated target hardware
. Run the same software as the physical hardware
Important properties:
. Fast enough to run real software workloads
. Simulate any computer system
. Single board, multiple boards, standard parts, custom chips, IO, networks, …
Frees testing and development from the dependence on physical hardware
[Diagram: layered stack, top to bottom: user-level application code; middleware and system libraries; target operating system(s); virtual/simulated target hardware and network; Wind River® Simics®; host operating system; host hardware]
Simulate any size System
Systems of systems
Racks, backplanes, cabinets, networks
Boards and single machines Chipsets Processor cores and SoCs
What’s the Point?
Fundamentally Simics is about running real software on virtual hardware in order to test & debug the software, the software-exposed aspects of the hardware, and the hardware design
“Software” can mean many things…
. Firmware, that is deeply hidden inside a chip
. BIOS/Bootloader/UEFI, that is used to boot the machine
. Device drivers, that manage hardware for an operating system
. Operating systems
. Middleware, providing services for other software
. Applications, that any programmer would write
. Distributed systems, software running across many separate machines
. From bytes to terabytes of code!
Simulation as Tool
The power of Simics is to bring simulation to the domain of concrete software
Build a model of reality, and then do experiments on the model
Simulation as a design & development & research tool is well established
. Explore things that cannot be done for
. Quickly try different things very cheaply
. Vary parameters over large spaces not possible with real hardware
. Observe the internals of a system
. Disturb/change the internal state of a system
. Debug behaviors more efficiently thanks to better insight and control
We make systems work better by allowing the software to be run on virtual hardware
Simics Level of Abstraction
Goal: Fast & scalable simulation
. Goal: run the real software
. Target model includes all software-visible functional aspects of hardware, such as processor instructions, supervisor modes, device registers, interrupts, etc.
Transaction-level modeling (TLM)
. Model function & basic timing
. System memory map (not bus system)
. Processor instruction set
. Device register interface
. Packet-level models of networks
. Event-driven simulation, not cycle-driven
. Loose timing model
Lazy and agile modeling
. Build out platform from core to all over time
. Add timing and µarch when needed
. Cache model (timing), processor timing models, power models
. Cycle-accurate simulators and processor models from hardware designers
[Diagrams: software stack (user application code, middleware and libraries, target operating system(s)) on top of the target model; a transaction T between blocks A and B; scope and speed plotted against detail of model and time]
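To illustrate what "event-driven, not cycle-driven" and "device register interface" mean in practice, here is a toy sketch of a functional timer model. It is not Simics code and does not use any Simics API; the class names, register offsets, and behavior are all invented for this example.

```python
import heapq


class EventQueue:
    """Virtual time jumps straight to the next event instead of ticking cycles."""

    def __init__(self):
        self.now = 0
        self._events = []
        self._seq = 0  # tie-breaker so callbacks are never compared

    def post(self, delay, callback):
        heapq.heappush(self._events, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run_until(self, end_time):
        while self._events and self._events[0][0] <= end_time:
            self.now, _, callback = heapq.heappop(self._events)
            callback()
        self.now = end_time


class TimerDevice:
    """Functional model: only the software-visible registers and interrupt."""

    REG_PERIOD = 0x0  # write: cycles until the timer fires (hypothetical layout)
    REG_STATUS = 0x4  # read: 1 once the timer has fired

    def __init__(self, queue, raise_interrupt):
        self.queue = queue
        self.raise_interrupt = raise_interrupt
        self.status = 0

    def write(self, offset, value):
        if offset == self.REG_PERIOD:
            self.status = 0
            self.queue.post(value, self._expire)  # one event, no per-cycle work

    def read(self, offset):
        return self.status if offset == self.REG_STATUS else 0

    def _expire(self):
        self.status = 1
        self.raise_interrupt()


if __name__ == "__main__":
    queue = EventQueue()
    timer = TimerDevice(queue, lambda: print("interrupt at cycle", queue.now))
    timer.write(TimerDevice.REG_PERIOD, 1000)
    queue.run_until(5000)
```

Simulating 5000 cycles here costs one event rather than 5000 steps, which is where the speed of this style of modeling comes from; software only ever sees the register reads, writes, and the interrupt.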
Wind River Simics® use cases
Wind River® Simics®: Throughout the Life Cycle
[Diagram: product timeline with phases: Design & Architecture; Bring-up and platform development; Application development; Test and integration; Deployment & maintenance]
Architecture: Example of a Hardware Block
Evaluate the performance of the block under real workloads
. Benchmark or typical user application, running on Linux with a device driver
. Detailed model of the accelerator block: microarchitecture, microengines, buses, etc.
. Network traffic generation inside or outside of Simics
[Diagram: Simics target system model in Wind River® Simics®: target machine with cores, RAM, disk, APIC, FLASH, Ethernet, USB, serial, GPU, firmware, and the detailed accelerator model, connected over a network to a Linux machine used for traffic generation]
“Shift Left” – Accelerating Product Development
Provide hardware models well in advance of RTL and silicon
. Allow software development before silicon arrives
– IP-block firmware, BIOS, drivers, ...
. Shorten time to market by overlapping software and hardware design
. Decouple hardware and software schedules for reduced risk
. Validation of hardware, software, and their integration can start earlier
”Better products faster”
Ecosystem Enablement
Intel chips and chipsets
. Critical component – but not a product in itself
Product: custom board
. Board design
. Firmware dev
. UEFI customization
. Device drivers
. New HW features
Start early using virtual platform models – even at OEM-level
Heterogeneous Integration Platform
Simics serves as a simulation platform, integrating all kinds of models
[Diagram: Simics heterogeneous target system model running in Wind River® Simics®. Software stack: user programs, middleware, operating system, hardware drivers, UEFI/BIOS/boot code, firmware. Integrated models include Simics device models (DML, C/C++, Python), RAM, flash, disk, Simics and other ISSs, SystemC TLM systems and subsystems with ISS, subsystems with internal ISS in other frameworks, an entire chip as a detailed architecture model, SystemC detailed bus models, sensor and actuator IO through transactors (Xtor) to an environment model, and RTL simulators, FPGA prototypes, and big-box emulators]
Continuous Integration
Tests running mostly on simulation in order to:
• Do integration pre-si and post-si
• Shorten test latency
• Run each test more often
• Run more and more varied configurations
• Provide suitable configurations
• Test what cannot be tested on hardware
[Diagram: pipeline: Developer Changes or Adds Code → Pre-CI Test → Build System → Unit Test → Subsystem-Level Test → System-Level Test → OK → Continuous Delivery / Quality Assurance]
System-Level Super Debugger
. Insight into all components
. Synchronous entire-system stop
. Trace anything
. System-level symbolic debug
. Unlimited powerful breakpoints
. Record-replay debug
. Repeatability & Reverse debug
. Collaboration between developers
Breakpoint examples:
break -x 0x0000 length 0x1F00
break-io uart0
break-exception int13
break-log "spec violation"
More Information
Wind River® Simics® product:
. http://www.windriver.com/products/simics/
My blog on simulation:
. https://software.intel.com/en-us/meet-the-developers/evangelists/team/jakob-engblom
My personal blog:
. http://jakob.engbloms.se
Intel Software makes other programming tools available for free to students:
. https://software.intel.com/en-us/qualify-for-free-software/
Legal Disclaimers
• Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer.
• No computer system can be absolutely secure.
• Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information visit http://www.intel.com/performance.
Intel, the Intel logo, Xeon, Xeon Phi, Atom, Quark, Core, Pentium, 3D XPoint, Optane are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
© 2018 Intel Corporation