<<

Efficient AI Inference at the Edge with Imagination IP

April 2018 www.imgtec.com Imagination: A Global Technology Leader A technology powerhouse for multimedia, communications and AI IP An industry leader since 1985 Developing innovative IP . Leader in embedded Graphics and GPU compute IP . Leading the advance in dedicated Vision and AI IP . Leader in RPU communications IP

Delivering exceptional service . Enabling very fast time to market . Enabling customers to leverage IP to maximise differentiation Driving major markets . Helping our partners to create successful solutions . Influencing new and emerging opportunities . Showcasing and proving our technology with real products

© 2 Company

Awarded by ,Apple Over 11B Acquired by Listed in London Blair(PM of UK) invest Acquire MIPS shipment Canyon Bridge

1985 1994/7 1995-1998 1999 2002 2006 2013 2017/5 2017/11

Set up Computer Intel and 1B GPU Spin off 3D GPU other shipment MIPS MP with giants NEC license

© Imagination Technologies 3 Business Model

Licensees

OEMs and ODMs

Consumers

Our royalty based business model means we only succeed if our customers succeed

© Imagination Technologies 4 Licensees and Partners

Key Licensees

Strategic Partners

5GIC

© Imagination Technologies 5 Neural Network Landscape & Growth Opportunity Large variety of Edge AI Applications across many different Markets IoT Natural speech Gesture Analysis synthesis Face recognition Speech Age/gender recognition understanding Medical diagnostics AR/VR Mobile Feature description Gesture Analysis Natural speech Face recognition Movement Analysis Eye Tracking synthesis Age/gender Feature detection Scene Understanding Speech understanding recognition Gesture Analysis Medical diagnostics

Drone Industrial Collision avoidance Defect detection Subject tracking Object detection Camera optimisation Context awareness Gesture recognition

Automotive Security Driver alertness monitoring Drivable Path Analysis Face detection/recognition Crowd/loitering detection Driver gaze tracking Road User Detection People counting Hazard detection Seat Occupancy Driver Recognition Doorway surveillance Multi-camera tracking Road Sign Detection

© Imagination Technologies 6 Imagination Product Portfolio Imagination The best solution for embedded graphics, vision, AI and communications

PowerVR GPU PowerVR Vision & AI Ensigma Leading graphics IP Dedicated AI and Connectivity and broadcast cores for embedded Computer Vision IP communications devices Products High performance, low power

XE/XM GPU XT GPU PowerVR 2NX PowerVR ISP RF Connectivity Broadcast Focused Features Feature rich Neural Network Low area, high /mm2 Accelerator quality & highly Wi-Fi, Bluetooth Wi-Fi, Bluetooth TV, Digital Radio power efficient & Performance/mm2 IEEE 802.15.4 2 Performance/mW Performance/mm Performance/mW Multi camera capable ISP GNSS

© Imagination Technologies 7 Edge Neural Network Processing Resources Why GPU and Neural Network Accelerators are Best

CPU DSP GPU Neural Network Fixed Function Accelerator • Fully Flexible • Fully Flexible • Fully Flexible • Configurable • Single usage case • BUT inefficient and • BUT hard to • Standardised APIs • Lowest power with • Lowest power BUT slow for high program – no for Compute, Float domain specific zero flexibility compute standardisation, and INT support flexibility workloads INT focussed

Efficiency sweetspot

© Imagination Technologies 8 PowerVR NNA Leading Position Only a dedicated hardware solution offers required PPA for long battery life

- PowerVR 2NX NNA PowerVR NNA - Other solutions Area of circles represents PowerVR NN area – smaller = better Specific Accelerator Use with existing programmable VPU+HW cores if needed

DSP+HW Accelerators Flexibility Costs DSP+ PPA Efficiency HW Improve Power Efficiency (Inference/W) Efficiency Power CPU/GPU/DSP DSP Efficiency Competitor CPU GPU

0 500 1000 1500 2000 2500 Performance (MACs/Clock)

© Imagination Technologies 9 PowerVR 2NX NNA – Architecture and Features IP core to enable efficient inferencing of neural networks in SoCs at the “edge”

. Scalable architecture Host Control DDR . 2048 8-bit MACs/clock equals 4Tops/core . Multi-core scalable achieves beyond performance Bus Interface

. Architected for creation of future cores with different performance points NN Compute Core and feature sets to address multiple markets and applications (Convolutions) Output Weights NN Compute Formatter . Flexible bit-depth data type support Processing Engine . 16, 12, 10, 8, 7, 6, 5, 4-bit support – covering different markets taking advantage of the benefits of lower precision Accumulation Buffer . Per layer adjustment for both weights and activations

Element . Maximum performance at minimum power and bandwidth Activation Engine Normalise Pool . Variable precision internal data formats Shared Buffer . High precision, where required, inside the accumulator . Quantisation for following layers for efficient power/processing efficiency . Ensures optimum accuracy for results

© Imagination Technologies 10 Research on Low bit Benefit PowerVR 2NX – precision flexibility for optimised performance, power and bandwidth

Only fully Relative Relative Relative Relative PowerVR precision flexibility connected inference/s bandwidth energy/ accuracy enables 60% performance weights (bits) inference increase, at 54% of the bandwidth, 69% of the power 8 100% 100% 100% 100% with less than 1% drop in 4 160% 54% 69% 99% accuracy

% Bandwidth relative to PowerVR 2NX 400 PowerVR precision flexibility 300 means requiring as little as 200 25% of the bandwidth 100 compared with competing 0 solutions PowerVR NX HW competitior DSP NNA Competitior

© Imagination Technologies 11 Flexibility on Low-bit Maximises performance, maintains accuracy, minimises power and bandwidth

. PowerVR 2NX NNA supports PowerVR 2NX Example implementation of variable weights and precisions variable (including low)

Weight Data precisions for data and weights

8 4 4

- - -

bit bit

· · · bit . High Internal precision

8 -

bit maintains network accuracy

7 6 5

- - -

bit bit bit . Allows higher performance at lower bandwidth and power … Input . Configurable output format SoftMax enables CPU/GPU/DSP compatibility Activations . PowerVR 2NX is the only 8-bit 7-bit 6-bit 5-bit 4bit 6-bit · · · FP32 solution on the market with this level of flexibility – unique benefit

© Imagination Technologies 12 Easy Going on PVR Platform It’s as easy as 1 .. 2 .. 3 - PowerVR 2NX workflow

1 Network Model Tensorflow Caffe Trained Model Network Training Large Training …. Data

2 PowerVR Subset of Optimised Tuned Model (Optional) Network Training Data Framework Optimisation

PowerVR 3 PowerVR 2NX NX Control/Data for NNA mapping and Accelerator Mapping PowerVR NX Configuration Tool Driver Implementation

© Imagination Technologies 13 Performance & Bandwidth(compiler) It’s as easy as 1 .. 2 .. 3 - PowerVR 2NX workflow

Network Model 1 FloatingTensorflow point Caffe Trained Model Training Large Training 32-….bits Data

Trained Model 2 PowerVR QuantisationTuning Tuned Model Tuning Subset of 4 to Tool16 bits (Optional) Training Data

3 (Tuned) ModelOptimal usagePowerVR of Hardware Mapping PowerVR 2NX Mapping PowerVR(Bandwidth, 2NX PerformanceTool Tuning)Control Data Configuration

© Imagination Technologies 14 Cross Platform API Training, network types and models, frameworks, inference engines, APIs … Application developer, network designers and solution providers

Network type and model Platform ML framework APIs CNN MLP Mobile OpenVX GoogleNet . Android nn AlexNet . Tensorflow Lite Caffe2go Convert/Optimise PowerVR DNN RNN/LSTM SSD API . . General Purpose Translation and . . Caffe optimisation Tensorflow Khronos NNEF Inference engine Caffe2 CPU PowerVR DSP Training data Tools/API VPU Text Research PowerVR GPU Labelled images PyTorch Audio PowerVR 2NX “Labelled data” NN Processor

“Offline” - Training “Online” - Inferencing

© Imagination Technologies 15 Complete Imagination AI System PowerVR GPU, PowerVR NNA (Neural Network Accelerator)

On Chip Memory PowerVR GPU ALU Cluster

TA Scheduler ISP

On Chip Memory Compute Data Master ALU Cluster

MMU/System MMU/System System Level Memory Memory Interface Cache Interface

System Memory Bus

On Chip Memory PowerVR GPU PowerVR Neural Network accelerator • Optional • Reduces bandwidth of Neural • Reuse existing hardware (if applicable) • Dedicated hardware Network Processing • Fast performance • Fastest performance • Can be used by other system • No extra silicon area (vs. graphics) • Smallest silicon area • Can run all layer types • Lowest power and bandwidth components and other usage rd scenarios e.g. GPU Graphics • Low power • Works with PowerVR GPU or 3 Party IPs

© Imagination Technologies 16 PowerVR the Best for Business PowerVR 2NX designed for mobile and Android

. PowerVR 2NX is the only IP solution in the market that can PowerVR deliver against all the requirements for a deployable mobile 9XE/9XM solution NNA GPU . Low area for PowerVR 2NX combined with the low area of PowerVR the PowerVR 9XE GPU provides a GPU+NNA solution in the same footprint as a competing GPU alone

. Requirements met with PowerVR 2NX area Similar total Similar . Low power – full hardware ensures lowest power/inference . Low area – most efficient solution in terms of inferences/mm2 . MMU – easy integration in SoCs supporting high level OSs Competing GPU . Android support - PowerVR has long history of Android support

© Imagination Technologies 17 Thank you

www.imgtec.com