Programming For the 22nd Century: We are not programming in 1984 (or 1969) anymore

by Jon "maddog" Hall Executive Director International and President, Project Cauã

1 of 52 Copyright Linux International 2016 Who Am I?

● Half Electrical Engineer, Half Business, Half Computer Software

● In the computer industry since 1969 – Mainframes 5 years – Unix since 1980 – Linux since 1994

● Companies (mostly large): Aetna Life and Casualty, Bell Labs, Digital Equipment Corporation, SGI, IBM,

● Programmer, Systems Administrator, Systems Engineer, Product Manager, Technical Marketing Manager, University Educator, Author, Businessperson, Consultant

● Taught OS design and compiler design

● Extremely large systems to extremely small ones

● Pragmatic

● Vendor and a customer 2 of 52 Copyright Linux International 2016 Who Am I? Linux ● May 1994 – funded Linus Torvalds at DECUS – Obtained Alpha for Linus to do port

● CISC/RISC ● 32/64 bit – Assembled DEC engineering team

● 1995 – Assumed ED role for Linux International – LMI – defended Linux Trademark – LPI – created SysAdmin certification program – LSB – Linux Standard Base

● Promote Linux and FOSSH Worldwide

● July 2016 - Chair of Board of Directors for Linux Professional Institute (lpi.org) – Founding member

3 of 52 Copyright Linux International 2016 My “Second Language”

● PDP-8 Assembler – 4000 12-bit words – Data and Instructions same length ● From a book – Plenty of practice

Nobody told me “assembly was difficult”

4 of 52 Copyright Linux International 2016 Examples From My Past ● Compiler errors (?!?)

● IBM – “Read Clock” - 99% of wall clock time – EBCDIC – the four slowest instructions ● Cache – Digital Unix – ½ the size, 7% faster – 40 times (David Mossberger-Tang) – 220 times – PDP-11/70 with RSTS-E ● Tapes (OMG!) - start/stop and streaming tapes

5 of 52 Copyright Linux International 2016 Who Needs Performance?

● “CPUs are fast enough”

● “JAVA is the only language people need”

● “Nobody codes in assembler language any more”

● “Virtual machines make architecture knowledge obsolete”

6 of 52 Copyright Linux International 2016 Performance: Changing Definitions ● “Real” problems – Petabytes of data, thousands of processors ● Real-time – REAL real-time

● Lower those rods! – Linus and “soft real time” ● Cell Phone Apps – Saving battery life ● Saving the environment!

● “New” (or at least newly affordable) advancements (FPGAs, DSPs) 7 of 52 Copyright Linux International 2016 FPGAs: A Different Type of Computation ● Field Programmable Gate Array – An array of transistors (gates) – Can be (re) configured (or (re) programmed) in “Field” by programmers or other CPUs

● Radiation damage? – Not necessarily clock driven ● ICEstick Evaluation 'kit' – 21.95 USD

8 of 52 Copyright Linux International 2016 To Write Really Great Code...

...you need to understand hardware architecture and that includes machine/assembly language

9 of 52 Copyright Linux International 2016 Today

● College Students learning – “Microsoft Office and Oracle” instead of “Office Systems and Databases” – JAVA and “IT/TI” instead of Assembly and Operating Systems – Virtual machines instead of “real iron” ● High school students – Games and HTML

10 of 52 Copyright Linux International 2016 How Most High School Students See Computers

11 of 52 Copyright Linux International 2016 How Computers Really Look

12 of 52 Copyright Linux International 2016 Real Life Effects of “Dumb Down”

“Incoming freshmen know less than they knew 20 years ago” - Foundation

The Raspberry Pi was created to help fix this problem.

13 of 52 Copyright Linux International 2016 ● Microcontroller – “Slow” clock

● “Small memory”

● No – Single program – Hard Real Time

● IDE Programming

● “Open” Hardware

● Large variety of “shields”

14 of 52 Copyright Linux International 2016 Today Is “pi rounded” Day

● pi = ~3.14.159.....

● pi “rounded” = 3.1416 th ● Today is March 14 , 2016

● Written as 03-14-16 = “pi rounded”

15 of 52 Copyright Linux International 2016 University of Cambridge: 2012 - Raspberry Pi – 35 USD ● Single Core ARM – 700Mhz

● ½ Gbyte Memory

● 3D GPU – Hardware video decode ● USB 2.0 – 10/100 Ethernet ● HDMI – Analog AV also ● GPIO Pins

● 3W 16 of 52 Copyright Linux International 2016 – 44 USD ● Dual Core ARM – 1000 Mhz

● Gbyte RAM

● 3D GPU - Hardware video decode

● USB 2.0 – 10/100/1000 Ethernet

● HDMI – Analog AV also

● GPIO Pins

● SATA

● IR receiver/transmitter

● 3W – better power management 17 of 52 Copyright Linux International 2016 Raspberry Pi 2 Model B 35 USD!

● ARMv7 Quad Core (85% improvement single-core performance, up to 7.5x parallel performance improvement)

● 1GByte RAM

● HDMI

● Audio out

● Gbit ETHERNET

● Micro SD card

● Physical as RPI B+ Remember GNU/Linux does a lot in parallel.... 18 of 52 Copyright Linux International 2016 LeMaker Banana PRO – 55 USD ● Dual Core ARM – 1000 Mhz

● Gbyte RAM

● 3D GPU - Hardware video decode

● USB 2.0

● HDMI – Analog AV also

● GPIO Pins compatible with RPi Model B+

● SATA 2.0

● IR receiver/transmitter

● 10/100/1000 Mbit/sec ETHERNET

● WiFi (802.11 b/g/n) and (optionally)

● 10W

19 of 52 Copyright Linux International 2016 Raspberry Pi Zero – 5 USD

● Same as RPi #1 – but – 40% faster – u-USB – u-HDMI – No GPIO headers ● Remember:

● NO networking

● 512 Mbytes RAM

20 of 52 Copyright Linux International 2016 New Raspberry Pi 3 Model B: after four years still 35 USD! ● ARM-64 Quad Core at 1.2GHz

● Broadcom VideoCore IV

● 1GByte RAM

● HDMI

● 4 x USB 2.0

● 10/100 ETHERNET

● Micro SD card

● Analog audio-video jack

● WiFi b/g/n

● BLE 4.1

● Camera (CSI) and LCD (DSI)

● 40 pin GPIO populated header

● Physical as RPI B+ 21 of 52 Copyright Linux International 2016 LeMaker Guitar: The COM Board

23 of 52 Copyright Linux International 2016 LeMaker Guitar: One of Four Motherboards

24 of 52 Copyright Linux International 2016 Who Are ARM and Linaro?

● ARM is a corporation that designs CPUs, GPUs and SoCs

● ARM licenses their designs and tools to companies like Broadcom, Samsung and others

● ARM licensees produce these chips and port GNU/Linux to them

● Linaro is an association of these companies that collaborate together to make sure GNU/Linux works well on the ARM architecture and chips

25 of 52 Copyright Linux International 2016 Linaro 96boards.org ● Two specifications for board design – “Consumer” < 100 USD – “Enterprise” < 300 USD ● 32-bit and 64-bit ARM processors

● Specifies – board layout – connector layout – Power suggestions ● Open to all vendors

26 of 52 Copyright Linux International 2016 Combining Arduino and 96Boards

27 of 52 Copyright Linux International 2016 Many Little Computers: 45 USD – 199 USD

BeagleBoneBlack Hackberry 10 ODROID-U3

OlimoX - LIME Pandaboard Galileo

28 of 52 Copyright Linux International 2016 96Boards.org Open Specification For Boards

● Standard for ARM SBCs – 32 and 64 bit – “Consumer” <100 USD – “Enterprise < 300 USD ● Placement of – Mounting holes – Connectors – Power supplies 29 of 52 Copyright Linux International 2016 BBC's Micro-Bit Given Away to Grade 7 (11-12 YO) Beginning Fall of 2015 ● CPU

● Micro-USB power

● 5x5 LED display

● Bluetooth LE

● Accelerometer

● Other measuring devices – Final specifications not ready ● “Program from their smartphones”

30 of 52 Copyright Linux International 2016 Why Do I Show You All This?

31 of 52 Copyright Linux International 2016 Because Of Things Like THIS! ● 12 ARMv7 Cores at 1 GHz each

● 6 GBytes of RAM

● 6 HDMI ports

● 6 SATA ports (currently driving two disks)

● IR on board

● 2 TB SATA disk

● 8 Port Gbit ETHERNET

● 70 Watts

● Fits in standard briefcase

32 of 52 Copyright Linux International 2016 Why Is This System Interesting?

● Can be used to teach HPC computing

● Can be used to teach HA computing

● Can be used to teach heterogeneous computing

● Can be used to teach heterogeneous systems administration

● Very portable, can be assembled in minutes

● Very modular

● Prototype cost: 500 USD – Currently using “Banana Pi”

● Production cost: < 400 USD – Will use (4) new “Raspberry Pi 2 Model B” – Will increase from 12 to 20 ARMv7 cores

33 of 52 Copyright Linux International 2016 yacc(1): Yet Another Computer Contest

36 of 52 Copyright Linux International 2016 GNU/Linux: 30+ Years

● Memories have gotten larger

● CPUs have gotten faster

● CPUs are multi-core

● Algorithms have changed

● Pipelining and cache more prevalent

● Optimizations in compilers have gotten better

● Need for assembly language decreased (and often is detrimental) Can we make the code better? 37 of 52 Copyright Linux International 2016 Announcing: “maddog and Linaro's GNU/Linux Optimization Contest” Optimize GNU/Linux modules

● Measure performance on platform

● Optimize code

● Measure performance after optimization

● Document performance improvements – Compiler switches used – Algorithm changes – Assembly code eliminated

38 of 52 Copyright Linux International 2016 Goals Of Contest

● Make sure modules of GNU/Linux work well on ARM-64 processors – Compile and test

● Best performance options on gcc in make file? – Insert proper ARM-64 assembly if needed – Work to liminate need for assembly language ● Increase performance of GNU/Linux for all architectures

● Create material for Li[bv]re course in software performance techniques.

39 of 52 Copyright Linux International 2016 Categories Of Performance

● % Speedup

● Memory utilization

● Cache utilization

● Best algorithm replacement

● Compiler Intrinsic creation

● Compiler performance improvements

● More categories?

What do we know/have today that we did not know/have 20-30 years ago?

40 of 52 Copyright Linux International 2016 Suggested Prizes

● Awesome Linaro golf shirt

● ARM development systems

● Free admission to Connect

● Travel to Connect meetings to present work – Asia – USA

41 of 52 Copyright Linux International 2016 Side Effects

● Learn really cool assembly language

● Learn code analysis techniques

● Create better code overall

● Open Source Portfolio

● Possible university level course in optimization

● Research into new optimization techniques – Automatic detection of race conditions?

42 of 52 Copyright Linux International 2016 Mentors

● College Professors

● Engineers

43 of 52 Copyright Linux International 2016 Examples of Contest Input

● What was the performance level increase over- all?

● What were the performance levels of the module before you started your work? After your work was competed?

● What tool chain did you use to create the patch? – What versions of the compiler and OS? – What options on the compiler? ● What tests did you perform to test the patch? – Please provide sample input

44 of 52 Copyright Linux International 2016 Resources

nd ● “The Definitive Guide to GCC: 2 Edition” by William von Hagen (Apress, 2006)

● “The Art of Debugging with GDB, DDD, and Eclipse” by Matloff and Salzmann (No Starch Press, 2008)

● “Valgrind 3.3: Advanced Debugging and Profiling for GNU/Linux applications” by Seward, Nethercote et. al. (Network Theory Ltd., 2008)

46 of 52 Copyright Linux International 2016 More Resources

● “ARM Assembly Language – an Introduction” by J.R. Gibson (Lulu, 2007)

● 96boards.org – low(er) cost 64-bit ARM systems

47 of 52 Copyright Linux International 2016 Time Frame ● Create examples of contest input and optimization techniques – In process

● Search for university “home” – University of Cambridge – University of Sao Paulo – Unicamp? PUC? – Other Universities

48 of 52 Copyright Linux International 2016 What Should You Do Now?

● Register at performance.linaro.org

● Read information about – Contest – Performance ● Learn how to use QEMU and ARM Models if you do not have ARM-64 hardware

● Pick module to investigate for porting/performance

49 of 52 Copyright Linux International 2016 Questions, Comments, Ideas?

[email protected]

performance.linaro.org

52 of 52 Copyright Linux International 2016