16bit Simulation with GNU Octave

Andeas Stahel

Personal

Context Compute on µC 16bit Simulation with GNU Octave Approximation

An Example Approximation 16bit Arithmetic The Result Andreas Stahel In General Bern University of Applied Sciences Closing

OctConf 2017, March 20–22, 2017, CERN, Geneva PersonalI

16bit Simulation with GNU Octave

Andeas Stahel

Personal Context Andreas Stahel Compute on µC Mathematics Approximation

An Example Approximation 16bit Arithmetic Teaching: The Result Math at Bachelor level to mechanical and electrical engineers In General Numerical Methods for the Master Program of Biomedical Closing Engineering A class on how to use Octave to solve engineering problems As member of the Institute for Human Centered Engineering (HuCE): many industry projects in mathematical modeling Web: https://web.ti.bfh.ch/sha1/ E-mail: [email protected] PersonalII

16bit Simulation with GNU Octave

Andeas Stahel Concerning Octave: Personal Octave is used regularly for teaching, project work and research. Context Compute on µC I teach a class on how to use Octave for engineering problems. Approximation

An Example I started using Octave in 1993/94 and am addicted to it since. Approximation 16bit Arithmetic Octave replaces MATLAB for many reasons: open source, great The Result community support, platform independent, (legally) free. In General Closing My professional life would be different without Octave!

Thank you guys Why computing on a µC?

16bit Simulation with GNU Octave

Andeas Stahel

Personal

Context Compute on µC Approximation

An Example Approximation 16bit Arithmetic The Result In General

Closing Some µC are very affordable and thus used in many devices, not visible by the user. Functions can useful to calibrate sensors, or one might do a first step of the data treatment on the µC based sensor. Developing on a true µC can be tedious. Using a desktop and the power of Octave is convenient. Facts for Computing on µCI

16bit Simulation with GNU Octave

Andeas Stahel

Personal Context On most affordable µC only integer arithmetic is implemented in Compute on µC hardware, i.e. no FPU. Approximation An Example Floating point libraries are large, slow and the results are often Approximation 16bit Arithmetic overly accurate. The Result In General If you use a 12bit AD converter, there is no need for a 32bit

Closing accuracy of the subsequent calculations. Since +, − and ∗ are available one can aim to implement polynomial functions. Facts for Computing on µCII

16bit Simulation with GNU Octave

Andeas Stahel

Personal

Context Different approaches are possible to implement the evaluation of a Compute on µC given function. It is often a compromise between the amount or Approximation required storage and the computations needed. An Example Approximation 16bit Arithmetic more storage ←→ less storage The Result In General fewer computations ←→ more computations

Closing piecewise interpolation one global look up table linear quadratic polynomial Facts for Computing on µCIII

16bit Simulation with GNU Octave

Andeas Stahel On a typical 16bit µC the arithmetic operations for integers are Personal

Context implemented in harware. Compute on µC 16bit ± 16bit → 16bit Approximation 16bit ∗ 16bit → 32bit An Example Approximation 16 k 16bit Arithmetic Division by 2 is free (high double byte), division by 2 is cheap The Result (shift). In General

Closing Use full the ranges available for the data types int16 or int32 to obtain optimal accuracy. For multiplications we aim to use the full range of 32bit results, but will only use the high double byte for further computations. Approximation by Polynomials

16bit Simulation with GNU Octave

Andeas Stahel

Personal

Context To approximate a given function on a bounded interval by a Compute on µC polynomial, different mathematical tools might be useful: Approximation

An Example Linear regression, i.e. least square approximation Approximation 16bit Arithmetic Chebyshev approximation The Result Optimization by using , based on maximum In General fminsearch()

Closing norm, or L2-norm, or . . . A combination of the above. For the problem at hand I worked with Chebyshev approximations. Chebyshev ApproximationI

16bit Simulation with GNU Octave The Chebyshev polynomials on the interval [−1 , +1] are given by Andeas Stahel

Personal Tn(x) = cos(n arccos(x)) Context Compute on µC T0(x) = cos(0) = 1 Approximation T1(x) = cos(arccos(x)) = x An Example 2 Approximation T2(x) = 2 x − 1 16bit Arithmetic 2 The Result T3(x) = 4 x − 3 x In General T (x) = 8 x 4 − 8 x 2 + 1 Closing 4 . .

A recursive identity allows to determine the polynomials efficiently.

Tn+1(x) = 2 x · Tn(x) − Tn−1(x) Chebyshev ApproximationII

16bit Simulation with GNU Octave Andeas Stahel A function defined on [−1 , +1] is approximated by a polynomial Personal pN (x) of degree N. Context Compute on µC Z 1 Approximation 2 1 cn = f (x) Tn(x) √ dx 2 An Example π −1 1 − x Approximation 16bit Arithmetic N The Result c0 X f (x) ≈ pN (x) = + cn Tn(x) In General 2 n=1 Closing This is easily implemented in Octave.

If a function g(z) is defined on [a , b] then move it to the standard b−a interval [−1 , +1] by f (x) = g(a + (x + 1) · 2 ) Approximate arctan(x) by a Parabola

16bit Simulation with GNU Octave The above Chebyshev approximation can be used to approximate Andeas Stahel the function f (x) = arctan((1 + x)/2) by a parabola.

Personal 2 Context f (x) ≈ p2(x) = −0.0709107 · x + 0.394737 · x + 0.4625339 Compute on µC Approximation = (−0.0709107 · x + 0.394737) · x + 0.4625339

An Example Approximation The relative error of this approximation p2(x) can be determined 16bit Arithmetic The Result in bits, use log2(), leading to 7.97 ≈ 8 correct bits. In General Function and Chebyshev approximation Error of Chebyshev approximation Closing 0.8 0.003

0.002 0.6 0.001

0.4 0 y y

0.2 -0.001 -0.002 0 -0.003

-0.2 -0.004 -1 -0.5 0 0.5 1 -1 -0.5 0 0.5 1 x x Setup for the 16bit Computation

16bit Simulation with GNU To determine the 16bit values of Octave

Andeas Stahel p2(x) = (−0.0709107 · x + 0.394737) · x + 0.4625339 Personal Context aim for vector y16 and a factor yscale such that Compute on µC Approximation

An Example yscale · y16 ≈ p2(x) Approximation 16bit Arithmetic The Result The goal is to implement this evaluation with a 16bit arithmetic, In General while keeping the result as accurate as possible and avoiding 15 Closing overflow, i.e. results exceeding ±2 . Start with a fine grid of x-values, e.g. x=linspace(-1,1,100000); Since −1 ≤ x ≤ 1 we multiply x by xscale and convert to int16 with x16 ≈ xscale · x, such that

−MaxVal ≤ x16 ≤ +MaxVal = 215 − 1 = 32767

With this we use the full accuracy available on a 16bit arithmetic. Step 1: res1 = −0.070910677 · x + 0.3947365I

16bit Simulation with GNU Octave To perform the first Horner step proceed as follows: Andeas Stahel

Personal y = −0.070910677 · x + 0.3947365 32767 Context y16 = -32767 (use full scale) yscale = 0.070910677 Compute on µC 16 yscale·xscale Approximation prod16 = y16 * x16/2 yscale = 216 k An Example if |prod16 + yscale · 0.395| > 32767 rescale, divide by 2 Approximation add16 = int16(yscale * 0.3947365) integer to be added 16bit Arithmetic The Result y16 = prod16 + add16 In General Closing With the above numbers rescaling by 1/4 is required and leads to add16 = 22800 . The result y16 satisfies

yscale · y16 ≈ −0.070910677 ∗ x + 0.3947365 Step 1: res1 = −0.070910677 · x + 0.3947365II

16bit Simulation with GNU Octave Andeas Stahel The result can be visualized, using exact (double precision) and

Personal approximate (16bit) computations.

Context Compute on µC Result of step 1 40000 Approximation true 16bit An Example Approximation 20000 16bit Arithmetic The Result

In General 0

Closing y scaled

-20000

-40000 -1 -0.5 0 0.5 1 x Step 2: res2 = res1 · x + 0.4625339I

16bit Simulation with GNU Octave Andeas Stahel To perform the second Horner step proceed as follows: Personal

Context y = res1 · x + 0.4625339 Compute on µC 16 yscale·xscale Approximation prod16 = y16 * x16/2 yscale = 216 k An Example if |prod16 + yscale · 0.4625| > 32767 rescale, divide by 2 Approximation add16 = int16(yscale * 0.4625339) integer to be added 16bit Arithmetic The Result y16 = prod16 + add16 In General Closing With the above numbers no rescaling is required, thus add16 = 13357 The result y16 satisfies

yscale · y16 ≈ p2(x) Step 2: res2 = res1 · x + 0.4625339II

16bit Simulation with GNU Octave The result can be visualized, exact (double precision) and Andeas Stahel approximate (16bit) computations. Personal

Context Result of step 2 40000 Compute on µC true Approximation 16bit

An Example 20000 Approximation 16bit Arithmetic The Result 0 In General y scaled Closing -20000

-40000 -1 -0.5 0 0.5 1 x

1+x This is an approximation of the function arctan( 2 ). The Result

16bit Simulation with GNU Octave To examine the quality graph the difference of true function and its Andeas Stahel 16bit approximation. The accuracy is given by Personal 7.96 bit for difference to the arctan-function Context Compute on µC 13.6 bit for the difference to the polynomial p2(x) Approximation

An Example The error is dominated by the Chebyshev approximation. Approximation 16bit Arithmetic Difference to 16bit approximation Difference to 16bit approximation The Result 0.003 -0.0012 In General 0.002 Closing 0.001 -0.0014

0 -0.0016 -0.001 difference difference -0.002 -0.0018 -0.003

-0.004 -1 -0.5 0 0.5 1 0.05 0.1 0.15 0.2 0.25 x x The Resulting ++ Code

16bit Simulation with GNU Octave

Andeas Stahel #include Personal DEFUN DLD (atan32,args ,nargout ,... Context ”atan with int16 arithmetic”) Compute on µC // for x = z∗32767 and −1 <= z <=1 the value of Approximation // y = 28878.761∗ arctan((z+1)/2) will be computed An Example Approximation { 16bit Arithmetic The Result static int i0 = −32767; In General static int i1 = +22800;

Closing static int i2 = +13357; int x = args(0).int v a l u e ( ) ; int ; r = i1 +(( i0 ∗x)>>18); r = i2 +((x∗ r )>>16); return o c t a v e v a l u e list (octave value ( r ) ) ; } In GeneralI

16bit Simulation with GNU Octave

Andeas Stahel

Personal Context The above Horner steps can be implemented in an Octave Compute on µC Approximation function. Examine the code HornerStep.m. An Example The Chebyshev approximation can be of higher order, leading to Approximation 16bit Arithmetic more accuracy and more computational effort. The Result In General There is no need for manual intervention. One can pack all of

Closing the above in an Octave script. Examine the code atan16.m. Using the code in atan16.m for a Chebyshev approximation of order 4 leads to smaller errors. In GeneralII

16bit Simulation For the approximation by p4(x) of degree 4 we obtain an accuracy of with GNU Octave Andeas Stahel 12.3 bit for difference to the arctan-function 14.1 bit for the difference to the polynomial p (x) Personal 4

Context Compute on µC The error contributions by the Chebyshev approximation and the Approximation 16bit arithmetic are of the same magnitude.

An Example Approximation Difference to 16bit approximation 16bit Arithmetic 0.00015 The Result 0.0001 In General

Closing 5e-05

0

-5e-05 difference -0.0001

-0.00015

-0.0002 -1 -0.5 0 0.5 1 x A Fast Division in Hardware

16bit Simulation with GNU Octave

Andeas Stahel

Personal

Context The above technique was used to develop a fast hardware algorithm Compute on µC to divide numbers. Approximation

An Example Title: An Efficient Hardware Implementation for a Reciprocal Approximation Unit 16bit Arithmetic The Result Authors: A. Habegger, A. Stahel, J G¨otte,M. Jacomet all Bern In General University of applied Sciences Closing DELTA 2010: 5th IEEE Symposium on Electronics Design, Test & Applications What can I give to the community?

16bit Simulation with GNU Octave All of the above would not be possible without the help of the great Andeas Stahel Octave community. Personal Context Thank you guys Compute on µC Approximation It is only fair that I try to contribute too. An Example Approximation Find the lecture notes, codes and data on my web page 16bit Arithmetic The Result web.ti.bfh.ch/˜sha1 in the frame Octave, search for the file In General OctaveAtBFH.pdf. Or use the direct link Closing web.ti.bfh.ch/˜sha1/Labs/PWF/Documentation/OctaveAtBFH.pdf For a class on statistics I put together a collection of commands in web.ti.bfh.ch/˜sha1/StatisticsWithMatlabOctave.pdf . On a few occasions I reported bugs or contributed some code to Octave and its packages1.

1The help and support you get from the community is amazing and beats any tech support from commercial companies I deal with! 16bit Simulation with GNU Octave

16bit Simulation with GNU Octave

Andeas Stahel

Personal

Context Compute on µC That’s all folks Approximation

An Example Approximation Thank you for your attention 16bit Arithmetic The Result In General

Closing

Slides and codes are available at web.ti.bfh.ch/˜sha1/Octave/OctConf2017/