Is COBOL holding you hostage with Math?

Marianne Bellotti · Jul 28, 2018 · 12 min read

Face it: nobody likes fractions, not even computers.

When we talk about COBOL, the first question on everyone's mind is always: why are we still using it in so many critical places? Banks are still running COBOL. Close to 7% of GDP is dependent on COBOL in the form of payments from the Centers for Medicare & Medicaid Services. The IRS famously still uses COBOL. Airlines still use COBOL (Adam Fletcher dropped my favorite fun fact on this topic in his Systems We Love talk: the reservation number on your ticket used to be just a pointer). Lots of critical infrastructure in both the private and public sector still runs on COBOL.

Why? The traditional answer is deeply cynical. Organizations are lazy, incompetent, stupid. They are cheap: unwilling to invest the money needed upfront to rewrite the whole system in something modern. Overall we assume that the reason so much of civil society runs on COBOL is a combination of inertia and shortsightedness. And certainly there is a little truth there. Rewriting a mass of spaghetti code is no small task. It is expensive. It is difficult. And if the existing software seems to be working fine, there might be little incentive to invest in the project.

But back when I was working with the IRS, the old COBOL developers used to tell me: "We tried to rewrite the code in Java and Java couldn't do the calculations right." This was a very strange concept to me. So much so that a panicked thought popped immediately into my head: "My God, the IRS has been rounding up everyone's tax bill for 50 years!!!" I simply could not believe that COBOL could beat Java at the type of math the IRS needed. After all, they weren't launching men into space over at New Carrollton.

One of the fun side effects of my summer learning COBOL is that I'm beginning to understand that it's not that Java can't do math correctly, it's how Java does math correctly. And when you understand how Java does math and how COBOL does the same math, you begin to understand why it's so difficult for many industries to move away from their legacy.

Where's Your Point?

I took a little break from writing about COBOL to write about the ways computers stored information before binary became the de facto standard (and also a primer on how to use the z/OS interface, but that's neither here nor there). Turns out that was a useful digression when considering this problem. In that post I talked about the various ways one might use on/off states to store base 2 numbers, base 3 numbers, base 10 numbers, negative numbers and so on. The one thing I left out was: how do we store decimals?

If you were designing your own binary computer, you might start off by just sticking with a base 2 representation. Bits left of the point represent 1, 2, 4, 8… and bits right of the point represent 1/2, 1/4, 1/8…

[Figure: 2.75 in binary]

The problem is figuring out how to store the decimal point — or actually I should say the binary point, because this is base two after all. This is not an obscure topic, so you may already realize that what I'm describing is the difference between floating point and fixed point.
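Before getting to that split, here's a toy Python sketch of the base 2 conversion itself (the to_binary helper is purely illustrative): the integer bits come from ordinary base 2 conversion, and the fractional bits come from repeatedly doubling the fraction and peeling off whatever lands to the left of the point.

    # Toy illustration only: print a number's base 2 digits, integer part
    # and fractional part, up to a fixed number of fractional bits.
    def to_binary(value, frac_bits=8):
        int_part = int(value)
        frac_part = value - int_part
        digits = bin(int_part)[2:]          # integer bits: 2 -> '10'
        digits += '.'
        for _ in range(frac_bits):
            frac_part *= 2                  # shift the binary point right by one place
            bit = int(frac_part)            # the digit that just crossed the point
            digits += str(bit)
            frac_part -= bit
        return digits

    print(to_binary(2.75))   # 10.11000000  ->  2 + 1/2 + 1/4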
In floating point, the binary point can be placed anywhere (it can float), with its exact location stored as an exponent. Floating the point gives you a wider range of numbers you can store. You can move the point all the way to the back of the number and devote all the bits to integer values representing very large numbers, or you can move it all the way to the front and represent very small numbers. But you sacrifice precision in exchange. Take another look at the binary representation of 2.75 above. Going from four to eight is a much longer jump than going from one-fourth to one-eighth. It might be easier to visualize if we wrote it out like this:

[Figure: the same place values written out with their gaps. Don't measure the gaps, I just eyeballed this to demonstrate the concept.]

It's easy to calculate the difference yourself: the distance between 1/16 and 1/32 is 0.03125, but the distance between 1/2 and 1/4 is 0.25. Why does this matter? With integers it really doesn't, some combination of bits can fill in the gaps, but with fractions things can and do fall through the gaps, making it impossible for the exact number to be represented in binary.

The classic example of this is .1 (one-tenth). How do we represent this in binary? 2⁻¹ is 1/2 or .5, which is too large. 1/16 is .0625, which is too small. 1/16 + 1/32 gets us closer (0.09375), but 1/16 + 1/32 + 1/64 knocks us over with 0.109375. If you're thinking to yourself that this could go on forever: yes, that's exactly what it does.

Okay, you say to yourself, why don't we just store it the same way we store the number 1? We can store the number 1 without problems, so let's just take out the decimal point and store everything as an integer. Which is a great solution to this problem, except that it requires you to fix the decimal/binary point at a specific place. Otherwise 10.00001 and 100000.1 become the same number. But with the places right of the point fixed at two, we'd round off 10.00001 to 10.00 and 100000.1 would become 100000.10. Voilà! And that's fixed point.

Another thing you can do more easily with fixed point? Bring back our good friend Binary Coded Decimal. FYI, this is how the majority of scientific and graphing calculators work, things that you really want to be able to get math right.

[Figure: Binary Coded Decimal, again. Remember me? BCD baby~]

Muller's Recurrence

Fixed point is thought to be more precise because the gaps between adjacent values are consistent and rounding only occurs when you've exhausted the available places. With floating point we can represent larger and smaller numbers with the same amount of memory, but we cannot represent all numbers within that range exactly and must round to fill in the gaps.

COBOL was designed for fixed point by default, but does that mean COBOL is better at math than more modern languages? If we stuck to problems like .1 + .2 the answer might seem like a yes (there's a quick demonstration of that at the end of this section), but that's boring. Let's push this even further. We're going to experiment with COBOL using something called Muller's Recurrence.

Jean-Michel Muller is a French computer scientist with perhaps the best computer science job in the world. He finds ways to break computers using math. I'm sure he would say he studies reliability and accuracy problems, but no no no: he designs math problems that break computers.
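Before getting to Muller's handiwork, here is the boring .1 + .2 case for comparison, as a quick sketch: Python floats on one side, and the standard library's decimal module (which keeps its digits in base 10, in the same spirit as fixed point) on the other.

    from decimal import Decimal

    # Binary floating point: .1 and .2 each get rounded to the nearest
    # representable binary fraction, and the leftover error shows up in the sum.
    print(0.1 + 0.2)                          # 0.30000000000000004
    print(0.1 + 0.2 == 0.3)                   # False

    # Decimal arithmetic keeps its digits in base 10, so .1 and .2 are exact.
    print(Decimal('0.1') + Decimal('0.2'))    # 0.3
    print(Decimal('0.1') + Decimal('0.2') == Decimal('0.3'))   # True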
One of the problems Muller designed is his recurrence formula, which looks something like this:

x[n+1] = 108 - (815 - 1500/x[n-1]) / x[n]

(From Muller's Recurrence — roundoff gone wrong)

That doesn't look so scary, does it? The recurrence problem is useful for our purposes because:

- It is straightforward math, no complicated formulas or concepts.
- We start off with two decimal places, so it's easy to imagine this happening with a currency calculation.
- The error produced is not a slight rounding error but orders of magnitude off.

And here's a quick Python script that produces floating point and fixed point versions of Muller's Recurrence side by side:

    from decimal import Decimal

    # One step of the recurrence: y is x[i-1], z is x[i-2]
    def rec(y, z):
        return 108 - ((815 - 1500/z)/y)

    # Run the recurrence with ordinary binary floating point numbers
    def floatpt(N):
        x = [4, 4.25]
        for i in range(2, N+1):
            x.append(rec(x[i-1], x[i-2]))
        return x

    # Run the same recurrence with Decimal, standing in for fixed point
    def fixedpt(N):
        x = [Decimal(4), Decimal(17)/Decimal(4)]
        for i in range(2, N+1):
            x.append(rec(x[i-1], x[i-2]))
        return x

    N = 20
    flt = floatpt(N)
    fxd = fixedpt(N)

    for i in range(N):
        print(str(i) + ' | ' + str(flt[i]) + ' | ' + str(fxd[i]))

Which gives us the following output:

    i  | floating pt    | fixed pt
    -- | -------------- | ---------------------------
    0  | 4              | 4
    1  | 4.25           | 4.25
    2  | 4.47058823529  | 4.4705882352941176470588235
    3  | 4.64473684211  | 4.6447368421052631578947362
    4  | 4.77053824363  | 4.7705382436260623229461618
    5  | 4.85570071257  | 4.8557007125890736342039857
    6  | 4.91084749866  | 4.9108474990827932004342938
    7  | 4.94553739553  | 4.9455374041239167246519529
    8  | 4.96696240804  | 4.9669625817627005962571288
    9  | 4.98004220429  | 4.9800457013556311118526582
    10 | 4.9879092328   | 4.9879794484783912679439415
    11 | 4.99136264131  | 4.9927702880620482067468253
    12 | 4.96745509555  | 4.9956558915062356478184985
    13 | 4.42969049831  | 4.9973912683733697540253088
    14 | -7.81723657846 | 4.9984339437852482376781601
    15 | 168.939167671  | 4.9990600687785413938424188
    16 | 102.039963152  | 4.9994358732880376990501184
    17 | 100.099947516  | 4.9996602467866575821700634
    18 | 100.004992041  | 4.9997713526716167817979714
    19 | 100.000249579  | 4.9993671517118171375788238

Up until about the 12th iteration, the rounding error seems more or less negligible, but things quickly go off the rails.
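If you want to see what the recurrence is supposed to do, here's one more quick sketch that reuses the same rec formula with Python's fractions module, which does exact rational arithmetic and never rounds at any step:

    from fractions import Fraction

    def rec(y, z):
        return 108 - ((815 - 1500/z)/y)

    # Same recurrence, but with exact rational numbers: Fraction never rounds,
    # so this is the true sequence with no representation error at all.
    x = [Fraction(4), Fraction(17, 4)]
    for i in range(2, 21):
        x.append(rec(x[i-1], x[i-2]))

    for i, value in enumerate(x):
        print(i, float(value))   # by i = 20 the exact value is about 4.99993

The exact sequence creeps steadily toward 5, which is what both columns above are chasing until their rounding errors catch up with them. The floating point run settling around 100 is not a rounding blip; it's a completely different answer.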