
Hilbert’s Program Revisited

Volume I

Base case: ω-computation

Dr Stephen Gaito PerceptiSys Ltd April 7, 2019

Perish then Publish Press.


This document was prepared using ConTEXt and LuaTEX on April 7, 2019.

Unless explicitly stated otherwise, all content contained in this document which is not software code is Copyright © 2019 PerceptiSys Ltd (Stephen Gaito) and is licensed for release under the Creative Commons Attribution-ShareAlike 4.0 International License (CC-BY-SA 4.0 License). You may obtain a copy of the License at http://creativecommons.org/licenses/by-sa/4.0/ Unless required by applicable law or agreed to in writing, all non-code content distributed under the above CC-BY-SA 4.0 License is distributed on an ‘AS IS’ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the CC-BY-SA 4.0 License whose URL is listed above for the specific language governing permissions and limitations under the License.

Again, unless explicitly stated otherwise, all content contained in this document which is software code is Copyright © 2019 PerceptiSys Ltd (Stephen Gaito) and is licensed under the Apache License, Version 2.0 (the "License"); you may not use this code except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the Apache 2.0 License is distributed on an ‘AS IS’ BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the Apache 2.0 License whose URL is listed above for the specific language governing permissions and limitations under the License.


Wee, sleekit, cowrin, tim’rous beastie, O, what a panic’s in thy breastie! Thou need na start awa sae hasty Wi bickering brattle! I wad be laith to rin an’ chase thee, Wi’ murdering pattle. (Robert Burns’ “To a Mouse” November 1785)

Question: What Physics could a wee beastie write down, if a wee beastie could write down Physics? Answer: Quantum Relativity.

This cycle of papers is dedicated to all beasties great and small who have given their lives so that one species can know itself.


Preface


Despite the wealth of fruitful work inspired by Hilbert’s Program, it is generally assumed that Gödel’s two Incompleteness Theorems show that Hilbert’s Program has failed. The purpose of this document is to show that, while Hilbert’s logical Program is doomed to failure, a computational version of Hilbert’s Program is very definitely capable of providing a neo-platonist foundation of mathematics.

The important distinction here is that Gödel’s theorems show that the set of ‘True’ statements of a given consistent formal theory which is at least as powerful as Peano Arithmetic is a recursively enumerable set which is not recursive. This is what makes Hilbert’s logical Program fail. In this analysis, one of the two key requirements is that the formal theory is consistent; that is, you can never ‘prove’ both a statement and its negation. It is this assumption of consistency which forces the set of ‘True’ statements to be recursively enumerable but not recursive.

By working computationally, we can, with an assumption equivalent to the Axiom of Choice, build recursive sets which can provide interpretations of standard Zermelo–Fraenkel set theory (with the Axiom of Choice). For each ordinal, λ, we can define λ-computation. Standard computation is then ω-computation. Using a computationally natural form of Vopěnka’s Principle¹, we can build recursive sets of the size of a Vopěnka cardinal, into which we can then build an interpretation of Zermelo–Fraenkel set theory (ZFC), hence showing that ZFC is (computationally) consistent.

By taking Cantor’s Absolutely infinite multiplicities (AIMs)² seriously, we can do a significant amount of classical analysis even with ω-computational resources. For a given amount of computational resources, we build a dual pair of structures³ which correspond to well-founded sets and non-well-founded proper classes or, in the terminology of Computer Science, data and processes. Categorically these dual structures are a dual Topos / Co-Topos pair.

While the classical Reals are located in the well-founded sets (Topos) of ZFC and hence obey classical logic, the ω-computational Reals are processes (Co-Topos) and obey a weaker process logic. The important realization here is that the ‘strangeness’ of Quantum Mechanics comes from trying to understand the essentially process nature of Quantum Mechanics within the logic of classical sets. We assert that the natural outcome of developing the ‘logic’ of finite process structures is Quantum Relativity. Unfortunately, this is a topic for a subsequent document.

In this volume, we will concentrate on the base case of ω-computation. Subsequent volumes will look at how to define the transfinite ordinals and the corresponding λ-computation for ω ≤ λ. With λ-computation, we can then show that ZFC has a model and hence is computationally consistent.

1 See Adámek and Rosický’s Locally Presentable and Accessible Categories, [AR94].
2 See Cantor’s letter to Dedekind dated 1899, [Hei67b].
3 See Aczel’s Non-Well-Founded Sets, [Acz88], or Barwise and Moss’ Vicious Circles, [BM96].


TODO: make the recursively-enumerable/partially-recursive versus recursive/totally-recursive terminology more uniform.


Contents


Preface

Contents

1 Theory

1.1 Introduction
1.1.1 Goals
1.1.2 Some Philosophy
1.1.2.1 Hume’s problem
1.1.2.2 What is a thing?
1.1.2.3 A Neuron’s eye view
1.1.2.4 Mappae Mundi
1.1.2.5 Hilbert’s program
1.1.2.6 Identity
1.1.3 Some Mathematics
1.1.3.1 A tale of two foundations
1.1.3.2 What does a Mathematician do?
1.1.3.3 Cast of thousands
1.1.3.4 A Beastie’s environment
1.1.3.5 Barr’s ladder
1.1.3.6 Lawvere’s Theorem
1.1.3.A First section appendix
1.1.3.A.1 A test subsubsection
1.1.3.B Second section appendix
1.1.3.C Third section appendix
1.1.4 Some Computer Science
1.1.4.1 Extensional and Intensional Objects
1.1.5 Some Software Engineering
1.1.5.1 JoyLoL’s Software Architecture
1.1.5.2 JoyLoL’s Complexity Layers
1.1.5.3 JoyLoL’s processing components
1.1.6 Strategy

1.2 Of Wee Beasties and Idealized Mathematicians
1.2.1 A wee beastie’s reality, the JoyLoL interpreter
1.2.2 What can a ‘wee beastie’ do?
1.2.3 What can a ‘wee beastie’ know?
1.2.4 Identity and Bisimulation
1.2.5 Rigour
1.2.A Texts: transcribing Lists of Lists
1.2.A.1 Characters
1.2.A.2 Symbols
1.2.A.3 Texts
1.2.A.4 Parsing
1.2.B Containers
1.2.B.1 Lists
1.2.B.2 Stacks
1.2.B.3 Dictionaries
1.2.B.4 Sets and Power Sets

1.3 Computational examples
1.3.1 Tree of binary processes
1.3.2 Power multi-set
1.3.3 Coalgebra
1.3.4 Co-Lawvere Theorem

1.4 Formal System
1.4.1 Mappae Mundi
1.4.2 Semantic Functor
1.4.3 Syntactic Functor

2 Implementing JoyLoL in JoyLoL

2.1 Abstract Syntax Trees
2.2 Parser
2.3 Type inferencer
2.4 Verifier
2.5 Interpreter

Bibliography

Glossary

Symbols

Subject Index


1 Theory


1.1 Introduction


1.1.1 Goals

Every long-term researcher should have a silly question, if only to keep them focused upon whole problems amidst all of the unending detail. Your ultimate destination, whether or not you get there, influences how you plan to get there. The problem for any researcher is that the research ‘space’ is infinite dimensional. From the dazzle of choices, you must make ‘one’ sequence of choices which, one hopes, ultimately leads to your answer(s). The vantage point provided by your ultimate goal influences the questions asked and hence the answers found. The choice of where you want to get to does have a profound influence on how you choose to get there. My silly question is: Why do babies babble?

There are (at least) two aspects to this question:

1. How do organic naive ‘brains’ build models of reality? Too much of classical artificial intelligence focuses on incremental learning based upon the capabilities of ‘adult’ learners. However, this misses the point of learning models of Reality ab initio.

2. Equally important is the question of what constitutes an efficient model. Animal brains must keenly balance energy use with the comprehensiveness of a given model. Any animal that gets this wrong too much of the time becomes someone else’s lunch.

So put simply, my objective, as a mathematician, is to build efficient and mathematically rigorous models of Reality. However, if you are going to build a mathematically rigorous theory of Reality, you must first provide a mathematically rigorous theory of Mathematics itself. This document is devoted to providing just such a rigorous foundation for Mathematics.

Since the ancient Greeks, Western-influenced Philosophical, Scientific and Mathematical practice has been to equate rigour with proofs of ‘Truth’. Euclid’s ‘Elements’ has, for just over two millennia, provided the pre-eminent example of this paradigm of rigorous proof. However, Gödel’s two Incompleteness theorems show that classical proofs through logic are unable to provide the rigorous foundations we require. This is the failure of Hilbert’s logical Program.

Essentially, Gödel’s work around the early 1930s represents the first contributions to the theory of computation. Gödel’s theorems from this period concern themselves with our ability to compute ‘Truth’, and the fact that such computation of ‘Truth’ is only partially recursive, rather than totally recursive.


Instead of computing ‘Truth’, is there a more useful computation which might provide a foundation of Mathematics? By exploring Hilbert’s Program within a computational framework, the objective of this document is to show that the answer to this question is yes.

1.1.2 Some Philosophy

In Western philosophy, since at least the time of the ancient Greeks, there has been a wide range of Philosophical theories of the ‘Reality’ of ‘Reality’. Our objective in building a rigorous Mathematical theory of Reality is not to prove any of these Philosophical theories (in)correct. Instead our objective, and really the only one available Mathematically, is to explore what a finite computational device, ‘a wee beastie’, can learn about Reality.

1.1.2.1 Hume’s problem

Ignoring Hume’s ‘Problem of Induction’ for the moment, as a Scientist and Engineer, like any young child, I live in the belief that I can both learn about and, more importantly, interact with Reality. To bastardize Descartes, from moment to moment, I can see that I have made marks in the sand, therefore I am.

It is naive to assert that finite beings, such as ourselves, cannot learn to predict at least some of the future. Russell’s farmyard birds, given their limited cognitive abilities, are rational to expect to be fed daily⁴. However, for any finite being, there will always be events, some highly critical events, which are outside of that being’s ability to know about and hence predict. This is emphatically the ‘wee’ in my understanding of a ‘wee beastie’. For any finite being, there is always a more capable being; we might just not yet have found these more capable beings. Understanding the limits of being finite is the true import of Hume’s Problem.

It is equally naive to assert that a finite being cannot interact with their environment. I can communicate with you over both distance and time. We can build (finite) computational devices. I am writing this document using one such device; you are no doubt reading this document using at least one other. So it is at least potentially reasonable to expect that finitist Mathematics could be founded computationally.

TODO: What does it mean to exist mathematically?

1.1.2.2 What is a thing?

From the range of the basic questions of metaphysics we shall here ask this one question: “What is a thing?” The question is quite old. What remains ever new about it is merely that it must be asked again and again.⁵

4 See chapter VI, ‘On Induction’, in Russell’s ‘The Problems of Philosophy’, [Rus12], near page 98.
5 Martin Heidegger, page 1, first paragraph, in [Hei67a], as quoted by [DI08].


A Zen Master would respond that there is no-thing, there is only is-isness, existence in its entirety⁶. That quarrelsome ‘thing’, ‘I’, is only an illusion. The hardest thing any ‘one’ can do is to ignore the ‘I’ in order to ‘see’ the ‘is’. The dissolution of this ‘I’ is un-important in the context of the ‘is’. In our work, this point of view will be indispensable.

However, for most of ‘us’, such a view point is very hard to hold. We all play a ‘me’-‘environment’ game with existence. This view point is equally important for our work.

A ‘quark’ plays the absolute simplest of games, a ‘quark’-‘everything-else’ game. A ‘quark’ re-acts to its environment. Any model of a ‘quark’ is a (fairly) simple S-Matrix.

A frog’s game is only slightly less simple: there is the frog, there are ‘things’ that are small enough to be potential food, there are ‘things’ that are so large they might be predators, and finally there are ‘things’ which might be potential mates. A frog’s ‘environment’ has some substructure: ‘prey’, ‘predators’, and ‘mates’. We assume that a frog’s brain models, to a sufficiently complex level of detail, these three ‘things’. However, by and large, frogs do not need to expend much more energy on playing any more complex games, so they don’t. By modelling only the most important categories, frogs can save energy by not building and maintaining complex and energetically expensive nervous systems⁷.

A human’s game is much more complex. We regularly split ‘our’ environment into many, many ‘things’: ‘objects’ for which we build wide classes of models of their behaviour and even their potential internal, ‘intentional’, state of ‘mind’. Chairs and mugs have different uses. Metals and glasses have different abilities to be re-fashioned into useful tools. Animals have widely different behaviours, providing useful companions or dangerous enemies. Even more complex, though, are our ‘models’ of other humans. Each person in our environment has widely differing objectives of their ‘own’, all of which we must, and, by and large, do, keep track of.

For each of these objects we take an intentional stance⁸: they each have various, though widely differing, abilities to re-act, or intend, with ‘me’. These different intentional abilities are reflected in the overall complexity of the various models we build to represent any particular ‘thing’.

So how do we, most efficiently, build these models? Naively, we all ‘know’ what an object ‘is’ when we ‘see’ it. Physics suggests that all material things are made up of sub-atomic particles which are in turn made up of ‘quarks’. It is ‘obvious’ at one level that ‘I’ and ‘you’ are ‘different’ things. However, when I shake your hand, where do ‘my’ ‘quarks’ end and ‘your’ ‘quarks’ begin? At the ‘most basic’ level, we cannot separate one ‘thing’ from another ‘thing’. A Zen master’s view is actually deeply entwined with any complete mathematical model of ‘things’. How do we reconcile these multiple levels of ‘being’ into one comprehensive and complete model of ‘Reality’?

Andreas Döring and Chris Isham, in their paper What is a thing?, [DI08], suggest that Physics can best be captured using the variable sets point of view provided by a Categorical Topos⁹. For us, a Topos over a collection of descriptive ‘levels’ provides the natural tool with which to capture the coherent variation of ‘thing-ness’ as our level of description varies. This suggests that the use of the structuralist Categorical point of view, in general, and the associated variable descriptive Topos point of view, in particular, will be very important to our work.

6 Indeed a Zen Master would refuse to use mere words. To slightly mix philosophies, ‘The Tao which can be named is not the Tao’, or again in Jewish tradition, God is not to be directly named. Words differentiate ‘things’, and Zen’s ‘existence-in-entirety’, the Tao and God are beyond all human limits to differentiate, identify or understand. We, as limited beings, can only experience. This is similar to Cantor’s expressed understanding of his Absolute Infinite Magnitudes; again, see Cantor’s letter to Dedekind dated 1899, [Hei67b].
7 See, for example, Ewert’s Motion Perception Shapes the Visual World of Amphibians, [Ewe04]. While this reference focuses primarily on the visual system of amphibians, it indicates an overall lack of need for complex models in an amphibian’s nervous system. It is estimated, [RG02], that the adult human brain consumes around 20% of all calories consumed each day, yet the human brain only represents around 2% of our body weight. Nervous systems are relatively expensive to keep running. For most of a frog’s needs, this additional complexity is not needed; this is a frog’s evolutionary niche.
8 See Daniel Dennett’s ‘The Intentional Stance’, [Den87].

1.1.2.3 A Neuron’s eye view

So let’s reflect for a moment on a neuron’s point of view. When a neuron fires, it ‘means’ something, but what does it mean? Each neuron is deeply embedded inside a complex collection of other neurons, each collecting the spike trains from countless other neurons. In some sense each neuron integrates the ‘information’ conveyed by each of these spike trains from up-stream neurons.

What sort of information should these spike trains be communicating? At the very least they should be communicating some important value. This is what artificial neural networks model.

TODO: Bayesian brain, [DIP+07], Spikes, [RWD+99], Pouget, [BP07] and [KP04], Markov models, Shalizi spatial models, [Sha01], section 10.2.1 Why Global States Are not Enough, required. Markov has restrictive scope of description... but by recoding and using levels of description, we can recover the effect of history on the present and future.

TODO: talk about Walley’s oversight... [Wal91]. There is a deep distinction between the ideal asymptotic reals and the finite subsequences of processes.

TODO: rework the following paragraphs

So this paper generalizes the collected work of Spitters and Coquand (see, for example, [CS09])¹⁰, together with Walley’s work on imprecise probabilities, to produce a computable measure theory which a beastie could use.

TODO: Add the work of [Jac06] and its generalization in Isham [DI08], Section 8.2

As has become traditional in Imprecise Probability theory, in his book, Statistical Reasoning with Imprecise Probabilities, [Wal91], Walley makes the upper and lower previsions (expectations) the primary objects of study, with upper and lower probabilities as derived concepts. Since we will be concerned with probability based Markov structures, we will reverse this orientation. One of the reasons Walley chose to work with previsions (expectations) instead of probabilities is his belief that Lower Probabilities did not determine Lower Previsions (see section 2.7.3, page 82, [Wal91]). In fact we will show below that, with the correct definition of upper and lower measures and upper and lower integrals, lower probabilities do determine lower previsions. This will be the substance of the (Imprecise) Dedekind-Riesz Representation Theorems proven below.

TODO: paragraphs above

TODO: So what is a thing? A thing is an imprecise Markov model at various levels of description.

One of our ultimate goals is to provide a rigorous theory of computational imprecise measure theory sufficient to be able to provide the theory required to build computational ‘wee beasties’.

9 See Lawvere’s concept of variable sets, [Law75] or [LR03].
10 It is also important to see the related work of Heunen, Landsman and Spitters, [HLS09].

1.1.2.4 Mappae Mundi

From our ‘modern’ point of view, Mappae Mundi are, at best, highly distorted (navigational) ‘maps’. From the original ‘cartographer’s’ point of view, each Mappa Mundi provides a useful representation of an important understanding about the cartographer’s ‘world’. However, a modern notion of ‘navigation’ was highly unlikely to be the primary purpose of any given historical Mappa Mundi.

Research Mathematicians who habitually cross (standard) mathematical disciplines, of necessity, become collectors of mathematical mappae mundi. Each (mathematical) mappa mundi represents an idea in a given discipline, which is slightly ‘wrong’ or ‘misshapen’ for the research mathematician’s problem(s) at hand. The ‘problem(s) at hand’ are very unlikely to be well represented (if at all) on any existing mathematical mappa mundi. This is, after all, the point of mathematical research.

TODO: list my mappae mundi

1.1.2.5 Hilbert’s program

The primary objective of Hilbert’s ‘Program’ was to provide a rigorous foundation for Analysis whose metamathematics is finitist. A modern paraphrase of Hilbert’s program would be the following (logicist’s) statement:

Find a logical axiomatization of mathematics whose logical proof of consistency is ‘finitist’.

Hilbert was, himself, notoriously vague about what constituted a ‘finitist’ (logical) proof. However, his program helped develop our current understanding of metamathematics, logical proof, as well as finitism. A statement of a computational version of Hilbert’s program is very similar:

Find a computational axiomatization of mathematics whose computational proof of computability is, at some meta-level, finite.

A (formal) logical proof of a logical statement about a mathematical object (typically that the object exists and satisfies some collection of properties) is a textual description of a logical derivation of the truth of the given statement, using one or another particular logical derivation system. Using a transfinite logic, it is possible that the complete unfolding of the textual description might itself be transfinite. However, the derivation that the textual description describes is a mathematical object in its own right, and hence the truth of the derivation can be subject to its own proof. The distinctive structure here is that a mathematical object at one level is proved to exist and satisfy various properties, using a mathematical object (a proof derivation) at one meta-mathematical level higher.

So a modern interpretation of Hilbert’s Program would be to show that there exists a finite derivation of the proof of the consistency of, for example, Zermelo–Fraenkel set theory with the Axiom of Choice (ZFC). The objects in ZFC might be transfinite, and/or have transfinite proofs of existence, but the consistency proof of ZFC as a ‘whole’ must be finite. As is well known, Gödel’s two Incompleteness Theorems show that any reasonably useful axiomatization of mathematics has no finite proof of consistency.

Similarly, a computational proof that a computer program computes what it purports to compute is the execution of a proof verification program. This proof verification program, being computational, can itself be subject to its own computational proof of correctness. Again the distinctive structure here is that a program at one level is proved correct using a program at one meta-computational level higher. While a given computation might have a transfinite unfolding (or ‘trace’), the description of the computation, the computation’s ‘program’, is a textual object which can be manipulated by another program. To be ‘finitist’, at some meta-computational level the proof of correctness must have a finite trace.

TODO: discuss the concept of data versus processes. Classical computation theory has been mostly focused upon data not processes.

The classical Reals, which are required for classical Analysis, are uncountable. Once we have defined the Ordinals, we will find that there is a reasonable definition of λ-computation for each ordinal, λ. If we assume a computational equivalent to the ‘standard’ Axiom of Choice, then we can define the transfinite ordinals and hence computational structures which can interpret the uncountable collection of the Reals (as data). Using this structure, we will then be able to develop the classical theory of Analysis, which was Hilbert’s ultimate goal.

Alternatively, we can define the Reals as (measurement) processes, without requiring any transfinite ordinals. The resulting theory of Analysis will not be quite classical, since the internal logic of the collection of processes is not classical. However, I conjecture that this process logic provides an explanation of the ‘strangeness’ of, and hence the correct foundations for, Quantum Mechanics.

TODO: Need to introduce collection of processes as a (co)algebraic collection/structure.... we distinguish different processes by observing them...

TODO: For this work the focus upon the well-founded/data/algebraic versus the non-well-founded/process/co-algebraic is all pervasive.

TODO: Categorical thought == structuralist point of view. Quote [Awo09].

TODO: discuss reals as data versus reals as processes == imprecise Reals.
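As a loose illustration of what treating a Real as a (measurement) process might look like, the following Python sketch models the square root of 2 as a generator which, each time it is observed, yields a narrower rational interval containing the value. The name `sqrt2_process`, the use of bisection and the choice of rational intervals are all assumptions made for this illustration; the text above does not fix any particular construction.

```python
from fractions import Fraction
from itertools import islice
from typing import Iterator, Tuple

# Hypothetical representation: one observation is a rational interval.
Interval = Tuple[Fraction, Fraction]

def sqrt2_process() -> Iterator[Interval]:
    """An unending process whose observations bracket sqrt(2).

    Only finitely many observations are ever made, and each one is a
    finite, exact object (a pair of rationals).
    """
    lo, hi = Fraction(1), Fraction(2)  # 1*1 < 2 < 2*2
    while True:
        yield lo, hi
        mid = (lo + hi) / 2
        # Bisection (an assumed refinement rule): keep the half that
        # still contains sqrt(2). mid*mid never equals 2 for rational mid.
        if mid * mid < 2:
            lo = mid
        else:
            hi = mid

# Observe the process a finite number of times.
observations = list(islice(sqrt2_process(), 5))
```

Each observation halves the width of the previous interval, so any finite prefix of the process gives an imprecise but exact bound, while the ‘ideal’ Real is only ever approached.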

1.1.2.6 Identity

TODO: Talk about identity of a shell/pea game. At the quark level of description there is no shell, pea or table to ‘identify’. At a ‘human’ level, we use mental models to track our understanding of the effects of the shells and table upon the motion and eventual location of the ‘pea’.

TODO: Our usual notion of identity depends upon our intent. Do we care about the pea in the shell/pea game? Is there some gain for us in keeping track of the pea?

TODO: The identity of a ‘thing’ usually does not extend across too many levels of description.

TODO: Mathematically same means that at some metalevel we have a finite collection of tests.

1.1.3 Some Mathematics

1.1.3.1 A tale of two foundations

In this document, we are explicitly re-founding mathematics using a computational as opposed to a logical tool-set. To a classically trained mathematician, these computational foundations will be strange at first. The complexity of the foundations of any building prefigures the building itself. However, any foundations only make sense if one reflects on what the building will be rather than what the foundations are. It is no different in mathematics.

To help overcome the initial strangeness of these computational foundations, we will provide a running commentary using (as yet) classical mathematical terminology. Once we have the new foundations secure we can translate all of classical mathematics into the new tools. However, until the foundations are secure, we need to carefully distinguish between extra-foundational commentary using classical mathematics, typically classical Category theory, and the actual re-founded foundations.

Classical commentary

As this paragraph shows, we will distinguish any extra-foundational comments by placing them between grey angled over and under bars. The angled over bar will also contain the words Classical commentary.

From the beginning of the next subsection, all extra-foundational comments will be carefully delineated from the re-foundations themselves.

1.1.3.2 What does a Mathematician do?

Classical commentary

With a computational foundation for Mathematics, from the point of view of Computer Science, the task of Mathematics is to provide various specialized rigorous programming languages. Each of these programming languages provides users, engineers, scientists and other mathematicians with languages in which complicated computations are easier to understand and perform. The languages of Group theory, Lie Algebras, Differential Topology, Number theory, and Algebraic Geometry are just one scattered collection of examples.

Any given programming language consists of a pair of a syntax and a corresponding semantics. The syntax defines which finite texts represent valid static descriptions of the dynamic unfolding of various computations. The corresponding semantics provides a compositional interpretation of the meaning of any given syntactic text. In the theory of Computer Science, the connection between a given syntax and a given semantic model is a collection of GSOS laws. From a categorical point of view, any collection of GSOS laws is represented by a distributivity law, a natural transformation, between a pair of endo-functors. The initial algebra of one of these endo-functors represents the syntax of the programming language. The final coalgebra of the other endo-functor represents the semantics. The natural transformation itself is effectively an interpreter for the programming language.

The importance of this description is that there can be many syntaxes representing the various mathematical languages, areas or disciplines. Equally, there could be many different semantics into which a given syntax is interpreted. At the moment, within the logical foundation of mathematics, the semantics of mathematics is generally agreed to be set theory augmented with first-order logic. That is, it is generally assumed that any mathematical discipline can be transcribed or interpreted in the language of first-order set theory, which in turn provides a ‘rigorous’ meaning to statements in the original discipline.
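The pairing of a syntax with a compositional semantics described above can be illustrated, very loosely, by a toy example. In the Python sketch below the syntax is a small language of arithmetic expressions written as nested tuples, and the semantics is a function which computes the meaning of an expression purely from the meanings of its immediate sub-expressions. The expression forms, the name `evaluate` and the choice of integers as the semantic domain are assumptions made for this illustration only; the sketch is not a GSOS law and is not part of JoyLoL.

```python
from typing import Union

# Hypothetical toy syntax: an expression is either an integer literal or
# a tuple (operator, left, right) with operator "add" or "mul".
Expr = Union[int, tuple]

def evaluate(expr: Expr) -> int:
    """Compositional semantics: the meaning of an expression depends
    only on the meanings of its immediate sub-expressions."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    if op == "add":
        return evaluate(left) + evaluate(right)
    if op == "mul":
        return evaluate(left) * evaluate(right)
    raise ValueError(f"unknown operator: {op!r}")

# One finite text (syntax) and its interpretation (semantics).
example = ("add", 2, ("mul", 3, 4))
result = evaluate(example)
```

A different semantics for the same syntax (for example, one which returns the size of the expression rather than its value) could be supplied by swapping `evaluate` for another function of the same shape, which is the sense in which one syntax may have many semantics.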


For a computational foundation of Mathematics, we seek a semantics in which various syntaxes can be easily interpreted via GSOS laws. While there may be many categorically equivalent semantics, including, for example, some form of first- order set theory, we will, in this document, base our semantics on Lists of Lists. We will show that these Lists of Lists are effectively a fixed point of the ‘semantic interpretation functor’. A critical criteria which the Lists of Lists semantics satisfies which almost any form of first-order set theory will not satisfy, is a combination of textual andcon- ceptual simplicity. Like any axiomatic theory, the basic axioms must be assumed ‘true’, or in our compuational case ‘computationally correct’. Our Lists of Lists semantics will rely on a very small collection of computations which are, for Lists of Lists, both textually simple and ‘obviously computationally correct’. In this computational interpretation, what is a proof? Classically we have a dis- tinction between constructive and (non-constructive) existence proofs. Constructive proofs are, by and large, essentially computations. Indeed in one of the most com- mon formalisms of constructive logic, Per Martin-Löf’s type theory, there is a the- ory of how to extract computational programs from the constructive type theoretic proof. We will argue that all non-constructive existence proofs correspond to searches. The proof by contradiction is essentially a proof that a given search algorithm will complete, we just do not know how or when. Nor can we provide a closed form solution, all we can provide is a specification of what a given solution, once found, will satisfy. Given that vanishingly few existing computer programs are proven completely correct, how do we know that a given program text computes what it purports to compute? 
At a high level, we use Hoare’s system of pre and post conditions and show that the given program text, if started in an environment satisfying its preconditions will, if it halts, leave its environment in a condition which satisfies its postconditions. Since any semantic interpretation of a programming language is compositional, we can recursively apply Hoare’s pre and post conditions to each sub-text of a given program until we ultimately reach atomic statements which are declared to satisfy particular pre and post conditions. This is not dissimilar to how we currently structure a fully formal proof in set theory.

In mathematics based upon either (classical) logic or computation, we need a logic. Unfortunately our existing first-order logic does not ‘deal with’ the underlying dynamics of computation. Equally importantly, existing (classical) logic is deemed to be the ‘structure’ of ‘human’ thought (and argumentation). That is, first-order logic is essentially extra-mathematical as it pre-figures mathematical discourse.

The logic we will use to establish the computational correctness of a given program text will be the $\mu$-modal logic of an underlying Interpreted Transition System associated with the chosen semantic model. This logic is inherently designed to deal with the dynamics of computation. More importantly, the $\mu$-modal logic we will define will be intimately related to the structure of our semantic interpretation of computation. Once we define our semantic interpretation, the $\mu$-modal logic is given. Even more importantly there are well defined algorithms based upon the theory of two person parity games which can compute the satisfiability of the collection of pre and post conditions asserted about any particular program text purporting to itself compute a mathematical result.

So in this computational interpretation, what do typical mathematicians do? Some search for new algorithms to solve new problems. Some worry about finding the most conceptually elegant (efficient) algorithm to program the proof of a given theorem. Others worry about the expressivity of the language they use in a given discipline. Yet others worry about the semantic interpretation of the particular language they use. In all of these cases the annotation of any given algorithm with a satisfiable collection of $\mu$-modal pre and post conditions is required for any completely rigorous mathematical result.
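As an illustration of how compositionality lets pre and post conditions be pushed down to sub-texts, the standard sequencing rule of Hoare logic can be written as below. The symbols $P$, $Q$, $R$ (conditions) and $a$, $b$ (program texts) are placeholders introduced here for the illustration only; this is the textbook rule, not a rule quoted from this document.

```latex
\[
\frac{\{P\}\; a \;\{Q\} \qquad \{Q\}\; b \;\{R\}}
     {\{P\}\; a;b \;\{R\}}
\]
```

Read from the bottom up, a claim about the composite text $a;b$ reduces to one claim about $a$ and one about $b$, and the reduction can be repeated until only atomic statements remain.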

1.1.3.3 Cast of thousands

Classical commentary To help us navigate through our relatively complex ‘story’, it will be useful to introduce our cast of thousands:

1. Transition Systems: the basis of $\mu$-modal logic for processes. A transition system whose transition relation is transitive, is a Category (and is the basis of Dynamic Logic).

2. Categories

3. The category of LoL$_{wf}$s: most important for our work will be the category of Lists of Lists, LoL$_{wf}$. The objects of LoL$_{wf}$ will be the collection of wf-LoLs. The morphisms of LoL$_{wf}$ will be any finite sequence of the list operators, cons, car, cdr and nil. Composition of morphisms in LoL$_{wf}$ is simple concatenation of finite sequences of list operators.

4. The category of LoLs: most important for our work will be the category of Lists of Lists, LoL. The objects of LoL will be the collection of LoLs. The morphisms of LoL will be any finite sequence of the list operators, cons, car, cdr and nil. Composition of morphisms in LoL is simple concatenation of finite sequences of list operators.

5. The LoL$_{wf}$ functor: $L_{wf} : \mathrm{LoL}_{wf} \to \mathrm{LoL}_{wf}$ defined by

$$L_{wf}(X) = 1 + X \times X \qquad (1.3.1)$$

for $X \in \mathrm{LoL}_{wf}$. Note that $L_{wf}$ is an endo-functor of the category LoL$_{wf}$.


6. The LoL functor: $L : \mathrm{LoL} \to \mathrm{LoL}$ defined by

$$L(X) = 1 + X \times X \qquad (1.3.2)$$

for $X \in \mathrm{LoL}$. Note that $L$ is an endo-functor of the category LoL.

7. Algebras of $L$: Given any

8. CoAlgebras of $L$:

9. LoL functor as a monad:

10. LoL functor as a comonad:

11. Eilenberg-Moore category of $L$ as a monad:

12. Eilenberg-Moore category of $L$ as a comonad:

13. Kleisli category of $L$ as a monad: provides the natural category in which to discuss the process traces, and (eventually) space-time.

14. Kleisli category of $L$ as a comonad: ?

15. Multi-sets: We implement ‘sets’ in LoLs as ‘multi-sets’. Basically a multi-set is a list of elements which might have multiple ‘copies’ of any given element. For ‘sets’/‘multi-sets’ in LoL$_{wf}$ we could sort the list and remove any duplicates. However, while it is possible to sort (in the limit) non-well-founded objects in LoL and/or non-well-founded lists of objects, any finite computational approximation will by necessity be a multi-set. Hence we generally deal with multi-sets instead of sets.

16. Multi-powerset: Generalizing powersets we get multi-powersets. The (co)(contra)variant (multi-)powerset, $\mathcal{P}$, is a (co)monad.

17. GSOS laws: see [Jac17] definition 5.5.6 on page 323.
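The multi-set convention of item 15 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the document's implementation: a LoL is modelled as a nested Python tuple (`()` for nil, a 2-tuple for a cons pair), a multi-set is a Python list of such LoLs, and the names `multiplicity`, `as_set` and `same_elements` are hypothetical helpers introduced here.

```python
NIL = ()  # assumption: the empty LoL; a cons pair is modelled as a 2-tuple


def multiplicity(multiset, element):
    """Number of copies of `element` in a multi-set given as a list of LoLs."""
    return multiset.count(element)


def as_set(multiset):
    """Sort and remove duplicates.

    This only makes sense for finite, well-founded LoLs, which is the case
    the text singles out; nested tuples compare lexicographically.
    """
    return sorted(set(multiset))


def same_elements(left, right):
    """Two multi-sets denote the same 'set' if they agree after normalising."""
    return as_set(left) == as_set(right)


pair = (NIL, NIL)
bag = [NIL, pair, NIL]  # nil occurs twice, so this is a genuine multi-set
```

Sorting and deduplication are exactly the steps that cannot be completed on a finite approximation of a non-well-founded object, which is why the multi-set is kept as the primary notion.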

1.1.3.4 A Beastie’s environment

Classical commentary From a classical point of view, the atomic actions that a wee beastie can perform on its environment define a ‘List of Lists’ or ‘Trees’ endo-functor, $T : \mathrm{Set} \to \mathrm{Set}$, from the Category of sets, Set, to itself:

$$T : \mathrm{Set} \to \mathrm{Set} \qquad (1.3.3)$$

This functor is defined by:

$$T : X \mapsto \mathbf{1} + X \times X \quad \text{(on objects)}$$
$$T : f \mapsto \mathbf{1} + f \times f \quad \text{(on morphisms)} \qquad (1.3.4)$$

Let LoL$_{wf}$ denote the set of finite, $\omega$-computationally¹¹ well-founded, binary trees. Then there is an isomorphism:

$$\alpha : T(\mathrm{LoL}_{wf}) \xrightarrow{\sim} \mathrm{LoL}_{wf} \qquad (1.3.5)$$

Recalling that $T(\mathrm{LoL}_{wf}) = \mathbf{1} + \mathrm{LoL}_{wf} \times \mathrm{LoL}_{wf}$, this isomorphism is defined by:

$$\alpha(\star) = \mathtt{nil}, \qquad \alpha(x, y) = \mathtt{cons}(x, y) \qquad (1.3.6)$$

This isomorphism makes LoL$_{wf}$ an initial algebra, that is an initial object in the category of $T$-algebras, $\mathbf{Alg}(T)$.

On ‘paper’, any particular $\omega$-computationally well-founded binary tree can be denoted by either a balanced collection of left and right round brackets, ‘(’ and ‘)’, or a collection of the functions cons and nil¹². For example:

    ((() ()) ())    (1.3.7)

or alternatively

    cons(cons(nil, nil), nil)    (1.3.8)

both denote the binary tree:

            cons
           /    \
        cons    nil
       /    \
     nil    nil        (1.3.9)

We will explicitly provide parsers for both notations below.

Let LoL denote the set of potentially countably infinite, $\omega$-computationally non-well-founded, binary trees and maps between them. Then, there is an isomorphism:

$$\zeta : \mathrm{LoL} \xrightarrow{\sim} T(\mathrm{LoL}) \qquad (1.3.10)$$

11 For a given ordinal, $\lambda$, the concept of ‘$\lambda$-computational power’ is critical to our theory. In the next volume, once we have progressed far enough to be able to define the ordinals, we will define $\lambda$-computation as the computing ‘power’ associated with a wee beastie whose computational ‘traces’ are at most $\lambda$ in length.

12 The function nil, being nullary, may be written with or without left, right pairs of round brackets.


Recalling that $T(\mathrm{LoL}) = \mathbf{1} + \mathrm{LoL} \times \mathrm{LoL}$, this isomorphism is defined by:

$$\zeta(\mathtt{nil}) = \star, \qquad \zeta(x) = (\mathtt{car}(x), \mathtt{cdr}(x)) \qquad (1.3.11)$$

This isomorphism makes LoL a final coalgebra, that is a final object in the category of $T$-coalgebras, $\mathbf{CoAlg}(T)$.

Assuming a wee beastie has $\omega$-computational power, then clearly any LoL a wee beastie might create, from scratch, must be well-founded. However, given any already existing LoL, a wee beastie can not know if it is well-founded or not. Hence, for our work, the collection of non-well-founded LoLs is the primary collection. The sub-collection of well-founded LoLs is important but secondary.
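The maps $\alpha$ and $\zeta$, and the two ‘paper’ notations (1.3.7) and (1.3.8), can be made concrete for finite trees. The following is a minimal sketch, not the parsers promised by the text: it assumes a LoL is modelled as a nested Python tuple (`()` for nil, a 2-tuple for a cons pair), uses `None` for the point $\star$ of $\mathbf{1}$, and reads a bracket pair with two sub-trees as a cons. All function names are hypothetical.

```python
import re

NIL = ()  # assumption: nil is the empty tuple, cons(x, y) is the 2-tuple (x, y)


def cons(x, y):
    return (x, y)


def alpha(t):
    """alpha : 1 + X*X -> X, with None standing for the point of 1."""
    return NIL if t is None else cons(*t)


def zeta(x):
    """zeta : X -> 1 + X*X, the inverse observation (car, cdr)."""
    return None if x == NIL else (x[0], x[1])


def parse_brackets(text):
    """Parse the notation of (1.3.7): tree ::= '()' | '(' tree tree ')'."""
    tokens = [c for c in text if c in "()"]
    pos = 0

    def tree():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        if pos < len(tokens) and tokens[pos] == ")":
            pos += 1
            return NIL
        left, right = tree(), tree()
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        pos += 1
        return cons(left, right)

    result = tree()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return result


def parse_functional(text):
    """Parse the notation of (1.3.8); 'nil' may be written 'nil' or 'nil()'."""
    tokens = re.findall(r"cons|nil|[(),]", text)
    pos = 0

    def expect(tok):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != tok:
            raise ValueError(f"expected {tok!r}")
        pos += 1

    def term():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "nil":
            pos += 1
            if tokens[pos:pos + 2] == ["(", ")"]:
                pos += 2
            return NIL
        expect("cons")
        expect("(")
        left = term()
        expect(",")
        right = term()
        expect(")")
        return cons(left, right)

    result = term()
    if pos != len(tokens):
        raise ValueError("trailing input")
    return result
```

Both parsers map the example of (1.3.7) and (1.3.8) to the same nested tuple, and `alpha` and `zeta` are mutually inverse on every finite tree. Genuinely non-well-founded trees are outside the reach of this finite representation.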

1.1.3.5 Barr’s ladder

Classical commentary Because of its importance for our work we explicitly work out Barr’s Theorem 3.2, [Bar93] and [Bar94], for the ‘Lists of Lists’ or ‘Trees’ endo-functor defined above. We begin by noting that:

1. $\emptyset$ is the empty set in Set. It is also the initial object of Set (up to isomorphism).

2. $\mathbf{1}$ is the singleton set in Set. It is the final object of Set (up to isomorphism). To be definite, we will use $\mathbf{1} = \{\mathtt{nil}\}$, which is the set whose only element is ‘nil’.

3. Since $\emptyset$ and $\mathbf{1}$ are, respectively, the initial and final objects in Set, we know that there is a unique morphism between them: $k : \emptyset \to \mathbf{1}$. Furthermore since there are no morphisms into $\emptyset$, the morphism, $k$, is trivially monic. Explicitly, as a mapping in Set, the morphism $k$ has the empty graph.

4. The set, $T(\emptyset) = \{\mathtt{nil}\} = \mathbf{1}$.

5. The set, $T(\mathbf{1}) = \{\mathtt{nil}, (\mathtt{nil}, \mathtt{nil})\} = T^2(\emptyset)$.

6. Since $T(\emptyset) = \mathbf{1}$ and $\emptyset$ is the initial object, we know that $k = j$.

7. Explicitly, $T(k) : T(\emptyset) \to T(\mathbf{1})$ is the morphism $\mathbf{1} + k \times k$, hence $T(k)(\mathtt{nil}) = \mathtt{nil}$.

8. Since $\mathbf{1}$ is the final object in Set, there is a unique morphism $t : T(\mathbf{1}) \to \mathbf{1}$. Explicitly we have:

$$t(\mathtt{nil}) = \mathtt{nil}, \qquad t((\mathtt{nil}, \mathtt{nil})) = \mathtt{nil}$$

Hence $t \circ T(k)$ is the identity of $T(\emptyset) = \mathbf{1}$. In particular, $t \circ T(k)$ is surjective.

9. Since there is a unique morphism between $\emptyset$ and $\mathbf{1}$, we know that $k = t \circ T(k) \circ j$.

10. The above ensures that

a. there exist unique embedding morphisms, $j_n : T^n(\emptyset) \to \mathrm{LoL}_{wf}$.

b. there exist unique projection (or truncation) morphisms, $t_n : \mathrm{LoL} \to T^n(\mathbf{1}) = T^{n+1}(\emptyset)$.

c. there exists a unique embedding morphism, $j_\infty : \mathrm{LoL}_{wf} \to \mathrm{LoL}$.

d. there exists a unique projection (or truncation) morphism, $t_\infty : \mathrm{LoL} \to \mathrm{LoL}_{wf}$.

This projection makes $t_\infty : \mathrm{LoL} \to \mathrm{LoL}_{wf}$ into a fibration with LoL the total space and LoL$_{wf}$ the base space. For a complete discussion of our use of fibrations to build up logics, see [Jac99]. TODO: Is this a cloven split fibration? The fibres in this fibration represent the sets of all completions of a given finite tree. In the terms of Symbolic Dynamics, each fibre over a given finite tree is a cylinder. The collection of all cylinders forms a basis for a topology for LoL. In fact this topology is given by the metric, $d(y, y') = 2^{-n}$ for the largest $n$ for which $t_n(y) = t_n(y')$ (see, for example, [Bar93]).

As with anything in Category theory, all of the above are unique up to isomorphisms.

11. TODO: What more needs to be stated? Look at [RT94] section 3.4

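[Diagram (1.3.12): Barr’s ladder. The top row is the chain $\emptyset \to T(\emptyset) \to \cdots \to T^n(\emptyset) \to T^{n+1}(\emptyset) \to \cdots$, with connecting maps $T^n(j)$ and embeddings $j_n$ into LoL$_{wf}$. The bottom row is the chain $\mathbf{1} \leftarrow T(\mathbf{1}) \leftarrow \cdots \leftarrow T^n(\mathbf{1}) \leftarrow T^{n+1}(\mathbf{1}) \leftarrow \cdots$, with connecting maps $T^n(t)$ and projections $t_n$ from LoL. The rungs are the maps $k, T(k), \ldots, T^n(k), T^{n+1}(k)$, and the maps $j_\infty$ and $t_\infty$ connect LoL$_{wf}$ and LoL.]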
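The first few rungs of the ladder, the truncations $t_n$ and the metric $d$ can be computed directly for finite trees. The sketch below is an illustration only, under the assumption that trees are nested Python tuples (`()` for nil, a 2-tuple for a pair) and that $t_n$ replaces everything below depth $n$ by nil; the names `T`, `T_power`, `truncate` and `distance` are hypothetical.

```python
from itertools import product

NIL = ()                # assumption: nil is the empty tuple
EMPTY = frozenset()     # the initial object of Set
ONE = frozenset({NIL})  # the chosen singleton 1 = {nil}


def T(xs):
    """T(X) = 1 + X * X on finite sets of trees."""
    return frozenset({NIL}) | frozenset(product(xs, xs))


def T_power(n, xs):
    """Apply T to xs n times."""
    for _ in range(n):
        xs = T(xs)
    return xs


def truncate(tree, n):
    """A finite stand-in for t_n: cut the tree off at depth n."""
    if n == 0 or tree == NIL:
        return NIL
    left, right = tree
    return (truncate(left, n - 1), truncate(right, n - 1))


def distance(y, z):
    """d(y, z) = 2**-n for the largest n with t_n(y) == t_n(z), else 0."""
    if y == z:
        return 0.0
    n = 0
    # Terminates for distinct finite trees: deep truncations are the identity.
    while truncate(y, n + 1) == truncate(z, n + 1):
        n += 1
    return 2.0 ** -n
```

The sizes of $T^n(\emptyset)$ follow the recurrence $a_{n+1} = 1 + a_n^2$, and two trees are close in the metric exactly when they agree down to a large depth, which is the cylinder picture described above.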

Classical commentary TODO: We want to show that the category, LoL$_{wf}$, is a topos and that the corresponding category, LoL, is a co-topos. For LoL$_{wf}$, the objects are finite trees, and the morphisms are JoyLoL$_0$ computations (traces). For LoL, the objects are finite ‘cylinders’ on the underlying set, LoL. The morphisms are again JoyLoL$_0$ computations (traces).

Classical commentary Definitions 2.2.1 and 2.3.1 together define a pair of syntax, $\Sigma : \mathrm{Set} \to \mathrm{Set}$, and behaviour, $B : \mathrm{Set} \to \mathrm{Set}$, endo-functors from the Category of Sets, Set, to itself together with a collection of GSOS rules¹³.

The syntax functor:

$$\Sigma : \mathrm{Set} \to \mathrm{Set}, \qquad \Sigma : \mathbf{1} + X \times X \mapsto X \qquad (1.3.13)$$

is defined by

$$\Sigma(\star) = \mathtt{nil}, \qquad \Sigma(x, y) = \mathtt{cons}(x, y) \qquad (1.3.14)$$

Let LoL$_{wf}$ denote the sub-category of Set of finite, $\omega$-computationally¹⁴ well-founded, binary trees, and maps between them. Then,

$$\Sigma : \mathrm{LoL}_{wf} \to \mathrm{LoL}_{wf}, \qquad \Sigma(\mathrm{LoL}_{wf}) \xrightarrow{\sim} \mathrm{LoL}_{wf} \qquad (1.3.15)$$

is an isomorphism making LoL$_{wf}$ an initial algebra, that is an initial object in the category of $\Sigma$-algebras, $\mathbf{Alg}(\Sigma)$.

Similarly, the behaviour functor:

$$B : \mathrm{Set} \to \mathrm{Set}, \qquad B : X \mapsto \mathbf{1} + X \times X \qquad (1.3.16)$$

is defined by

$$B(\mathtt{nil}) = \star, \qquad B(x) = (\mathtt{car}(x), \mathtt{cdr}(x)) \qquad (1.3.17)$$

Let LoL denote the sub-category of Set of potentially countably infinite, $\omega$-computationally non-well-founded, binary trees and maps between them. Then,

$$B : \mathrm{LoL} \to \mathrm{LoL}, \qquad \mathrm{LoL} \xrightarrow{\sim} B(\mathrm{LoL}) \qquad (1.3.18)$$

is an isomorphism making LoL a final coalgebra, that is a final object in the category of $B$-coalgebras, $\mathbf{CoAlg}(B)$.

13 See, for example, [TP97], [Kli11], and [Jac17].

14 For a given ordinal, $\lambda$, the concept of ‘$\lambda$-computational power’ is critical to our theory. In the next volume, once we have progressed far enough to be able to define the ordinals, we will define $\lambda$-computation as the computing ‘power’ associated with a wee beastie whose computational ‘traces’ are at most $\lambda$ in length.

1.1.3.6 Lawvere’s Theorem

1.1.3.A First section appendix


1.1.3.A.1 A test subsubsection


1.1.3.B Second section appendix


1.1.3.C Third section appendix


1.1.4 Some Computer Science

TODO: Much of the discussion in introMathematics is actually derived or inspired by issues in Computer Science and not Mathematics per se. I need to either extract it here OR re-title the previous section.

1.1.4.1 Extensional and Intensional Objects

1.1.5 Some Software Engineering

1.1.5.1 JoyLoL’s Software Architecture

Software Engineering looks at the practical aspects of making computational software work for a specific purpose.

An important practical aspect of a computation is its ultimate speed of performance. A computation which takes an appreciable part of the lifetime of an activity to complete is not much use for managing that activity. It is certainly important that the software written in JoyLoL be performant.

However since software is (re)-engineered by many individuals over a considerable length of time, the ‘maintainability’ of the textual structure of the software can be equally or even more important than raw performance in many cases. For our purposes, this maintainability, this ability to understand the software’s (textual) structure, is very important. In terms of software engineering, this is called the ‘systems architecture’ of the overall problem solution.

For our needs there are two distinct ways to understand JoyLoL’s software architecture. Firstly as a layering where each additional layer adds complexity. For our purposes, these layers of complexity are: ‘purest JoyLoL$_0$’, ‘pure JoyLoL’ and ‘performant JoyLoL’. Secondly, we can understand JoyLoL’s software architecture as collections of interacting ‘components’, where each ‘component’ performs a specific conceptual task. This collection of ‘components’ can be organized either by what depends upon what to perform a computation (what ‘calls’ what), or by what depends upon what through the time-line of a computation (what happens first, second, third, etc.).

1.1.5.2 JoyLoL’s Complexity Layers

In the following three diagrams, we attempt to depict the inter-related complexity layers as well as hint at the ‘call’ dependency of important components. At the ‘highest level’ of ‘call’ dependency, we have a number of ‘JoyLoL coalgebras’. These in turn depend upon, for example, the JoyLoL interpreter, the ‘CONS-pairs’ and ultimately ‘memory management’.

[Figure 1.1: Purest JoyLoL$_0$. Labels shown: JoyLoL$_0$; Symbols; Numbers; Characters; ...; Joylol interpreter; CONS-Pairs.]

[Figure 1.2: Pure JoyLoL. Labels shown: Pure JoyLoL; Symbols; Characters; Natural numbers; ...; JoyLoL$_0$; Joylol interpreter; CONS-Pairs.]

[Figure 1.3: A Software Architectural description of a performant JoyLoL implementation. Labels shown: JoyLoL; Alt (three times); Symbols; Characters; Natural numbers; OS interfaces; ...; JoyLoL$_0$; Joylol interpreter; CONS-Pairs; Memory Management; ANSI-C Implementation / Lua wrapper; (GMP libraries); (OS / LuaTEX libraries); (UTF-8 implementation); Operating System / LuaTEX.]

In terms of conceptual complexity, (Logical) set theory has two distinct variants, pure set theory without ‘ur-elements’, and naive set theory which allows ‘ur-elements’. ‘Ur-elements’ are ‘atomic’ elements which might be elements of some set, but which contain no (sub)elements of their own. Similarly there are three distinct types of JoyLoL:

• Purest JoyLoL$_0$ which allows only the characters ‘(’ and ‘)’ and lists of lists. It is this version of JoyLoL which is the most similar to pure set theory. (See figure 1.1)

• Pure JoyLoL which allows explicit textual ‘symbols’ and ‘natural numbers’, as well as lists of lists. This is the version of JoyLoL which is most similar to naive set theory as it is used in typical mathematical practice. (See figure 1.2)

• Performant JoyLoL implementations which allow arbitrary more performant implementations of JoyLoL coalgebras using non-JoyLoL computing languages. This is the version of JoyLoL which is most similar to various computational proof assistants such as Agda, Coq, ACL2, NuPRL, HOL, and Isabelle. (See figure 1.3)

The dialect of JoyLoL without ‘ur-elements’, JoyLoL$_0$, is the purest form of JoyLoL. In JoyLoL$_0$, the only allowed characters are ‘(’ and ‘)’. In JoyLoL$_0$ everything is a list of lists. Unfortunately, for ‘bears of little brain’, such as myself, this is a far too austere language in which to think about mathematics. Hence, to help ‘bears of little brain’ understand what they are doing, we allow JoyLoL to have atomic symbols.

Since JoyLoL is computational, to be useful, we need to embed JoyLoL implementations into one or more modern computation systems. Commonly used JoyLoL coalgebras, such as, for example, the natural numbers, could be implemented in JoyLoL$_0$ as list of list structures complete with associated (recursive) addition and multiplication. However, since the addition and multiplication of natural numbers is so pervasive in mathematics, a pure (computational) JoyLoL$_0$ implementation as lists of lists is unlikely to be performant enough for general use. This means we will generally need to allow more performant alternative implementations of many important JoyLoL coalgebras such as the natural numbers. This raises the question: how do we tell if two ‘implementations’ of a concept behave in the ‘same’ way?
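To make the performance remark concrete, here is one way natural numbers might be encoded as lists of lists with recursive addition and multiplication. It is a sketch under assumptions, not the JoyLoL$_0$ encoding: zero is taken to be nil, the successor of `n` is taken to be the pair `(nil, n)`, LoLs are nested Python tuples, and every function name is hypothetical.

```python
NIL = ()  # assumption: nil is the empty tuple and also plays the role of zero


def succ(n):
    """Assumed encoding: the successor of n is the cons pair (nil, n)."""
    return (NIL, n)


def pred(n):
    """The cdr of a non-zero numeral is its predecessor."""
    return n[1]


def add(m, n):
    """Recursive addition: 0 + n = n, succ(m) + n = succ(m + n)."""
    return n if m == NIL else succ(add(pred(m), n))


def mul(m, n):
    """Recursive multiplication: 0 * n = 0, succ(m) * n = n + m * n."""
    return NIL if m == NIL else add(n, mul(pred(m), n))


def from_int(k):
    """Build the numeral for a small non-negative Python int."""
    n = NIL
    for _ in range(k):
        n = succ(n)
    return n


def to_int(n):
    """Count the successors in a numeral."""
    k = 0
    while n != NIL:
        n, k = pred(n), k + 1
    return k
```

Every operation here walks the whole numeral, so the cost of `add` grows with the value of its first argument rather than with the number of its digits. That is the sense in which a pure lists-of-lists implementation is transparent but unlikely to be performant.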
The behaviour of a collection of ‘things’ is traditionally captured as a coalgebra. The most important thing about the definition of coalgebras is that ‘identity’ is captured via (behavioural) ‘bisimulation’. Two coalgebras behave the same if they are bisimilar. Hence, two implementations behave the same if they are bisimilar.

While an implementation of the natural numbers in JoyLoL$_0$ as lists of lists will be transparently JoyLoL$_0$ code whose behaviour can be rigorously verified, typical implementations of the natural numbers in, for example, ANSI-C contain behaviours which are not as transparent. For example, both the JoyLoL$_0$ as well as the GNU Multiple Precision Arithmetic (GMP) Library’s implementation of the natural numbers on any particular computer system will have a concrete bound on the size of the largest natural number it can represent, if only because the computer system has run out of all of the memory resources it may use. However, GMP’s implementation will also have internal structures whose memory use will potentially impose additional behaviours which might not be explicitly documented in GMP’s formal description. Given the complexity of GMP’s implementation, GMP does not have a rigorous proof of correctness. Indeed the ANSI-C compilers used to compile JoyLoL$_0$ and JoyLoL typically do not have rigorous proofs of correctness either.

The GNU Multiple Precision Arithmetic Library (GMP) is a free library for arbitrary-precision arithmetic, operating on signed integers, rational numbers, and floating point numbers (Wikipedia). We only use the unsigned integers.

UTF-8 is a variable width character encoding capable of encoding all 1,112,064 valid code points in Unicode using one to four 8-bit bytes. The encoding is defined by the Unicode standard, and was originally designed by Ken Thompson and Rob Pike (Wikipedia). We (explicitly) only use the ASCII subset.
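The idea that two implementations ‘behave the same’ when they are bisimilar can be illustrated on a toy scale. The sketch below compares a lists-of-lists numeral with a native Python integer by observing both through the same shape of behaviour, either a stop or a pair of successor states. It rests on several assumptions: numerals are encoded as in the usual zero = nil, successor = `(nil, n)` style, the observation functions `step_lol` and `step_int` are hypothetical, and bisimilarity is only checked up to a finite depth, which is an approximation of the real (greatest fixed point) notion.

```python
NIL = ()  # assumption: nil is the empty tuple


def step_lol(x):
    """Observe a LoL state: None for nil, otherwise the pair (car, cdr)."""
    return None if x == NIL else (x[0], x[1])


def step_int(k):
    """Observe a native integer through the same shape: (0, k - 1) if k > 0."""
    return None if k == 0 else (0, k - 1)


def bisimilar_upto(step_a, a, step_b, b, depth):
    """Check that states a and b cannot be told apart within `depth` steps."""
    if depth == 0:
        return True
    obs_a, obs_b = step_a(a), step_b(b)
    if obs_a is None or obs_b is None:
        return obs_a is None and obs_b is None
    return all(
        bisimilar_upto(step_a, x, step_b, y, depth - 1)
        for x, y in zip(obs_a, obs_b)
    )


def lol_from_int(k):
    """Build the LoL numeral for a small non-negative integer."""
    n = NIL
    for _ in range(k):
        n = (NIL, n)
    return n
```

With a shallow bound the check cannot distinguish 3 from 4, while a deeper bound can. This mirrors the point made above: a finite amount of observation only ever certifies agreement up to some depth.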

1.1.5.3 JoyLoL’s processing components

In diagram 1.4 we depict the major components required for a given computation. Focusing upon the processing steps, a given computation typically unfolds as follows:

1. A parser takes some code as text and transforms it, using the syntactical rules of the language JoyLoL, into a (potentially) untyped abstract syntax tree (AST).

2. A type inferencer takes an untyped AST and computes types for all implied pre and post stack frames for all JoyLoL statements. Our type inferencer uses the Hindley-Milner type inference algorithm essentially altered for stack based languages as suggested in [Dig08].

3. A verifier takes the typed AST and verifies that the pre and post conditions for each JoyLoL statement are correct. Our verifier is based upon the $\mu$-modal logic satisfaction checking algorithms in [DGL16] and [HKT00].

4. An interpreter takes the verified AST and interprets it given any supplied input data. The interpreter might create either output data or new code as text (or both). Our interpreter is based upon ideas taken from [Thu96] and continuations taken from [Gor79] (as well as the ‘standard’ ideas of ‘trampolining tail recursion in ANSI-C’).

TODO: Discuss how to deal with lack of rigor in implementations (See JoyLoL implementation paper(s)). Note that this is a problem even in Science. How do we know that a given model, models ‘Reality’.

TODO: Discuss the importance of being able to add new Coalgebra implementations using, for example, JoyLoL, ANSI-C or Lua.

[Figure 1.4: Compiling JoyLoL. Processing pipeline: Code as Text → Parser → Code as untyped AST → Type Inferencer → Code as typed AST → Verifier → Code as verified AST → Interpreter, with Input data and Output data attached to the Interpreter.]

1.1.6 Strategy

This document will provide a rigorous computational foundation of Mathematics, by... TODO: Need to talk about lists of lists

We will do this in a number of distinct steps. Firstly, by defining a computational language, JoyLoL (‘The Joy of Lists of Lists’¹⁵). JoyLoL is a functional concatenative language based upon Manfred von Thun’s language Joy, [Thu94b]. The critically important aspect of JoyLoL is that it is constructed to be a fixed point of the semantics functor. This means that JoyLoL provides its own denotational, operational and axiomatic semantics. JoyLoL does not rely upon any other ‘pre-existing’ structures or set theory to define its meaning.

Another important aspect of JoyLoL is that it is a concatenative functional language. Almost all other functional programming languages are based upon Church’s $\lambda$-calculus; importantly, this means that most such languages are focused upon function evaluation and substitution. From a categorical point of view, this means that the collection of computational traces forms a Topos. Being concatenative, the collection of JoyLoL computational traces forms a Category, which also happens to be a Topos. The distinction here is important. The requirements of being a Category are much simpler and valid for many more distinct sub-collections of computational traces.

, we can construct the structure of all JoyLoL computational traces. We can define the collection of finite substructures as those substructures for which a simple short JoyLoL program halts. These finite substructures Since there are JoyLoL computations which do not halt, this structure

15 Or is it ‘The Joy of Laughing out Loud’?


1.2 Of Wee Beasties and Idealized Mathematicians


In this chapter, over the next four definitions, 2.1.1, 2.2.1, 2.3.1, 2.4.1, below, we will explore what a wee beastie is, can do, can know, and can identify. We do this by formally and computationally defining a wee beastie’s ‘Reality’ using the rigorous computational language JoyLoL. In fact, since we identify any wee beastie with a JoyLoL computation, our definitions in this chapter are actually defining the ‘core’ of the JoyLoL computer language. These four definitions are, of necessity, inter-referential. They are essentially different facets of the same combined definition of what the JoyLoL computer language is, as well as what a wee beastie’s reality is.

Intuitionistic Mathematics, as initiated by Brouwer, has identified the concept of the ‘Idealized Mathematician’. Since this document is focused upon the Mathematics of the ‘Reality’ which a wee beastie can know, for this document, we will identify any wee beastie with this Idealized Mathematician, and conversely, any Idealized Mathematician, with this wee beastie. Unlike Kant, this idealization makes no assumptions about the structure of time or space. One consequence of our analysis of what a wee beastie can know, is space-time itself.

Following Hilbert, we will assume a number of undefined terms which will be defined by their use in our theory. In the following definitions, using well-known practice from Computer Science, we could use the word ‘widget’ to emphasize, with Hilbert, that our undefined terms can represent anything which behaves in the prescribed way. Having made this point, we will actually use the undefined but slightly more suggestive terminology of ‘Lists of Lists’ or LoLs.

1.2.1 A wee beastie’s reality, the JoyLoL interpreter

Definition 2.1.1. Any wee beastie is a ‘triple’ of LoLs, called the ‘data’ ‘stack’, the ‘process’ ‘stack’ and the ‘definition’ LoL, respectively. At each ‘computational instant’, the wee beastie takes the ‘top’ LoL off of the ‘process’ ‘stack’ and ‘interprets’ it by either:

1. Reserved word: If the ‘top’ LoL is a ‘reserved’ ‘word’, as defined in any of the four definitions, 2.1.1, 2.2.1, 2.3.1, or 2.4.1, then the beastie makes the described changes to the ‘data’ and ‘process’ ‘stacks’.

2. Known word: If the ‘top’ LoL is a ‘known’ ‘word’, then the wee beastie ‘push’es the ‘definition’ of the known word onto the ‘top’ of the ‘data’ ‘stack’ and the ‘interpret’ ‘word’ onto the ‘top’ of the ‘process’ ‘stack’.

3. Otherwise: The wee beastie ‘push’es the LoL onto the ‘top’ of the ‘data’ ‘stack’.
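The three cases of Definition 2.1.1 can be sketched as a single interpreter step. This is an illustration under assumptions, not the JoyLoL interpreter: stacks are Python lists with the ‘top’ at the end, words are Python strings, definitions are a dictionary from words to lists of words, and only three reserved words are provided. In particular the behaviour given to ‘interpret’ (pop a program from the data stack and schedule its words on the process stack) is an assumption made here, since the text leaves that action unspecified.

```python
NIL = ()  # assumption: nil is the empty tuple, a cons pair is a 2-tuple


def _nil(data, process):
    data.append(NIL)


def _cons(data, process):
    y = data.pop()
    x = data.pop()
    data.append((x, y))


def _interpret(data, process):
    # ASSUMPTION: 'interpret' pops a program (a list of words) off the data
    # stack and pushes its words so that the first word is executed first.
    program = data.pop()
    process.extend(reversed(program))


# Hypothetical, deliberately tiny table of reserved words.
RESERVED = {"nil": _nil, "cons": _cons, "interpret": _interpret}


def step(data, process, definitions):
    """One 'computational instant': interpret the top of the process stack."""
    top = process.pop()
    if isinstance(top, str) and top in RESERVED:
        RESERVED[top](data, process)        # case 1: reserved word
    elif isinstance(top, str) and top in definitions:
        data.append(definitions[top])       # case 2: known word
        process.append("interpret")
    else:
        data.append(top)                    # case 3: otherwise, push as data


def run(program, definitions):
    """Run a list of words to completion and return the final data stack."""
    data, process = [], list(reversed(program))
    while process:
        step(data, process, definitions)
    return data
```

For example, with the hypothetical definition `pair = nil nil cons`, running the one-word program `pair` first pushes the definition and the word ‘interpret’, and then unfolds into the three reserved words.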


□

These LoLs are important. For our wee beasties, every thing is a LoL. Since we are considering the computational basis of ‘Reality’, we will assume that any wee beastie is a computational entity. At any one instant of ‘time’, this wee beastie (or its associated computation) will be in a particular ‘state’. For our purposes, these states will be LoLs. At the same time these LoLs will also capture the ‘textual’ description of any given program.

While we could use any programming language, we will base our computational foundations on the language JoyLoL which is in turn loosely based upon Manfred von Thun’s concatenative stack based language, joy¹⁶. While most programming languages are applicative and make extensive use of variables, we will use a semantically simpler concatenative stack based language.

Classical commentary In classical terms the category of computations of an applicative functional language, such as Haskell, corresponds to a Topos, and that of a concatenative stack based functional language, such as JoyLoL or joy, assumes no more structure than that of a Category. Not surprisingly, we will, eventually, show that the computational power of JoyLoL is equivalent to that of any other programming language. In classical terms this means that the Category of JoyLoL’s computations actually has the structure of a Topos (or more correctly a Topos-CoTopos pair).

In fact to make JoyLoL semantically even simpler, we explicitly expose the stack of process ‘continuations’¹⁷. This means that the ‘instantaneous’ ‘state’ of any JoyLoL computation is captured by a pair of LoLs interpreted as the ‘data stack’ and ‘process stack’ respectively. The JoyLoL interpreter then, in each computational ‘instant’, takes one LoL off the top of the ‘process stack’

1.2.2 What can a ‘wee beastie’ do?

We begin our trilogy of definitions by considering what a wee beastie can do to its environment.

Definition 2.2.1. We assume any wee beastie exists in a ‘collection’ of ‘List of Lists’ (LoLs) represented itself as a LoL. With these LoLs, a wee beastie can:

16 To better understand Manfred von Thun’s programming language, joy, see [Thu94b] for an overview of the language and [Thu94a] for an overview of the mathematical foundations of the language. The ‘original’ concatenative stack based programming language was Forth. See [Bro03] and [Bro04] for introductions to Forth. 17 For a good, early, introduction to the use of continuations to capture the semantics of programming languages, see chapter 5 of [Gor79].


1. Atomic actions

1. Constructive actions

1. ‘nil’ action: create a new ‘top’ ‘nil’ LoL.

{} ( nil ) { dataStack top isNil }

2. ‘cons’ action: create a new ‘top’ LoL by removing the ‘top’ two LoLs and ‘cons’ing them together.

{ x y } ( cons ) { (x y) }

2. Destructive actions

1. ‘pop’ action: delete the ‘top’ LoL by ‘pop’ing it off the stack.

{ x } ( pop ) {}

2. ‘carCdr’ action: create two new ‘top’ LoLs by removing the ‘top’ LoL and replacing it with both the ‘car’ and ‘cdr’ of the original LoL.

{ (x y) } ( carCdr ) { x y }

3. Data Stack manipulation

1. ‘dup’ action: ‘dup’licate the ‘top’ LoL.

{ x } ( dup ) { x x }

2. ‘swap’ action: ‘swap’ the ‘order’ of the ‘top’ two LoLs.

{ x y } ( swap ) { y x }

4. Process Stack manipulation


1. ‘interpret’ action:

2. ‘choice’ action:

3. ‘repeat’ action:

4. ‘test’ action:

5. Definition list manipulation

1. ‘define’ action:

2. Action Combinators

1. composition: the action which ‘performs’ a pair of actions one followed by the other. If $a$ and $b$ are two actions, then the ‘composition’ action is explicitly denoted as ‘$a;b$’ or more simply as ‘$a\ b$’.

2. non-deterministic choice: The action which ‘performs’ one or other of a pair of actions. There is, however, no pre-determined reason for the choice. If $a$ and $b$ are two actions, then the ‘(non-deterministic) choice’ action is denoted ‘$a \| b$’.

3. non-deterministic repeat: The action which performs an action multiple times before ‘completing’. However, the number of repetitions is not pre-determined. If $a$ is an action, then we denote the ‘(non-deterministic) repetition’ action as ‘$a^\star$’.

4. test: The action which ‘completes’ if a given test¹⁸ succeeds. If $t$ is a test then we denote the ‘test’ action by ‘$t?$’.

This is all that any wee beastie, or (Idealized) Mathematician can do to LoLs in its environment¹⁹.

□

Classical commentary The atomic actions provide the only actions that a wee beastie can perform ‘directly’ on its environment. All other actions are formed via various action combinators from these four atomic actions and, via the ‘test’ action, any of the tests defined below.

The pair of definitions, 2.2.1 and 2.3.1, are essentially Dynamic Logic with the addition of the least, $\mu$, and greatest, $\nu$, fixed point operators taken from $\mu$-modal logic. See [HKT00] and [DGL16] respectively for a more in-depth discussion.

Notice our explicit use of John McCarthy’s notation for the programming language LISP, [MAE+65]. Since the definitions 2.2.1, 2.3.1, and 2.4.1 are co-dependent, we postpone any further commentary until Section 1.1.3.4.

18 Tests are defined in Definition 2.3.1.

19 In Volume II, we will add one more thing that a transfinite wee beastie can do.

1.2.3 What can a ‘wee beastie’ know?

Definition 2.3.1. Again, we assume that any wee beastie exists in a ‘collection’ of LoLs. With these LoLs, a wee beastie can preform (compute) the following tests:

1. Atomic Tests

   1. The test which always succeeds, denoted ‘true’ or ‘$\top$’.

   2. The test which always fails, denoted ‘false’ or ‘$\bot$’.

   3. The test which succeeds if the ‘top’ LoL is ‘nil’, denoted ‘isNil’.

2. Static Test Combinators

   1. The test which succeeds if a given pair of other tests both succeed. If $s$ and $t$ are two tests, then the ‘conjunction’ test is denoted ‘$s \wedge t$’.

   2. The test which succeeds if one or more of a given pair of other tests succeeds. If $s$ and $t$ are two tests, the ‘disjunction’ test is denoted ‘$s \vee t$’.

   3. The test which succeeds if a given other test fails. If $t$ is a test, then the ‘negation’ test is denoted ‘$\neg t$’.

3. Dynamic Test Combinators

   1. The test which succeeds if a given other test succeeds after all possible computations of a given action. If $t$ is a test and $a$ is an action, then the ‘necessity’ test is denoted ‘$[a]t$’.

   2. The test which succeeds if a given other test succeeds after at least one possible computation of a given action. If $t$ is a test and $a$ is an action, then the ‘possibility’ test is denoted ‘$\langle a \rangle t$’.

   3. The test which succeeds if TODO: complete the description of the $\mu$ operator.

   4. The test which succeeds if TODO: complete the description of the $\nu$ operator.

If any of the above tests does not succeed, then that test fails. This is all that any wee beastie (or Idealized Mathematician) can know by testing LoLs in its environment.

Classical commentary

The atomic tests provide the only tests that a wee beastie can perform ‘directly’ on its environment. All other tests are formed via various test combinators from these three atomic tests and, via the ‘necessity’ and ‘possibility’ tests, any actions defined above.

Of these three atomic tests, only the third atomic test increases the wee beastie’s knowledge of its environment. Neither of the first two atomic tests references the ‘state’ of the wee beastie’s environment.

The three static test combinators correspond to the core of classical (static) propositional logic. Note that we do not assume the classical ‘principle of the excluded middle’, hence we must define both the ‘logical’ ‘conjunction’ (‘and’) as well as the ‘logical’ ‘disjunction’ (‘or’) operators. It will turn out that while the principle of the excluded middle is valid for any finitely defined collection of (non-)well-founded LoLs, it need not be valid for ‘asymptotically’ defined collections of (non-)well-founded LoLs.

Classical logic, including propositional, first or higher order, concerns itself with ‘properties’ of a single fixed ‘unchanging’ ‘world’. Since there is no change, there is no corresponding sense of ‘time’. Wee beasties, such as ourselves, experience a ‘world’ of constant change. While we will not (yet) explicitly define time, as wee beasties, we are naturally interested in what will happen to our world if we change it. The dynamic test combinators capture how given tests will change if a wee beastie changes its environment by first performing one of its possible actions on its environment.

As mentioned above, the pair of definitions, 2.2.1 and 2.3.1, are essentially Dynamic Logic, [HKT00], augmented with the $\mu$ and $\nu$ operators taken from $\mu$-modal logic, [DGL16].
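To make the interplay between actions and tests concrete, the following is a minimal sketch in Python (not JoyLoL) of how the action combinators and the static and dynamic test combinators might be evaluated over a small, finite collection of ‘states’. Everything named here is an assumption made for illustration only: the tuple encoding of actions and tests, the functions `run` and `holds`, the hypothetical atomic actions `pop` and `stay`, and the three-state `MODEL`. The fixed point operators are omitted, and negation is treated classically, which the commentary above suggests is only safe for finitely defined collections.

```python
# Actions and tests are encoded as nested tuples (a hypothetical encoding):
#   actions: ("atom", name) | ("seq", a, b) | ("choice", a, b) | ("star", a) | ("test", t)
#   tests:   ("true",) | ("false",) | ("isNil",) | ("and", s, t) | ("or", s, t)
#            | ("not", t) | ("box", a, t) | ("dia", a, t)

def run(action, state, model):
    """Return the set of states that `action` may reach when started in `state`."""
    kind = action[0]
    if kind == "atom":      # an atomic action, looked up in the model
        return set(model["steps"].get((state, action[1]), ()))
    if kind == "seq":       # composition: first action, then second action
        return {s2 for s1 in run(action[1], state, model)
                for s2 in run(action[2], s1, model)}
    if kind == "choice":    # non-deterministic choice: either action
        return run(action[1], state, model) | run(action[2], state, model)
    if kind == "star":      # non-deterministic repeat: zero or more repetitions
        seen, frontier = {state}, [state]
        while frontier:
            for nxt in run(action[1], frontier.pop(), model):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        return seen
    if kind == "test":      # completes (without changing state) only if the test succeeds
        return {state} if holds(action[1], state, model) else set()
    raise ValueError(f"unknown action: {kind}")


def holds(test, state, model):
    """Return True if `test` succeeds in `state`, False if it fails."""
    kind = test[0]
    if kind == "true":
        return True
    if kind == "false":
        return False
    if kind == "isNil":
        return state in model["nil_states"]
    if kind == "and":
        return holds(test[1], state, model) and holds(test[2], state, model)
    if kind == "or":
        return holds(test[1], state, model) or holds(test[2], state, model)
    if kind == "not":
        return not holds(test[1], state, model)
    if kind == "box":       # necessity: succeeds after ALL computations of the action
        return all(holds(test[2], s, model) for s in run(test[1], state, model))
    if kind == "dia":       # possibility: succeeds after AT LEAST ONE computation
        return any(holds(test[2], s, model) for s in run(test[1], state, model))
    raise ValueError(f"unknown test: {kind}")


# A hypothetical three-state environment: 'pop' moves 0 -> 1 -> 2 -> 2, 'stay' does nothing.
MODEL = {
    "steps": {(0, "pop"): [1], (1, "pop"): [2], (2, "pop"): [2],
              (0, "stay"): [0], (1, "stay"): [1], (2, "stay"): [2]},
    "nil_states": {2},      # the states in which the 'top' LoL is nil
}
POP, STAY, IS_NIL = ("atom", "pop"), ("atom", "stay"), ("isNil",)
```

In this toy model, ‘possibly isNil after repeated pops’ succeeds in state 0, while ‘necessarily isNil after repeated pops’ fails there, because zero repetitions leaves the state unchanged.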

1.2.4 Identity and Bisimulation

When can a wee beastie know that two LoLs, in its environment, are the same LoL? Given the capabilities listed above, a wee beastie can never know if two LoLs are identical, the same or equal. The best a wee beastie can do is to compare a pair of LoLs using the same collection of tests.

TODO: If we have a proof of the well-foundedness of a given collection of tests, then we can talk about the identity of a given well-founded object. The critical thing here is that since everything is well-founded, we know that the computation of identity will terminate with either an affirmation or a denial of the claim of identity.

Definition 2.4.1. Two LoLs are Bisimilar if they pass the same collection of tests.

Classical commentary

This concept of Bisimulation has been identified by a number of Mathematicians and Computer Scientists. From the theory of non-well-founded sets, see [Acz88]. From the theory of co-algebras, see Chapter 3 of [Jac17].

Finally, linking the above three definitions together we make the following assertion about any wee beastie and its environment of LoLs:

Assertion 2.4.2.

1. ‘car’ of the ‘nil’ LoL is bisimilar to ‘nil’. TODO: Is this correct?

2. ‘cdr’ of the ‘nil’ LoL is bisimilar to ‘nil’. TODO: Is this correct?

3. ‘car’ of the ‘cons’ of two LoLs is bisimilar to the first LoL.

4. ‘cdr’ of the ‘cons’ of two LoLs is bisimilar to the second LoL.

5. ‘cons’ of the pair of LoLs obtained by ‘car’ and ‘cdr’ of the same LoL, is bisimilar to the original LoL.
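The five clauses above can be explored with a small model. The following Python sketch is only one possible reading, under stated assumptions: a LoL is either `NIL` or a pair, and `bisimilar` compares two LoLs using every ‘car’/‘cdr’ path up to a chosen depth followed by ‘isNil’. The names `NIL`, `cons`, `car`, `cdr`, `is_nil` and `bisimilar` are illustrative and are not taken from the JoyLoL implementation.

```python
NIL = None  # the 'nil' LoL (an assumed representation)


def cons(first, second):
    """Build a LoL from a pair of LoLs."""
    return (first, second)


def car(lol):
    """First component; clause 1 is adopted here, so car of nil is nil."""
    return NIL if lol is NIL else lol[0]


def cdr(lol):
    """Second component; clause 2 is adopted here, so cdr of nil is nil."""
    return NIL if lol is NIL else lol[1]


def is_nil(lol):
    return lol is NIL


def bisimilar(x, y, depth):
    """True if x and y pass the same car/cdr/isNil tests of length <= depth."""
    if is_nil(x) != is_nil(y):
        return False
    if is_nil(x) or depth == 0:
        # Both are nil (so every further car/cdr test agrees), or the test budget is spent.
        return True
    return (bisimilar(car(x), car(y), depth - 1)
            and bisimilar(cdr(x), cdr(y), depth - 1))


a = cons(NIL, cons(NIL, NIL))
b = cons(cons(NIL, NIL), NIL)
```

In this particular model, clauses 1 to 4 hold, and clause 5 holds for every non-nil LoL. For ‘nil’ itself, ‘isNil’ distinguishes `cons(car(NIL), cdr(NIL))` from `NIL`, which may be relevant to the two open TODOs above; whether that matters for the intended definition is left open here.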

1.2.5 Rigour

TODO: How do we program the verification engine? For (Basic) Dynamic Logic we build the minimal/maximal filtration (see [DGL16] section 5.5.1 page 124; see also [HKT00] section 5.4 page 171 (for PDL) and section 11.4 page 297 (for first-order)) (is this essentially a tableau?). I suspect we play the two person games on this underlying filtration to prove satisfiability of $\mu$-modal logics, see [DGL16] section 15.4 page 682 AND preceding sections. How does the above inter-relate with [HKT00] deductive logic? Are these simply moves in the construction of the filtration? Do we extend this deductive logic to allow for bisimilarity assertions? (See below)

TODO: What is the structure of the pre and post conditions? We do NOT want to use meta-variables, so we use de Bruijn indexing. However we may have complex repeated expressions, how do we identify these? Do we ‘assign’ them to ‘(meta)-variables’ or do we define them as ‘symbols’? Use of meta-variables would allow ‘pattern matching’ (similar to Haskell), however this would require a unification engine adding to the base JoyLoL’s complexity. I suspect meta-variables should be left for some future extension.

TODO: How do we deal with identity inside a condition and between pre and post conditions? Clearly we have no identity, but we can assert bisimilarity. Bisimilarity between pre and post conditions requires the ability to ‘reference’ both data/process/definition LoLs in both pre/post ‘states’.

As a result of these assertions, the only ‘meaningful’ tests are those which consist of collections of ‘car’s and ‘cdr’s. Assertions 3 and 4 imply that any ‘cons’s can be undone by appropriate use of ‘car’s and ‘cdr’s. TODO: Is this a meaningful observation?

The following Appendices to this chapter provide detailed specifications of aspects of a wee beastie’s environment which, while required by the definitions above, are of a more technical nature. Example implementations of these specifications can be found in the main Appendix ?? to this book.

1.2.A Texts: transcribing Lists of Lists

To be useful, any JoyLoL program needs to be written down in a format which can be stored over time and transmitted through space. The traditional way to do this is to transcribe a JoyLoL program to and from a ‘string of characters’. This chapter appendix specifies how this is done.

While it might seem strange to worry about texts and transcription of textual artefacts in a mathematical document, any formal foundation of mathematics does start by listing the collection of allowed symbols, see for example [Kle09]. In particular, we could develop the whole of mathematics using just two symbols: ‘(’, and ‘)’. However, to be kind to ‘bears of very little brain’ (like myself), we will use, at least initially, a richer collection of symbols.

In line with the typical practice of Computer Science, we parse tree structures of symbols from texts as sequences of characters. This enables the collection of meaningful symbols to expand as needed for a particular discussion.

1.2.A.1 Characters

A character is a fundamental representation of a glyph of one or other human language. ‘Printed’ characters are currently used by humans to communicate ideas between themselves.

For our purposes, a character is an indivisible entity. The collection of characters has an associated order relationship. So we know when a given character is ordered before, equal to, or ordered after another given character.

In current computational systems, characters are usually implemented using the Unicode consortium’s ‘UTF-8’ specifications. All we will assume in this document is that the few characters we actually need, and which are explicitly ‘named’ below, have the ‘standard’ ASCII ordering. The JoyLoL code in this document will only ever use ASCII characters. However, since the ASCII characters are explicitly the initial sub-set of the UTF-8 characters, this presents no particular problems for future work.

Characters are often sub-divided into specific ‘useful’ sub-collections. For our purposes, we will subdivide our character set into ‘white space’, ‘structural’ and ‘symbol’ characters.

Definition 2.A.1. The white space characters consist of the ASCII characters less than or equal to the ‘ ’ (space) character. Formally, isWhiteSpace? is defined to be:

    { dataStack top isCharacter? }
    (
      \space lessThanEqual
    )
    { dataStack top isBoolean? }

TODO: How do I show that the same character is still on top of the data stack? The current assertions are active. Stating that we have the ‘same’ character is a rather more passive statement. Should I use a stack description language similar to that used by [Dig08]?

The structural characters consist of the ASCII characters, ‘(’, ‘)’, ‘{’, ‘}’, ‘"’, ‘’’, and ‘\’. Formally, isStructuralCharacter? is defined to be:

    { dataStack top isCharacter? }
    (
      ( "(" isEqualCharacter? )
      ( ")" isEqualCharacter? )
      ( "{" isEqualCharacter? )
      ( "}" isEqualCharacter? )
      ( '"' isEqualCharacter? )
      ( "'" isEqualCharacter? )
      ( "\" isEqualCharacter? )
      or
    )
    { dataStack top isBoolean? }

The symbol characters consist of all non-white space and non-structural characters. Formally, isSymbolCharacter? is defined to be:

    { dataStack top isCharacter? }
    (
      ( isWhiteSpace? not )
      ( isStructuralCharacter? not )
      and
    )
    { dataStack top isBoolean? }
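The three character classes of Definition 2.A.1 can be mirrored outside of JoyLoL. The following is a minimal Python sketch, assuming a character is represented as a one-character ASCII string; the names `STRUCTURAL`, `is_white_space`, `is_structural_character` and `is_symbol_character` are illustrative stand-ins for the JoyLoL words above, not part of any existing implementation.

```python
# The seven structural characters listed in Definition 2.A.1: ( ) { } " ' \
STRUCTURAL = set("(){}\"'\\")


def is_white_space(ch):
    """White space: any ASCII character ordered at or before the space character."""
    return ch <= " "


def is_structural_character(ch):
    """Structural: one of the seven characters in STRUCTURAL."""
    return ch in STRUCTURAL


def is_symbol_character(ch):
    """Symbol: neither white space nor structural."""
    return not is_white_space(ch) and not is_structural_character(ch)
```

Python compares one-character strings by code point, which agrees with the ‘standard’ ASCII ordering assumed above for ASCII characters.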

1.2.A.2 Symbols

TODO: explodeSymbol, createSymbol, and isSymbol? are all core methods.... how and where is this shown?

Definition 2.A.2. A symbol is an object which can be created from, or exploded to, a sequence of characters.

Structural symbols, when exploded, consist of a single structural character. Formally we define isStructuralSymbol to be:

    { dataStack top isSymbol? }
    (
      ( explodeSymbol listLength 1 numbersEqual )
      ( explodeSymbol isStructuralCharacter? isListOf? )
      and
    )
    { dataStack top isBoolean? }

Non-structural or atomic symbols, when exploded, consist of a sequence of arbitrary fixed length, which does not include any white-space or structural characters. Formally we define isAtomicSymbol to be:

    { dataStack top isSymbol? }
    (
      ( explodeSymbol
        ( isWhiteSpace? isStructuralCharacter? or not )
        isListOf?
      )
    )
    { dataStack top isBoolean? }

Symbols are ordered corresponding to the lexicographical ordering of their exploded character sequences. Formally we define symbolIsLessThan? to be:

    {
      (dataStack top isSymbol?)
      (dataStack top2 isSymbol?)
      and
    }
    (
      explodeString explodeString
      isCharacterLessThan? isListOf?
    )
    { isBoolean? }

TODO: The above JoyLoL code is critically missing the required swaps and duplicates!!! Do the other comparison cases!
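As a cross-check on the intent of Definition 2.A.2, the following Python sketch models a symbol as a string and ‘exploding’ as conversion to a list of characters. It is an illustration under those assumptions only, and does not resolve the missing swaps and duplicates noted in the TODO above. The names `explode_symbol`, `is_structural_symbol`, `is_atomic_symbol` and `symbol_is_less_than` are hypothetical counterparts of the JoyLoL words.

```python
# The seven structural characters: ( ) { } " ' \
STRUCTURAL = set("(){}\"'\\")


def explode_symbol(symbol):
    """Explode a symbol (modelled here as a str) into its list of characters."""
    return list(symbol)


def is_structural_symbol(symbol):
    """A structural symbol explodes to exactly one structural character."""
    chars = explode_symbol(symbol)
    return len(chars) == 1 and chars[0] in STRUCTURAL


def is_atomic_symbol(symbol):
    """An atomic symbol contains no white-space and no structural characters.

    As in the JoyLoL specification above, no minimum length is imposed here.
    """
    return all(not (ch <= " " or ch in STRUCTURAL) for ch in explode_symbol(symbol))


def symbol_is_less_than(left, right):
    """Lexicographic ordering of the exploded character sequences."""
    return explode_symbol(left) < explode_symbol(right)
```

Python’s built-in list comparison is lexicographic, with a proper prefix ordered before any of its extensions, so no explicit loop over characters is needed in this sketch.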

1.2.A.3 Texts

A text is a list or sequence of characters.

1.2.A.4 Parsing

Parsing is the process of extracting a ‘parse tree’ of symbols from a text.

1.2.B Containers

In our specification of wee beasties we assumed a number of specifications of the containers, Lists, Stacks and Dictionaries. This chapter appendix provides the full specifications for each of these concepts. In Computer Science, a container is a structure which contains other structures. Usually, but not always, all of the contained structures are of the same (super) type.

1.2.B.1 Lists

A list is a container which contains a sequence of LoLs.

1.2.B.2 Stacks

A stack is a container which only allows access via the push and pop methods. A typical implementation would be using a list.

1.2.B.3 Dictionaries

A dictionary is a container which implements a key to value mapping. It allows you to find a value given its key, or to add a key, value pair.

If a dictionary is known to be well-founded, then it is easy to specify that the dictionary has exactly one instance of any key, value pair. If a dictionary is potentially non-well-founded, it is impossible to implement this uniqueness constraint. This means that the concept of ‘dictionary’ has both a well-founded variant, the dictionary, and a non-well-founded variant, the multi-dictionary.

1.2.B.4 Sets and Power Sets

Similar to a dictionary, if a set is known to be well-founded, then it is easy to specify that the set has exactly one instance of any element. However, if a set is potentially non-well-founded, it is impossible to implement this uniqueness constraint. This means that the concept of ‘set’ has both a well-founded variant, the set, and a non-well-founded variant, the multi-set. Similarly, we have well-founded powersets as well as potentially non-well-founded multi-powersets.
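The container descriptions in this appendix can be sketched in Python as follows. This is an illustration under stated assumptions, not the JoyLoL container implementation: the class names `Stack`, `Dictionary` and `MultiDictionary` and their method names are hypothetical, keys are assumed to be hashable Python values, and the ‘multi’ variant simply declines to enforce uniqueness, which is only a finite stand-in for the non-well-founded case.

```python
class Stack:
    """A container accessed only through push and pop, implemented using a list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()


class Dictionary:
    """Well-founded variant: exactly one value is kept for any key."""

    def __init__(self):
        self._pairs = {}

    def add(self, key, value):
        self._pairs[key] = value

    def find(self, key):
        return self._pairs.get(key)


class MultiDictionary:
    """Variant without the uniqueness constraint: every added pair is kept."""

    def __init__(self):
        self._pairs = []

    def add(self, key, value):
        self._pairs.append((key, value))

    def find_all(self, key):
        """Return every value recorded for `key`, oldest first."""
        return [value for k, value in self._pairs if k == key]
```

The same contrast carries over to sets and multi-sets: a Python `set` enforces uniqueness of elements, whereas a plain list of elements does not.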

1.3 Computational examples

1.3.1 Tree of binary processes

1.3.2 Power multi-set

1.3.3 Coalgebra

TODO: Develop the co-algebraic isomorphism using the power multi-set functor.

1.3.4 Co-Lawvere Theorem

1.4 Formal System

By its very nature, the process of founding Mathematics is circular. All previous attempts at founding Mathematics have quietly assumed Mathematical Logic, in one form or another, as a given. Less obviously, all previous foundations of Mathematics have also implicitly assumed that something can compute the logical validity of a proof, and hence have implicitly assumed the Mathematical theory of computation.

Since we can, “from moment to moment, see that we have made marks in the sand”, we will simply assume that we can compute. But if we are going to found Mathematics by Computing things, how do we know that what we have computed is really what we set out to compute? How do we compute the validity of what we have computed? What do we mean by the validity of what we have computed? Previous attempts to found Mathematics answered their equivalent questions using Logic.

Talk about Mathematicians as (Rigorous) (Computer) Language designers... mathematicians design languages with which to compute interesting ‘results’ in a rigorous way.

Talk about intuition, classical proofs, computational proofs.

Talk about proof relevant mathematics and proof existent mathematics.

Talk about how the classical/intuitionistic constructive/non-constructive debate misses the point. All of mathematics is one... the real question is which "collections" of mathematical objects admit proofs of the various types. Recursive sets are classical, recursively enumerable sets are intuitionistic. Non-constructive proofs require unbounded search using the axiom of choice.

Talk about proofs as computation traces. Proof sketches provide way-points in any given computation trace. Existence of a proof-computation-trace "proves" a theorem, but "having" computational traces, or various way-points, provides far more information for the purposes of both "understanding" and "generalization".

The "base space" of computational traces forms a category, while the fibres of logic or type theory form a topos. Essentially this is because JoyLoL computational traces can be concatenated but the logic "about" the computational traces must identify "things" "inside" each computation. Hence we do not use variables in the base space, but we do in the fibres.

Non-total computation is formally based upon the ideas from lindstrom1989nonWellFoundedSetsMartinLofTypeTheory.

NOTE that the uniformity of the computation from step to step is critical to providing a well defined computation at infinity.

Object Language

Meta-Language

We need the ability to specify (sub)collections of ‘computational results’.

A Tale of many languages

talk about joylol and joylol0

talk about conservative extensions

talk about mathematicians/programmers versus sceptical philosophers/business analysts

talk about intuition being operationalized... how do we tell if we have operationalized correctly?

talk about intuitive "ordinals", ""

Need to "prove" something simple

Define DictNode:

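    // JClass, TagType, FlagsType and Symbol are assumed to be declared elsewhere.
    typedef struct joylol_object_struct {
      JClass    *type;
      TagType    tag;
      FlagsType  flags; // an arbitrary collection of bits
    } JObj;

    // Forward declaration, so that DictNodeObj can refer to itself below.
    typedef struct dictNode_object_struct DictNodeObj;

    struct dictNode_object_struct {
      JObj         super;
      Symbol      *symbol;
      JObj        *preObs;
      JObj        *value;
      JObj        *postObs;
      DictNodeObj *left;
      DictNodeObj *right;
      DictNodeObj *previous;
      DictNodeObj *next;
      size_t       height;
      long         balance;
    };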

1.4.1 Mappae Mundi

Our foundation of mathematics is best understood as an, alas, fairly complex multi-induction of a Categorical natural transformation between a pair of functors. While we do not require Category theory for this induction, using ideas from Category theory helps us navigate through this complexity. In effect Category theory provides one (of many) mappa mundi.

The ideas required to navigate through this multi-induction come from Bart Jacobs’ pair of books [Jac99] and [Jac17] combined with Harel, Kozen and Tiuryn’s [HKT00] as well as Demri, Goranko and Lange’s [DGL16].

Coalgebras, bisimulations, modal logic, invariants and liftings of both relations and predicates permeate everything we do, [Jac17]. In particular, modal logic via the predicate lifting approach, using the natural transformation between a pair of functors, see Definition 6.5.1 [Jac17]²⁰, is particularly important. However the modal logic that we are after is Dynamic Logic, see sections 5.2 and 11.3 [HKT00]. This ‘modal logic’ actually requires the lifting of both predicates and relations.

The critical complication comes from our need to ‘prove’ that the natural transformation between the pair of functors, which is a computation, actually computes what we say it computes. To do this we will make use of the tools found in [DGL16] culminating in the ‘Satisfiability-Checking Game for $L_\mu$’ found in section 15.4.1. Central to this is the use of the least and greatest fixed point operators, $\mu$ and $\nu$, in a $\mu$-modal logic variant of Dynamic Logic.

We will use the ‘Lawvere’s mating rule’ definition of $\forall$ and $\exists$ as described in Jacobs’ Lemma 4.1.8 [Jac99]. However, since we are defining a concatenative computational language, instead of forming fibres over sets of variables we fibre over lists. The fibres are then all the possible ‘valuations’ of a given list.

In all of this, the critically important thing is that the syntax of the Dynamic (modal) logic is interpreted in the semantics of the coalgebra of lists of lists.

In the literature²¹, the ‘predicate lifting’ approach, using a natural transformation between a pair of functors, while syntactically more concrete, is seen as slightly more arbitrary in that there are potentially more choices of syntactical functors. While strictly not necessary, we will follow the more concrete syntactical ‘predicate lifting’ approach to make it easier for mere mortals, such as ourselves, to follow the construction. Part of this construction will however emphasize how each particular syntactical structure is written in a more formal list of lists format.

In all cases, the functors we construct are computable functions written in the JoyLoL₀ language which is the chosen syntax. We ‘implement’ this language in two ways. Firstly, the natural transformation is the implementation (interpreter) of the syntax, JoyLoL₀, in the semantics of list of lists. However, to provide a portable interpreter of the JoyLoL₀ language, we annotate the natural transformation with small chunks of ANSI-C code, which when collectively compiled using an ANSI-C compiler, provide an executable which can interpret any JoyLoL₀ code on any modern computer. In particular, this means that JoyLoL₀ interprets itself.

As well as annotating the natural transformation with ANSI-C code, an integral part of the JoyLoL₀ code which implements the natural transformation is a running Hoare logic description of the pre and post conditions of each computational step.

In order to keep the complexity of this construction to a minimum, we keep the initial JoyLoL₀ interpreter (the natural transformation) as simple as possible. However, this means that the JoyLoL₀ interpreter is not able to make even obvious inferences.

²⁰ See also [KP11] discussion of the same ideas.
²¹ See for example, [Jac17] section 6.5 and [KP11].

Once JoyLoL₀ has been accepted as correct, future, more sophisticated, versions of JoyLoL can be built based upon JoyLoL₀. In particular, these more sophisticated versions of JoyLoL can be given greater inferential power with which to make the proofs of computational correctness simpler.

For the rest of this subsection we sketch, using a high-level, classical, Categorical point of view, the construction of the JoyLoL₀ interpreter. Being mappae mundi, each of the classical parts of this sketch are merely suggestive of the actual construction we will need. However, in each case, these mappae mundi provide useful points of reference with which to orient ourselves as we build our actual construction. To help keep the parallels to mind we will keep the ‘names’ of each ‘component’ part we sketch here, in the final construction below.

We begin by taking six ‘component’ parts from Definition 6.5.1 of [Jac17]:

1. A functor, $F : \mathsf{LoL} \to \mathsf{LoL}$, which ‘defines’ the Coalgebra of Lists of Lists. This coalgebra of Lists of Lists is the semantics of the JoyLoL₀ computational language.

2. The terminal $F$-Coalgebra $(\mathsf{LoL},\ c : \mathsf{LoL} \to F(\mathsf{LoL}))$. We assert that this terminal coalgebra actually exists.

3. A functor, $L : \mathsf{LoL} \to \mathsf{LoL}$, which ‘defines’ the Algebra of the $\mu$-modal Dynamic Logic which we call JoyLoL₀. This algebra is the syntax of the JoyLoL₀ computational language.

4. The initial $L$-Algebra $(\mathsf{L},\ \gamma : L(\mathsf{L}) \to \mathsf{L})$.

5. A functor, $P = 2^{(-)} : \mathsf{LoL}^{op} \to \mathsf{LoL}$, which is the contravariant ‘powerset’ functor.

6. A natural transformation, $\delta : LP \Rightarrow PF$, which is the interpreter of the JoyLoL₀ computational language.

Following Chapter 2 of [Jac99], we can view JoyLoL₀ as essentially a ‘simple type theory’ with the single basic type, LoL. Contexts are important. For us a context is actually a (data) stack description, see [Dig08]. This means that the ‘rules’ of context (stack) manipulation will be slightly different for our case.

TODO: describe the classifying category Definition 2.1.1 [Jac99]... is essentially JoyLoL₀ ‘programs’. The arrows are the process stack while the objects are the data stack.

TODO: explore the $\lambda 1$ and $\lambda 1_{\times}$-calculi described in section 2.3 [Jac99]. Comment on the fact that propositions as types are JoyLoL₀ programs. Where do the type formers fit in to our work?

TODO: In the terminology of [Jac99], the $\mu$-modal Dynamic Logic is really the simultaneous addition of a Simple Type Theory (the program part) and a First-order predicate logic (the predicate part) at the same time. See Chapters 2 and 4 [Jac99]. In particular see sections 4.2.4 through 4.2.7 of [Jac99]. We will retain Jacobs’ structured contexts by having a stack description as well as a separate collection of predicates in every context. We will also be interested in the functorial semantics described in section 2.2 of [Jac99]. Since we are building a fixed point of the functorial semantics operator ...

TODO: describe the structure of our Dynamic Logic sections 5.1, 5.2, 11.1, 11.2, and 11.3 in [HKT00]

What is a set? Computationally there are many ways to implement a ‘set’. The easiest is to simply list all of the elements, once for each element. For data it is easy to ensure a given element is contained in the list exactly once. While we will show that it is possible to sort a process, and hence remove duplicates, it is equally easy to realize that one can only remove all duplicates in the infinitely sorted limit. Any particular finite sorted approximation to the infinite sort may always have multiple copies of any given element. This suggests that ‘multi-sets’ or ‘bags’ are a more natural computational equivalent to the axiomatic concept of ‘set’.

TODO: Find references for multi-sets.

What then is the computational equivalent of the ‘power-set’? It should be the ‘power-multi-set’.

TODO: While the collection of all sub-multi-sets of a given multi-set can include arbitrary multiplicities, we should define a sub-multi-set to only have multiplicities less than or equal to the multiplicities of the parent multi-set. We can then define a characteristic function to choose/reject a given multi-set element. Then the power-multi-set is the collection of all of these characteristic functions.

TODO: discuss functions based upon interpreted code OR lists.

Finally, at the moment, we will only provide the base case of $\omega$-computational foundations.
The additional computational ‘axiom’ required to get us from the base case to all of mathematics is simple and, from a classical computational point of view, fairly evident, but will require some philosophical reflection. We leave this computational ‘axiom’ until much later in this document where we can give it the careful consideration it requires.

TODO: Provide a selection of useful references to Category/Topos theory
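Returning to the power-multi-set discussed in this subsection, the proposed restriction (a sub-multi-set may only have multiplicities less than or equal to those of its parent) can be sketched for finite multi-sets in Python. This is an illustration only, assuming a multi-set is represented by `collections.Counter`; the function names `sub_multisets` and `is_sub_multiset` are hypothetical. Each choice of multiplicities plays the role of one ‘characteristic function’.

```python
from collections import Counter
from itertools import product


def is_sub_multiset(candidate, parent):
    """True if every multiplicity in `candidate` is bounded by that in `parent`."""
    return all(count <= parent[element] for element, count in candidate.items())


def sub_multisets(parent):
    """Yield every sub-multi-set of the finite multi-set `parent` as a Counter.

    Each tuple of chosen multiplicities acts as a 'characteristic function':
    for every element it selects how many copies (0 up to the parent's
    multiplicity) are kept.
    """
    elements = list(parent)
    choices = [range(parent[element] + 1) for element in elements]
    for counts in product(*choices):
        yield Counter({e: n for e, n in zip(elements, counts) if n > 0})


parent = Counter("aab")            # the multi-set {a, a, b}
power = list(sub_multisets(parent))
```

For a finite parent, the number of sub-multi-sets in this sketch is the product of (multiplicity + 1) over the distinct elements, so the multi-set {a, a, b} has 3 × 2 = 6 of them.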

1.4.2 Semantic Functor

Classical commentary

We begin by building the Semantic Functor, $F$, together with its associated $F$-Coalgebra map, $c : \mathsf{LoL} \to F(\mathsf{LoL})$. Using the notation from Section 2.2.2 in [Jac17] on ‘Binary $A$-Trees’, our functor will be essentially $F : \mathsf{LoL} \to \mathsf{LoL} \times \mathsf{LoL}$.
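To give a feel for what a coalgebra map of this shape provides, the following Python sketch treats a finite collection of named states as a coalgebra and computes which states are bisimilar by repeatedly refining a partition. It is an illustration under assumptions, not the construction of this section: the structure map is encoded as a dictionary sending each state to a triple (is nil, ‘car’ state, ‘cdr’ state), the nil flag is an added observation, and the names `bisimilarity_classes` and `EXAMPLE` are hypothetical. Because states may refer to themselves, the example includes non-well-founded (cyclic) LoLs.

```python
def bisimilarity_classes(structure):
    """Map each state to a class label; equal labels mean the states are bisimilar.

    `structure` plays the role of the coalgebra map: it sends each state to a
    triple (is_nil, car_state, cdr_state).
    """
    # Start from the coarsest partition: states are separated only by isNil.
    block = {state: structure[state][0] for state in structure}
    while True:
        # A state's signature records its own block and the blocks of its car and cdr.
        signature = {
            state: (block[state], block[structure[state][1]], block[structure[state][2]])
            for state in structure
        }
        labels = {}
        refined = {state: labels.setdefault(signature[state], len(labels))
                   for state in structure}
        # Each round can only split blocks, so an unchanged count means we are done.
        if len(set(refined.values())) == len(set(block.values())):
            return refined
        block = refined


EXAMPLE = {
    "nil": (True, "nil", "nil"),   # car and cdr of nil are taken to be nil
    "w": (False, "nil", "nil"),    # the one-element list (nil)
    "x": (False, "nil", "x"),      # an endless list of nils, defined by a self-loop
    "y": (False, "nil", "z"),      # the same endless list, defined by a two-state cycle
    "z": (False, "nil", "y"),
}
classes = bisimilarity_classes(EXAMPLE)
```

In this example the three cyclic states fall into one class even though they are distinct names, which is the sense in which only bisimilarity, and not identity, is observable.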

1.4.3 Syntactic Functor

We now proceed by building the Syntactic Functor, $L$, together with its associated $L$-Algebra map, $\gamma : L(\mathsf{L}) \to \mathsf{L}$.

TODO: While at the same time we will build the natural transformation, $\delta : LP \Rightarrow PF$.

TODO: We need to build the injection from $\mathsf{L} \to \mathsf{LoL}$.

2 Implementing JoyLoL in JoyLoL

In this part, we provide a full, but very simple, implementation of JoyLoL in itself²². We will provide full proofs of this implementation’s correctness.

To keep this implementation simple, we use very simple container implementations which are unlikely to be performant for large scale use. Our companion document, [Gai17], provides a slightly more performant implementation using the ANSI-C programming language wrapped in Lua and loadable into LuaTEX (and hence into ConTEXt). It is this implementation, loaded into ConTEXt, which has been used to formally check the correctness of all assertions in this document.

²² There is a very long tradition of writing computer languages in themselves, see for example the original implementation of Lisp in Lisp, [MAE+65], or the very first C-compiler for the earliest Unix, [].

2.1 Abstract Syntax Trees

2.2 Parser

2.3 Type inferencer

2.4 Verifier

2.5 Interpreter

Bibliography

Acz88 Aczel, P. (1988). Non-well-founded sets. (Vol. 14). From CSLI Stanford website via Wikipedia.

AR94 Adamek, J. & Rosicky, J. (1994). Locally Presentable and Accessible Categories. (No. 189). Cambridge University Press. Retrieved from http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521422611

Awo09 Awodey, S. (2009). From Sets to Types to Categories to Sets. (Preprint). Department of Philosophy, Carnegie Mellon University. Retrieved from http://www.andrew.cmu.edu/user/awodey/preprints/stcsFinal.pdf

Bar93 Barr, M. (1993). Terminal coalgebras in well-founded set theory. doi:10.1016/0304-3975(93)90076-6

Bar94 Barr, M. (1994). Additions and corrections to “Terminal coalgebras in well-founded set theory”. doi:10.1016/0304-3975(94)90060-4

BM96 Barwise, J. & Moss, L. S. (1996). Vicious Circles. (No. 60). CSLI Publications. Retrieved from https://web.stanford.edu/group/cslipublications/cslipublications/site/1575860082.shtml

BP07 Beck, J. M. & Pouget, A. (2007). Exact Inferences in a Neural Implementation of a Hidden Markov Model. Neural Computation, 19, 1344-1361.

Bro03 Brodie, L. (2003). Starting FORTH (M. Hendrix, Ed.). forth.com. Retrieved from https://www.forth.com/starting-forth/

Bro04 Brodie, L. (2004). Thinking FORTH. thinking-forth.sourceforge.net. Retrieved from http://thinking-forth.sourceforge.net/

CS09 Coquand, T. & Spitters, B. (2009). Integrals and Valuations. ArXiv, (808).

DGL16 Demri, S., Goranko, V., & Lange, M. (2016). Temporal Logics in Computer Science: Finite-State Systems. (Vol. 58). Cambridge University Press. doi:10.1017/CBO9781139236119

Den87 Dennett, D. C. (1987, 10). The Intentional Stance. MIT Press.

Dig08 Diggins, C. (2008). Simple Type Inference for Higher-Order Stack-Oriented Languages. (Technical Report, 12 pages). www.cdiggins.com. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.156.406

DI08 Doering, A. & Isham, C. J. (2008). ‘What is a Thing?’: Topos Theory in the Foundations of Physics. ArXiv, (803). Retrieved from http://arxiv.org/abs/0803.0417

DIP+07 Doya, K., Ishii, S., Pouget, A., & Rao, R. P. N. (2007). Bayesian Brain: Probabilistic Approaches to Neural Coding. MIT Press. Retrieved from https://mitpress.mit.edu/books/bayesian-brain

Ewe04 Ewert, J. P. (2004). Motion Perception Shapes the Visual World of Amphibians. In F. R. Prete (Ed.), Complex Worlds from Simpler Nervous Systems. MIT Press. Retrieved from https://mitpress.mit.edu/books/complex-worlds-simpler-nervous-systems

Gai17 Gaito, S. T. (2017). Implementing JoyLoL. (Technical Report). PerceptiSys Ltd.

Gor79 Gordon, M. J. C. (1979). The Denotational description of Programming Languages: An introduction. Springer-Verlag.

HKT00 Harel, D., Kozen, D., & Tiuryn, J. (2000). Dynamic Logic. MIT Press. Retrieved from https://mitpress.mit.edu/books/dynamic-logic

Hei67 Heidegger, M. (1967). What is a thing?. Regenery/Gateway. Retrieved from http://www.worldcat.org/title/what-is-a-thing/oclc/000178558

HLS09 Heunen, C., Landsman, N. P., & Spitters, B. (2009). A topos for algebraic quantum theory. ArXiv, (709).

Jac06 Jackson, M. T. (2006, 4). A Sheaf Theoretic Approach to Measure Theory. (PhD).

Jac99 Jacobs, B. (1999). Categorical Logic and Type Theory. (Vol. 141). Elsevier.

Jac17 Jacobs, B. (2017). Introduction to Coalgebra: Towards Mathematics of States and Observation. Cambridge University Press. doi:10.1017/CBO9781316823187

Kle09 Kleene, S. C. (2009). Introduction to Meta-Mathematics. ISHI Press International. (Reprint of the 1952 version published by North-Holland)

Kli11 Klin, B. (2011). Bialgebras for Structural Operational Semantics:. doi:10.1016/j.tcs.2011.03.023

KP04 Knill, D. & Pouget, A. (2004). The Bayesian brain: the role of uncertainty in neural coding and computation. Trends in Neuroscience, 27(12), 712-719. doi:10.1016/j.tins.2004.10.007

KP11 Kupke, C. & Pattinson, D. (2011). Coalgebraic semantics of modal logics: an overview. doi:10.1016/j.tcs.2011.04.023

Law75 Lawvere, F. W. (1975). Continuously variable sets: Algebraic Geometry = Geometric logic. In H. E. Rose & J. C. Shepherdson (Eds.), Logic Colloquium ’73. (Vol. 80). Elsevier.

LR03 Lawvere, F. W. & Rosebrugh, R. (2003). Sets for Mathematics. Cambridge University Press.

MAE+65 McCarthy, J., Abrahams, P. W., Edwards, D. J., Hart, T. P., & Levin, M. I. (1965). LISP 1.5 Programmer’s Manual. (2nd ed.). MIT Press.

RG02 Raichle, M. E. & Gusnard, D. A. (2002). Appraising the brain’s energy budget. doi:10.1073/pnas.172399499

RWD+99 Rieke, F., Warland, D., De Ruyter Van Steveninck, R., & Bialek, W. (1999). Spikes: Exploring the Neural Code. MIT Press. Retrieved from https://mitpress.mit.edu/books/spikes

Rus12 Russell, B. (1912). The problems of Philosophy. (No. 35). Henry Holt and Company / Williams and Norgate. Retrieved from https://archive.org/details/problemsofphilo00russuoft

RT94 Rutten, J. & Turi, D. (1994). Initial algebra and final coalgebra semantics for concurrency. In J. W. Bakker, W. P. Roever, & G. Rozenberg (Eds.), A Decade of Concurrency Reflections and Perspectives. (Vol. 803). Springer. doi:10.1007/3-540-58043-3_28

Sha01 Shalizi, C. (2001, 5). Causal Architecture, Complexity, and Self-Organization in Time Series and Cellular Automata. (PhD). Retrieved from http://bactra.org/thesis/

TP97 Turi, D. & Plotkin, G. (1997). Towards a Mathematical Operational Semantics. In Proceedings of the 12th LICS Conference. IEEE, Computer Society Press. Retrieved from http://www.dcs.ed.ac.uk/home/dt/towards.html

Hei67 van Heijenoort, J. (1967). From Frege to Gödel: a source book in mathematical logic 1879-1931. Harvard University Press.

Thu94 von Thun, M. (1994). Mathematical Foundations of Joy. (Preprint, 14 pp.). Philosophy Department, La Trobe University. Retrieved from http://www.kevinalbrecht.com/code/joy-mirror/j02maf.html

Thu94 von Thun, M. (1994). Overview of the language JOY. (Preprint). Philosophy Department, La Trobe University. Retrieved from http://www.kevinalbrecht.com/code/joy-mirror/j00ovr.html

Thu96 von Thun, M. (1996). A Joy interpreter written in Joy. (Preprint). Philosophy Department, La Trobe University. Retrieved from http://www.kevinalbrecht.com/code/joy-mirror/jp-joyjoy.html

Wal91 Walley, P. (1991). Statistical Reasoning with Imprecise Probabilities. (Vol. 42). Chapman and Hall.

Glossary

Symbols

t
isAtomicSymbol 44
isStructuralCharacter? 43
isStructuralSymbol 44
isSymbolCharacter? 43
isWhiteSpace? 43
symbolIsLessThan? 44

Subject Index

s symbol 44
