
Digital Environment Evolution Modelling and Simulation

Merrick Kenna Bengis

Thesis submitted in fulfilment of the requirements for the degree

Doctor Philosophiae

in the

Faculty of Science

Academy of Computer Science and Software Engineering

University of Johannesburg

Promotors: Prof EM Ehlers & Prof DA Coulter

January 2020

I. Abstract

The concurrent growth of the human population and advancement in technology, together with ever-changing social interaction, have led to the creation of a large, abstract and complex entity known as the Digital Environment. Continually growing and ever-evolving, the Digital Environment of today is almost unrecognisable from what it was when it emerged nearly 50 years ago.

The human population has grown rapidly in the past century, reaching nearly 8 billion people in 2019, already double what it was in 1975. This has created a world with more people than ever before, all of whom need to communicate with others, share information and form communities.

Technology also experienced unprecedented advancement in this time, with important inventions such as electricity, computational machines and communication networks. These technologies matured and, with the advent of the Internet, allowed people around the world to communicate as if they were next to each other. Presently, people all around the world are creating, sharing and consuming information, forming online communities, and growing the physical footprint of the Internet and all connected devices.

The intersection of these events formed the Digital Environment: an amalgamation of the physical, digital and cyber worlds. It is evident how rapidly and completely the Digital Environment has evolved in the past few decades, so what is in store for the future? Can people prepare for what the Digital Environment is to become and possibly even change its course?

This thesis proposes a novel model for the simulation and prediction of the evolution of the Digital Environment: the Digital Environment Evolution Modelling and Simulation (DEEv-MoS) model. The DEEv-MoS model proposes a method that draws on well-established and commonly used fields of research to create a holistic simulation of the Digital Environment and its many parts.

Through the use of intelligent agents, entity component systems and machine learning, accurate simulations can be run to determine how the future digital landscape will grow and change. This allows researchers to further understand what the future holds and prepare for any eventualities, whether they are positive or negative.

Keywords: Multi-Agent Systems, Machine Learning, Entity-Component-Systems, Digital Environment, Digital Evolution.


II. Acknowledgements

I would like to thank my thesis supervisors, Prof. E.M. Ehlers and Prof. D.A. Coulter, for their continual guidance and support throughout the completion of this dissertation. Without their insight and opinion this thesis would not have been possible. Their encouragement drove me to work hard on this endeavour and not to let up, while their combined experience and knowledge guided me in delivering my research in the correct manner.

I would also like to thank my family and friends for their support during this time, as I know it has not been easy for them. They have had to put up with my long working hours and with most of my attention being placed on my research. Without their acceptance and support this thesis would have been extremely difficult to complete.

I would like to thank Prof. C.H. MacKenzie for editing my thesis and ensuring that the final document delivered was up to standard.

Lastly, I would like to thank the National Research Foundation (NRF) for their financial support. Without it I would not have been able to pursue this research, which has required much time and effort, and has incurred various costs.

III. Table of Contents

I. ABSTRACT
II. ACKNOWLEDGEMENTS
III. TABLE OF CONTENTS
IV. LIST OF FIGURES
V. LIST OF TABLES
VI. LIST OF ABBREVIATIONS

1 INTRODUCTION
  1.1 INTRODUCTION
  1.2 BACKGROUND LITERATURE
  1.3 OBJECTIVES
  1.4 RESEARCH QUESTIONS
  1.5 RESEARCH METHODOLOGY
  1.6 THESIS STRUCTURE
    1.6.1 Part One: Introduction
    1.6.2 Part Two: Literature review
    1.6.3 Part Three: Model
    1.6.4 Part Four: Implementation, critical evaluation and conclusion
  1.7 CONCLUSION
2 THE DIGITAL ENVIRONMENT
  2.1 THE DIGITAL ENVIRONMENT
  2.2 HARDWARE AND SOFTWARE CHANGES
    2.2.1 Computer and network power
    2.2.2 Virtualisation of operating systems
    2.2.3 Advanced local and internet-wide searching
    2.2.4 Broadband and wireless connectivity proliferation
    2.2.5 Convergence of technologies
  2.3 SOCIAL CHANGES
    2.3.1 Technology development in different locales
    2.3.2 Expanding online communities
  2.4 DIGITAL AND SMART CITIES
  2.5 CONCLUSION
3 INTELLIGENT AGENTS
  3.1 INTELLIGENT AGENTS, THEIR TYPES AND APPLICATIONS
    3.1.1 The task environment
    3.1.2 Types of agent programs
  3.2 MULTI-AGENT SYSTEMS
    3.2.1 Cooperative multi-agent systems
    3.2.2 Competitive multi-agent systems
  3.3 CONCLUSION
4 MACHINE LEARNING
  4.1 MACHINE LEARNING ALGORITHMS
    4.1.1 The experience
    4.1.2 The task
    4.1.3 The performance measure
  4.2 NEURAL NETWORKS
    4.2.1 Perceptrons
    4.2.2 Sigmoid neurons
  4.3 DEEP LEARNING
    4.3.1 Increasing dataset size
    4.3.2 Increasing model size
    4.3.3 Increased accuracy and impact
  4.4 EXTREME LEARNING MACHINES
    4.4.1 Feed-forward neural network
    4.4.2 Extreme learning machine
  4.5 CONCLUSION
5 PROGRAMMING PARADIGMS IN SOFTWARE ENGINEERING
  5.1 PROGRAMMING PARADIGM
    5.1.1 Observable nondeterminism
    5.1.2 State
  5.2 PROGRAMMING CONCEPTS
    5.2.1 Independence
    5.2.2 Records
    5.2.3 Lexically scoped closures
    5.2.4 Named state
  5.3 DATA ABSTRACTION
    5.3.1 Inheritance and composition
    5.3.2 Polymorphism
  5.4 OBJECT-ORIENTED PROGRAMMING
  5.5 ENTITY COMPONENT SYSTEMS
    5.5.1 Data-driven vs object-driven
    5.5.2 Entities
    5.5.3 Components
    5.5.4 Systems
  5.6 CONCLUSION
6 PROBLEM BACKGROUND
  6.1 THE GROWING WORLD
  6.2 ADVANCING TECHNOLOGY
  6.3 EVOLVING INTERACTIONS
  6.4 CONCLUSION
7 DEEV-MOS
  7.1 CONSIDERING THE DEEV-MOS MODEL IN THE CONTEXT OF DIGITAL ENVIRONMENTS
  7.2 THE DEEV-MOS MODEL
    7.2.1 The environment component
    7.2.2 The entity component system component
    7.2.3 The constraints engine
    7.2.4 The predictive modelling engine
    7.2.5 The events engine
  7.3 CONCLUSION
8 DEEV-MOS: DIGITAL ENVIRONMENT
  8.1 THE DIGITAL ENVIRONMENT VS A DIGITAL ENVIRONMENT
  8.2 DEEV-MOS DIGITAL ENVIRONMENT ENTITIES
    8.2.1 Users (agents)
    8.2.2 Data
    8.2.3 Network node/server
    8.2.4 Network path
    8.2.5 Fixed endpoint device
    8.2.6 Mobile endpoint device
  8.3 DEEV-MOS DIGITAL ENVIRONMENT TASK ENVIRONMENT
    8.3.1 PEAS description
    8.3.2 Task environment properties
  8.4 CONCLUSION
9 DEEV-MOS: ENTITY COMPONENT SYSTEM
  9.1 ENTITIES
  9.2 COMPONENTS
  9.3 SYSTEMS
  9.4 CONCLUSION
10 DEEV-MOS: PREDICTIVE MODELLING ENGINE
  10.1 DEEV-MOS MACHINE LEARNING
    10.1.1 Storage size
    10.1.2 Computational power
    10.1.3 Network speed
    10.1.4 File size
    10.1.5 Network size
  10.2 CONCLUSION
11 DEEV-MOS: CONSTRAINTS ENGINE
  11.1 FUNCTION OF THE CONSTRAINTS ENGINE
    11.1.1 Predictive modelling engine interaction
    11.1.2 Digital environment interaction
    11.1.3 Entity component system interaction
    11.1.4 Events engine interaction
  11.2 DEEV-MOS CONSTRAINTS
    11.2.1 Digital environment constraints
    11.2.2 Entity component system constraints
    11.2.3 Events engine constraints
  11.3 CONCLUSION
12 DEEV-MOS: EVENTS ENGINE
  12.1 ROLE OF THE CONSTRAINTS ENGINE
  12.2 POPULATION EVENTS
  12.3 TECHNOLOGICAL EVENTS
  12.4 SOCIAL EVENTS
  12.5 CONCLUSION
13 DEEV-MOS IMPLEMENTATION: DEEP
  13.1 DEVELOPMENT PLATFORM
  13.2 THE UNITY GAME ENGINE
    13.2.1 Motivation for using Unity
  13.3 IMPLEMENTATION OF DEEV-MOS PROTOTYPE
    13.3.1 The DEEv-MoS digital environment
    13.3.2 The DEEv-MoS entity component system
    13.3.3 DEEv-MoS agents
    13.3.4 The DEEv-MoS predictive modelling engine
    13.3.5 The DEEv-MoS constraints engine
    13.3.6 The DEEv-MoS events engine
  13.4 CONCLUSION
14 RESULTS
  14.1 RESULTS
    14.1.1 Testing setup
    14.1.2 Testing metrics
    14.1.3 Testing results
  14.2 CONCLUSION
15 CRITICAL EVALUATION AND CONCLUSION
  15.1 THESIS SUMMARY
  15.2 DEEV-MOS CONTRIBUTION TO THE RESEARCH DOMAIN
    15.2.1 Research question review
  15.3 CRITICAL EVALUATION
    15.3.1 Critique of the model
    15.3.2 Future research
  15.4 CONCLUSION
16 APPENDIX A: EVOLUTION
  16.1.1 Genetic algorithms
REFERENCES

IV. List of Figures

Figure 3.1 Simple reflex agent program structure schematic (Russell & Norvig, 2010).
Figure 3.2 The model-based reflex agent program structure schematic (Russell & Norvig, 2010).
Figure 3.3 Goal-based agent program structure schematic (Russell & Norvig, 2010).
Figure 3.4 Utility-based agent program structure schematic (Russell & Norvig, 2010).
Figure 3.5 Learning agent program structure schematic (Russell & Norvig, 2010).
Figure 4.1 The relationship between error and capacity, depicting the underfitting and overfitting regimes (Goodfellow et al., 2016).
Figure 4.2 General shallow neural network structure, showing inputs, neurons, connections and outputs (Nielsen, 2015).
Figure 4.3 High-level perceptron structure (Nielsen, 2015).
Figure 4.4 Sigmoid function shape, showing how the z value affects the output value for the sigmoid neuron (Nielsen, 2015).
Figure 4.5 Perceptron function shape; a step function (Nielsen, 2015).
Figure 4.6 The sigmoid function’s smoothing of the step function allows for small changes in weight to translate into small changes in the output (Nielsen, 2015).
Figure 4.7 The Neural Network Zoo, representing the progression and various architectures of modern neural networks (Van Veen & Leijnen, 2019).
Figure 5.1 Relationship between programming languages, paradigms and concepts (Van Roy, 2009).
Figure 5.2 A taxonomy of programming paradigms, grouped by key properties, and showing concepts that differ from one to another (Van Roy, 2009).
Figure 5.3 Levels of support for state with main paradigm families arranged by expressiveness (Van Roy, 2009).
Figure 5.4 The state transformer configuration for a program (Van Roy, 2009).
Figure 5.5 Structure of a data abstraction (Van Roy, 2009).
Figure 5.6 Data abstractions plotted against state and bundling (Van Roy, 2009).
Figure 5.7 Inheritance and composition from a high-level principle view (Van Roy, 2009).
Figure 5.8 Object-oriented inheritance hierarchy for game design (Lord, 2012).
Figure 5.9 Entity component system design structure (Unity Technologies, 2019).
Figure 5.10 Entities are represented with numeric IDs and serve on a high level as a list of components (Martin, 2007).
Figure 6.1 Human population growth over the past 12,000 years (Roser, Ritchie & Ortiz-Ospina, 2019).
Figure 6.2 The growth in internet usage by population percentage (Roser, Ritchie & Ortiz-Ospina, 2019).
Figure 7.1 The DEEv-MoS Model.
Figure 8.1 The digital environment component of the DEEv-MoS model.
Figure 8.2 The DEEv-MoS digital environment and entities.
Figure 8.3 UML component diagram describing the relationships between entities in the DEEv-MoS digital environment (created with draw.io).
Figure 9.1 The entity component system component of the DEEv-MoS model.
Figure 9.2 UML describing the DEEv-MoS entity component system.
Figure 10.1 The predictive modelling engine component of the DEEv-MoS model.
Figure 10.2 Planned neural network architecture of extreme learning machine in the predictive modelling engine.
Figure 11.1 The constraints engine component of the DEEv-MoS model.
Figure 11.2 Component diagram of the constraints engine and data flow with the other DEEv-MoS components.
Figure 12.1 The events engine component of the DEEv-MoS model.
Figure 12.2 Component diagram of the events engine component of the DEEv-MoS model.
Figure 13.1 Example configuration of a learning environment within Unity ML-Agents (Juliani, 2017).
Figure 13.2 The Unity game development environment.
Figure 13.3 Component diagram of the DEEP entity component system.
Figure 13.4 Example of the Unity ECS in practice (Unity Technologies, 2019).
Figure 13.5 Sample run of a simulation with the DEEP prototype system.
Figure 13.6 High-level structure of DEEP prototype in terms of Unity.
Figure 13.7 DEEP prototype system ECS entity representations: (from left to right) agent, wireless device, wireless network path, fixed device, wired network path and server.
Figure 13.8 High-level graph structure of DEEP RNN.
Figure 13.9 DEEP RNN node structure.
Figure 13.10 DEEP RNN hidden layer 1 node expanded.
Figure 13.11 DEEP RNN hidden layer 2 node expanded.
Figure 13.12 Small DEEv-MoS digital environment on initialisation.
Figure 13.13 DEEv-MoS digital environment after events that introduced further agents, devices and network paths.
Figure 14.1 Effect of simulation duration on digital environment accuracy rating.
Figure 14.2 Effect of initial population size on digital environment accuracy rating.
Figure 14.3 Effect of initial server number on digital environment accuracy rating.
Figure 14.4 The distribution of training data for the predictive modelling engine.

V. List of Tables

Table 15.1 DEEv-MoS prototype system test run results.
Table 16.1 Summary of secondary research questions and their related chapters.

VI. List of Abbreviations

AI Artificial Intelligence
CNN Convolutional Neural Network
CPS Cyber Physical Systems
CSP Constraint Satisfaction Problem
DE Digital Environment
DEEP Digital Environment Evolution Predictor
DEEv-MoS Digital Environment Evolution Modelling and Simulation
ECS Entity Component System
ELM Extreme Learning Machine
FFNN Feed-forward Neural Network
GA Genetic Algorithm
GUI Graphical User Interface
GUID Globally Unique Identifier
IoT Internet of Things
MAS Multi-Agent System
ML Machine Learning
MLFN Multiple-hidden Layer Feed-forward Neural Network
MLP Multi-layer Perceptron
NN Neural Network
OOP Object-Oriented Programming
RFC Request for Comments
RNN Recursive Neural Network
SLFN Single-hidden Layer Feed-forward Neural Network
TCP Transmission Control Protocol
UI User Interface
UML Unified Modelling Language


1 Introduction

1.1 Introduction

The Internet and modern networks have changed how people communicate, interact and learn. With information from around the world being made available at the touch of a button, the world has, in many ways, become a smaller place (Perkins & Thomson, 2016). Physical machines and hardware are used to create the illusion of an infinitely large place where all information can exist. Furthermore, many important details about a person now exist in an abstract space that has no defined boundaries and limits, and many people rely on the information stored online to function in modern society (O’Connell, 2012).

The Digital Environment, formed by the amalgamation of the various physical, logical and abstract entities used in the modern world and encompassing all spheres of cyberspace (the electronic medium that facilitates online communication through a global computer network), is a complex and ever-changing space (Laifa, Akrouf & Maamri, 2015). The Digital Environment differs from a digital environment in much the same way that the Internet differs from an internet: the former refers to a worldwide entity and the latter to a smaller, localised entity that is a subset of the Digital Environment. For brevity and clarity, through the remainder of this thesis the Digital Environment will be denoted by DE, while digital environments will remain as is. It can be overwhelming to consider the DE in its entirety and, for simplicity’s sake, it can be divided into multiple smaller and more manageable digital environments that are merely subsets of the whole. Even on this more granular scale, digital environments are growing at an exponential rate, with many more networks and computing devices being produced and used every year (Skoudis, 2009).

The Internet and both traditional computing and mobile devices are in a constant state of change and growth (The Radacati Group, 2014). With such a high rate of growth and change, it is difficult to predict how a digital environment will evolve and whether this evolution will have any adverse effects. Predicting change can be a complex process in which many factors that may influence the subject need to be considered (Lehmann, Rolfsen & Clark, 2015). Many intricate interdependencies may exist between the variables of the problem space, for example, the relationship between hardware development (such as microchip density) and the capacity for larger, more stable networks (Holloway, 2004).

There are existing ‘rules’ that researchers, technologists and business people have used over the years as guidelines for predicting how technology and its use will grow over time. These rules are generally superficial, dealing with only a single aspect of a given technology and ignoring the broader, long-term effect that changes across multiple technologies will have together (The Tran, Eklund & Cook, 2013).
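The best known of these rules, Moore’s law, projects a single quantity forward by assuming it doubles every fixed period. The following minimal sketch illustrates this kind of single-aspect extrapolation; the function name and the example figures are invented for illustration only:

```python
def extrapolate_doubling(initial_count, years_elapsed, doubling_period_years=2.0):
    """Project a quantity that doubles every fixed period (a Moore's-law-style rule)."""
    return initial_count * 2 ** (years_elapsed / doubling_period_years)

# A hypothetical chip with 1 million transistors, projected 10 years ahead
# with a 2-year doubling period: 1e6 * 2**5 = 32 million.
projected = extrapolate_doubling(1_000_000, 10)
```

Such a projection says nothing about, for example, how denser chips interact with network capacity or user behaviour, which is precisely the limitation noted above.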

Artificial intelligence (AI) has developed vastly over the last two decades, with new fields of focus being created to deal with specific problems faced in the real world. The various sub-fields that make up AI have given us tools to solve generic types of problems, which can be specialised based on the intricacies of available data, knowledge of the environment and interdependencies.

AI has allowed for the development of large-scale automated systems, controlled by computers with little to no human intervention. Computers have ‘learned’ how best to execute processes, evaluate risk versus reward, and draw on previous experience to predict the outcome of particular events or situations.

Multi-agent systems (MAS) form one of the sub-fields of AI. MASs are widely used today to apply computational intelligence to large volumes of smaller sub-problems that, together, constitute a much larger problem. MASs make use of numerous intelligent agents to solve a problem by assigning tasks to individuals or groups of agents.
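This divide-and-combine idea can be sketched in a few lines. The agent class and its task below are invented purely for illustration and are not the DEEv-MoS design; real MASs add communication, coordination and autonomy on top of this skeleton:

```python
class SummingAgent:
    """A trivial agent whose whole task is to solve one sub-problem: summing its chunk."""
    def __init__(self, chunk):
        self.chunk = chunk

    def solve(self):
        return sum(self.chunk)

def solve_with_agents(numbers, n_agents=3):
    """Split one large problem into sub-problems, assign each to an agent,
    and combine the partial results into the overall solution."""
    chunks = [numbers[i::n_agents] for i in range(n_agents)]
    agents = [SummingAgent(chunk) for chunk in chunks]
    return sum(agent.solve() for agent in agents)

total = solve_with_agents(list(range(1, 101)))  # sums 1..100 across three agents
```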

Machine learning (ML) is another sub-field of AI. It deals with the ability of computers to learn from previous experiences or data for a number of applications, including, but not limited to, the following:

• making choices geared towards achieving a particular outcome;
• identifying anomalies in systems or data;
• predicting various outcomes, both long term and short term; and
• classifying objects or behaviours.
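The last of these tasks, classification, can be illustrated with the simplest possible learner: a single perceptron trained on a handful of 2-D points. This is a hedged sketch with invented toy data, not the extreme learning machine used later in this thesis:

```python
def train_perceptron(samples, labels, epochs=20, lr=0.1):
    """Train a single perceptron (two weights and a bias) on 2-D points labelled 0 or 1."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(samples, labels):
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = y - pred                 # -1, 0 or +1
            w[0] += lr * err * x1          # nudge the weights towards the correct side
            w[1] += lr * err * x2
            b += lr * err
    return w, b

def predict(w, b, point):
    """Classify a 2-D point with the trained weights and bias."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else 0

# Invented, linearly separable toy data: label 1 roughly when x1 + x2 > 1.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
labels = [0, 0, 0, 1, 1]
w, b = train_perceptron(data, labels)
```

Because the toy data is linearly separable, the perceptron converges to a separating line within a few epochs and then classifies the training points correctly.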

The field of AI has much to offer to researchers or people in industry when it comes to implementing more effective, accurate and efficient solutions to vast multi-domain problems, which are becoming more commonplace in this world of ever-blurring boundaries between the physical and digital.

When it comes to understanding where the DE is heading, along with the various digital environments that comprise it, there is as yet no solution that takes into account multiple factors and their cross-domain effects. A more accurate and realistic means of understanding what the future of the DE holds, and what opportunities or problems could arise, is clearly needed.

Section 1.2 takes a more in-depth look at literature concerning the problem domain that has been defined in this section.

1.2 Background literature

The DE is in a state of constant flux, with new entities being added and others being removed on a continual basis. Physical and logical networks grow and shrink as new links are formed and others are terminated, content is uploaded to and deleted from servers as it becomes more or less relevant, and mobile connections are made and terminated, as needed, by people around the world (Frömming et al., 2017).

The DE is a complex overlap between the physical and abstract digital worlds. For cyberspace to exist, we require physical hardware on which information is stored and transmitted. Networks are formed, consisting of large numbers of computers connected to each other and to other smaller networks. These can be closed-loop networks or they can be connected to the Internet. In the modern world, these networks are more dynamic in size than ever before due to the large-scale use of mobile devices.

Mobile devices are used more and more as a means to connect to and share information with networks. It is estimated that there are more than 4.5 billion mobile phone users worldwide and that mobile devices account for 51.2% of all global online traffic (Chaffey, 2018). This is a significant proportion of all network traffic and requires special consideration going forward in this thesis.

Research into the growth and change of networks is nothing new, with many variants of algorithms and models being developed over the years to understand and try to predict the evolution of such networks. This ranges from predicting the evolution of large-scale physical networks (Wu & Chen, 2016) to that of more abstract social networks (Bringmann et al., 2010).

More recently, Wu and Chen (2016) proposed a method for creating structure-dependent indexes to strengthen the prediction of link formation. This was done in an iterative manner, whereby the positions of network nodes were updated at each iteration to discover trends in network evolution, which could then be used for far more accurate predictions of future links in the networks.
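Link-prediction methods of this kind are usually evaluated against simpler scoring baselines. The sketch below shows the classic common-neighbours score, not Wu and Chen’s structure-dependent indexes; the example graph is invented for illustration:

```python
def common_neighbour_scores(adjacency):
    """Score each currently unlinked node pair by its number of shared neighbours.

    `adjacency` maps each node to the set of its neighbours. A higher score
    suggests a link is more likely to form between that pair.
    """
    nodes = sorted(adjacency)
    scores = {}
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if v not in adjacency[u]:  # consider only pairs without an existing link
                scores[(u, v)] = len(adjacency[u] & adjacency[v])
    return scores

# Invented toy network with links A-B, A-C, B-C, B-D and C-D.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
scores = common_neighbour_scores(graph)  # {("A", "D"): 2}: A and D share B and C
```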

Bringmann et al. (2010) made use of an altogether different method for predicting the evolution of social networks: using association rules and frequent-pattern mining to gain insight into evolving network data. Their proposed Graph Evolution Rule Miner (GERM) was designed to extract rules by searching for typical patterns in the structural changes of these networks and then apply the rules to predict future evolution.

Section 1.3 will consider the key objectives of this thesis, taking into consideration the background information discussed in sections 1.1 and 1.2.

1.3 Objectives

The modern world as we know it is reliant on information stored and transmitted in the DE, with businesses, schools and individuals all making use of it in their day-to-day activities (Frömming et al., 2017). The DE is, however, always changing, with additions to its networks and entities occurring almost constantly. In order to understand where the modern world will be in the future, it is important to understand how the behemoth that is the DE evolves over time and what impact this will have on those that participate in it.

Based on the literature covered in sections 1.1 and 1.2, it is apparent that a new model is required that is capable of predicting the evolution of digital environments in a more holistic manner. In order to create this model, the following objectives need to be achieved and presented in this thesis:

• This thesis sets out to define a model that is capable of taking into account the change and growth of multiple different entities within digital environments on both a micro and macro level, so as to understand the overall impact on the DE.
• The model will attempt to predict the evolution of selected non-natural entities in a manner that is realistic and not reliant on conventional biological evolutionary concepts.
• The model will make use of design patterns, algorithms and concepts from a number of fields in computer science to create accurate predictions.
• The prototype MAS will serve not only to predict the evolution of a digital environment and its entities but also to simulate it, so that the design and implementation of the model can be evaluated.

The objectives defined above were derived from the need to answer the primary research question, which is fundamental to the direction and execution of this study. The primary research question comprises a number of more granular secondary research questions. The following section discusses both the primary and secondary research questions of this thesis.

1.4 Research questions

In section 1.3 the objectives of this study, which serve to guide the purpose of this thesis, were outlined. The objectives were derived from the primary research question posed below, which drives the direction of this thesis, its content and the methodologies used.

Primary Research Question

RQ 1: Can an MAS, making use of machine learning and an entity component system design, effectively and accurately predict the evolution of a digital environment so as to provide assistance in understanding what the future of a digital landscape will look like?

Although entity component system design is included in the primary research question, the motivation behind this will only be addressed in Chapter 9 of this thesis.

Because the primary research question defined above deals with a number of separate fields, several secondary research questions are posed to help address problems pertaining to various key aspects of the primary question. These secondary research questions are listed below.

Secondary Research Questions

SRQ 1 What is the current understanding of what the Digital Environment is, and what does it encompass?

SRQ 2 Can an MAS be used to simulate the various components of a digital environment?

SRQ 3 Can an entity component system design be used effectively to represent a digital environment and its constituent entities?

SRQ 4 Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?

SRQ 5 Can ML, in particular extreme learning machines, be used to predict and drive changes in a digital environment to accurately reflect its evolution?

SRQ 6 How can existing AI algorithms be used to evolve entities in a digital environment in a manner analogous to how non-natural entities evolve?

To effectively answer the primary and secondary research questions, it is important that quality research is carried out, covering the relevant domains in a comprehensive manner. The methodology of how the research will be conducted must be clearly defined so that the thesis can be structured and executed effectively and so that it adds value to the field. Section 1.5 will discuss the research methodologies that this thesis will make use of.

1.5 Research methodology

When conducting research, it is important to start with a firm foundation on which to build: the methodology that the research follows (Creswell, 2014). This section will cover the research approach and methodology used throughout this thesis. By defining a clear methodology to follow, we ensure that a logical and structured approach is taken, making it easier to show the research process from start to end and to ensure that measurable results are produced (Bergman, 2008; Creswell, 2014; Winch et al., 2005).

When considering research methodologies, there are three broad categories: qualitative, quantitative and mixed methods (Bergman, 2008). Different fields of study favour different methods due to the nature of their research. Qualitative interview or questionnaire-style research is often used in the social sciences, whereas quantitative number-based research is more prevalent in scientific and mathematical research (Bergman, 2008; Creswell, 2014). It is often encouraged that research be conducted using a single methodology, as it deals with a single form of data collection and measurement. However, there are times when a mixed methods approach is of greater value (Bergman, 2008). Making use of a mixed methods approach allows for flexibility in how the research is conducted for various parts of the problem domain, creating a more complete understanding of the research problem than if a single methodology were used (Creswell, 2014).

In this thesis a mixed methods approach was followed. The following methodologies and approaches were used for different areas of the problem domain:

• Qualitative research into the various domains concerning the research problem, developing an understanding of the benefits, drawbacks and state of the art for each domain (Creswell, 2014; Olivier, 2009). This is presented in the form of a literature review and case studies.
• Design science research adapted for information systems. This is carried out in a process beginning with the identification of the research problem, then designing and implementing a solution and, finally, evaluating the resultant solution (Hevner et al., 2004; Peffers et al., 2007; Peffers et al., 2012). This is presented in the form of prototype development.

The research in this thesis was carried out in the following manner: a design science approach was used to identify and define a research problem that is relevant to the field of computer science (Hevner et al., 2004). The research problem was backed by a literature review and case studies to determine the current state of the problem, including related works, attempts at solving the problem and previous shortcomings (Peffers et al., 2007; Peffers et al., 2012; Olivier, 2009).

Next, in-depth qualitative research was conducted into the various sub-domains of the research problem, including digital environments, multi-agent systems, entity component system designs and machine learning. In doing so, an understanding of these domains was created, determining the benefits and drawbacks associated with each, what the current state of the art is, and what other work has been done in the problem space (Bergman, 2008).

A new model was then defined as a theoretical solution to the research problem, taking into account the findings from the qualitative research phase. This model was designed as a novel solution to the research problem, incorporating knowledge of previous works in the field and their results, along with alternative designs and approaches.

Lastly, a prototype of the theoretical model was implemented as part of the design science methodology so that measurement and evaluation of the proposed model could be carried out (Hevner et al., 2004; Peffers et al., 2007; Peffers et al., 2012). This prototype was used to run simulations to determine the effectiveness of the model, while also allowing for the adjustment and refinement of parameters in the system (Suarez et al., 2015). By doing so, a critical evaluation of the model could be carried out, whereby its strengths and shortcomings could be determined (Olivier, 2009).

The modelling and simulation of change in the DE are important for understanding the DE and its components, and are required tools in the research methodology due to the time-sensitive nature of the problem and the need to determine correctness (Abu-Taieh, 2019).

Section 1.6 outlines the structure of the remainder of this thesis.

1.6 Thesis structure

This thesis covers a number of fields and details important information regarding a solution to the problem stated in this chapter. The thesis consists of fifteen chapters and is divided into four main parts, structured as follows:

1. Part One: covers the problem background and introduces the problem domain.
2. Part Two: covers relevant background information required to create a model as a solution to the problem stated in Part One. It is presented as a literature review.
3. Part Three: covers the model as a proposed solution, defining it in the context of digital environments along with its components and its functioning.
4. Part Four: covers the implementation of a prototype for testing, along with the results obtained and a critical evaluation of the thesis.

1.6.1 Part One: Introduction

Chapter 1 introduces the thesis and provides a short assessment of the problem and the methodologies that will be used in conducting the research. The chapter provides a brief introduction to the concept of digital environments and presents the problem background in detail, highlighting the need to be able to predict future growth and change in the DE. Thereafter, the thesis objectives, methodologies used and the research questions that are used throughout the thesis are defined.

1.6.2 Part Two: Literature review

Chapter 2 provides detailed coverage of the DE, defining what it encompasses and how its various parts drive its evolution. The changes in the DE are categorised into social changes and hardware and software changes, which drive its evolution in different ways. It is ultimately shown that the DE is an amalgamation of the physical, digital and cyber worlds, making it a complex entity to understand and predict. This chapter lays a foundation for understanding the DE, which is important for the remainder of the thesis.

Chapter 3 covers the topic of intelligent agents. The different agent types and their applications are highlighted, discussing the impact of the environment they are deployed in and the need for multi-agent systems. MASs are discussed with regard to the impact on agent decision making and goal achievement. Finally, agent evolution is discussed in the context of genetic algorithms and how they can be used to grow diverse agent populations. This chapter introduces intelligent agents, which are important to the design of a multi-entity solution to the primary research question.

Chapter 4 is focused on the field of machine learning. The history and background of ML is discussed, detailing how it progressed to become the popular field of research it is today and how it is being used in business more frequently. Different types of ML approaches are discussed, including supervised learning, unsupervised learning and reinforcement learning. Deep learning is detailed with regard to its ability to improve learning of unknown domains and solve problems that previously required a large amount of human input. This chapter describes machine learning, which is a key component in creating a predictive model of the DE.

Chapter 4 also discusses extreme learning machines (ELMs), a specific neural network implementation that is popular for its fast learning rate. The difference between ELMs and more traditional Feed-forward Neural Networks (FFNNs) is highlighted, focussing on the layer/node architecture and the learning techniques employed. ELMs are shown to be effective NNs with great relevance to solving modern-world problems.
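As an illustrative aside, the mechanism that gives extreme learning machines their speed can be sketched in a few lines: the hidden-layer weights are random and fixed, and only the output weights are solved for in closed form. The following Python sketch uses a toy regression task and is not the thesis's implementation; all names and parameters are chosen for the example.

```python
import numpy as np

def elm_train(X, y, hidden=64, seed=0):
    """Train a minimal extreme learning machine (ELM).

    Hidden-layer weights are sampled randomly and never trained; only the
    output weights are solved for, in closed form, via the Moore-Penrose
    pseudoinverse. This closed-form step is what makes ELM training fast.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], hidden))   # random input weights
    b = rng.standard_normal(hidden)                 # random biases
    H = np.tanh(X @ W + b)                          # hidden activations
    beta = np.linalg.pinv(H) @ y                    # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy example: fit a sine curve.
X = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
y = np.sin(X).ravel()
W, b, beta = elm_train(X, y)
pred = elm_predict(X, W, b, beta)
```

Because no iterative gradient descent is involved, training reduces to a single pseudoinverse computation, which is the source of the fast learning rate highlighted above.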

Chapter 5 addresses programming paradigms. The difference between programming concepts, programming paradigms and programming languages is highlighted, along with the relationships between these three core parts. Important programming concepts are discussed, along with the impact of including them in a paradigm. Object-oriented programming and entity component systems (ECS) are briefly defined and compared as important programming paradigms, and the value of ECS approaches over the use of OOP in particular scenarios is highlighted. This chapter covers the concept of programming paradigms and shows that OOP is not always the best candidate for modern programs and that entity component systems are better suited for data-driven problem spaces.
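To make the contrast with OOP concrete, a minimal ECS can be sketched as follows. This is an illustrative Python sketch (names such as `movement_system` are invented for the example and are not taken from the DEEv-MoS model): entities are plain identifiers, components are pure data keyed by entity, and systems are functions that operate on whichever entities hold the relevant components.

```python
# Components are plain data stores keyed by entity id; there are no
# class hierarchies. An entity "has" a component simply by appearing
# in that component's store.
positions = {}   # entity id -> (x, y)
velocities = {}  # entity id -> (dx, dy)

def spawn(eid, pos, vel=None):
    """Create an entity with a position and, optionally, a velocity."""
    positions[eid] = pos
    if vel is not None:
        velocities[eid] = vel

def movement_system(dt):
    # A system acts only on entities that have BOTH a position and a
    # velocity component; entities lacking a velocity are untouched.
    for eid, (dx, dy) in velocities.items():
        x, y = positions[eid]
        positions[eid] = (x + dx * dt, y + dy * dt)

spawn(1, (0.0, 0.0), (1.0, 0.0))  # moving entity
spawn(2, (5.0, 5.0))              # static entity: no velocity component
movement_system(dt=2.0)
```

Behaviour follows from component composition rather than inheritance, which is one reason ECS approaches suit data-driven problem spaces.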

Chapter 6 takes a more detailed look at the problem background presented in Chapter 1. The problem is broken up into three main areas of influence: the growing world, advancing technology and evolving interactions. These three areas more clearly define the various moving parts involved in the DE and how they affect it as a whole, highlighting the importance of being able to understand its future evolution. This chapter provides further detail on the initial problem background presented in Chapter 1, identifying three areas of influence that must be considered in designing a predictive model for the DE.

1.6.3 Part Three: Model

Chapter 7 presents the DEEv-MoS model in overview. This is the result of combining the findings and information presented in the previous six chapters. The functioning of the model and its components are defined, briefly discussing the relationships and responsibilities of each component. The DEEv-MoS model is positioned with regard to the context of the DE and its many constituent parts.

Chapters 8 – 12 provide greater technical depth to the individual components of the DEEv-MoS model.

Chapter 8 takes an in-depth look at the DEEv-MoS digital environment component of the DEEv-MoS model. The difference between the DE and a digital environment is highlighted for the purpose of the DEEv-MoS model. The components of the DE that are modelled in the DEEv-MoS digital environment are defined, along with their properties and abilities, which are defined in terms of an ECS. Lastly, the DEEv-MoS digital environment is described in terms of task environments and a PEAS description.

Chapter 9 is an in-depth discussion of the DEEv-MoS entity component system component of the DEEv-MoS model. The DEEv-MoS model is broken down based on the three main parts of an ECS: entities, components and systems. Each entity in the DEEv-MoS digital environment is defined as one of the three parts, specifying their function and interactions with other entities.

Chapter 10 provides an in-depth discussion of the DEEv-MoS machine learning engine component of the DEEv-MoS model. The main factors of the DEEv-MoS model that need to be predicted in order for accurate evolution of the DE to occur are defined. The importance and role of each factor is described, along with the entities in the DEEv-MoS digital environment that they affect. For each factor, the planned ML model, inputs and outputs are described in order to achieve future predictions.

Chapter 11 takes an in-depth look at the DEEv-MoS constraints engine component of the DEEv-MoS model. The function of this component of the DEEv-MoS model is described, detailing its interaction with the other components of the DEEv-MoS model and how constraints are made use of. The three main types of constraints, namely digital environment constraints, entity component system constraints, and events engine constraints, are discussed, and the individual constraints that are made use of in the DEEv-MoS model for each type are defined.

Chapter 12 discusses, in depth, the DEEv-MoS events engine component of the DEEv-MoS model. Firstly, the interaction between the events engine and the constraints engine components of the DEEv-MoS model is detailed further. Thereafter, the three main event types in the DEEv-MoS model are described: population events, technological events and social events. The different events created by the events engine are described, including their intended purpose and how they affect the DE and other components of the model.

1.6.4 Part Four: Implementation, critical evaluation and conclusion

Chapter 13 details the implementation of the DEEP prototype. Considerations regarding the development platform are presented, and the Unity game development platform was chosen for the implementation of the DEEv-MoS prototype. The implementation of each component of the DEEv-MoS model is described in detail, highlighting features from the chosen development platform that complemented each one. The various entities of the DEEv-MoS digital environment are presented in the context of the implementation, using Unity and the chosen ECS programming approach.

Chapter 14 presents the results of testing the DEEP prototype system, defining the metrics used and how they are interpreted.

The conclusion of this thesis is presented in Chapter 15, where a summary of the thesis is provided, along with a critical evaluation of the research conducted and possible future work. Finally, a conclusion is presented, summarising the problem statement, research questions and the results obtained.

1.7 Conclusion

Chapter 1 has served as an introduction to the research problem being addressed in this thesis, focusing on the sub-domains involved and the research methodology that will be followed (and why), ultimately providing motivation for the creation of a new model for predicting the evolution of the rapidly evolving, ever-changing DE.

Firstly, the problem background was introduced, followed by the resultant objectives and primary and secondary research questions that this thesis will address. Some brief case studies and a literature review were presented to motivate the need for a new model, after which the thesis’ research methodology was discussed and defined. Lastly, an overview of the structure of the remainder of the thesis was presented.

Chapter 2 will discuss the problem domain of the DE, specifying what the DE encompasses, how it is currently changing and at what pace, and lastly what impact the DE’s evolution has on the modern world.


2 The Digital Environment

In Chapter 1, an introduction to the thesis was presented, covering the introduction to the problem, some high-level background information, the research objectives and research questions, the research methodology and the structure of the thesis.

Chapter 2 comprises part of the background literature study and will go on to delve deeper into the topic of the DE: what it is, how it is changing and the implications for the world we live in.

2.1 The Digital Environment

For most people in today’s modern world, the concept of the DE is vague and loosely defined. They hear terms such as ‘digital’, ‘cyber’, ‘information’, ‘networks’ and many others on a daily basis in relation to multiple spheres of their everyday lives, including work, social, legal, political and entertainment.

In a similar vein to digital environments is the concept of digital systems, which can be defined as systems designed to process, store and transmit information in a digital format. This ultimately encapsulates various forms of computational machines which are employed for a number of purposes, such as control systems, consumer products and communication systems where information in binary form is manipulated (Rafiquzzaman, 2014).

Digital systems can also be composed of other digital systems, where each subsystem performs a specific information manipulation task which is then used by another subsystem to perform its task (Donzellini, 2019).

With the various buzzwords associated with the DE being interchanged and regularly used out of context, it is understandable that there is confusion around its actual definition. What makes things even more complicated is that the definition of, and the lines between, what is and is not part of the DE are constantly shifting as the world changes and new technological discoveries and inventions are made (Skoudis, 2009).

There are two definitions of DEs according to Khosrow-Pour (2017):

1. “A virtual or cyber-generated environment accessed or created through the use of one or more digital devices such as a computer, tablet, or a cellular phone.”
2. “All information environments that are mediated via the World Wide Web or similar mobile devices. In particular those environments that facilitate the discovery and search of information, people, and resources.”

The above definitions allude to digital systems that store information and can be searched and interacted with via physical devices such as computers, phones and networks. This indicates that the DE consists of more than just digital components and information, but also physical hardware and the systems in between that allow for their interaction (Council of Europe, 2016). Compared to digital systems, digital environments are not concerned only with the manipulation of digital information, but also the physical storage, transmission, and dissemination thereof; digital systems form a part of digital environments (Donzellini, 2019).

A more complete definition of the Digital Environment would be to consider it from a cyber-physical systems (CPS) point of view. The DE in the real world can be considered a massive CPS, as it is made up of communication, processing and computing technologies which are brought together through physical artifact and engineered systems knowledge (Boursinos & Koutsoukos, 2020). CPS is a multidisciplinary field, combining many different disciplines in various ways to create novel solutions to next-generation problems that focus on, among others, symbiosis, automation and smart system behaviour.

The taxonomy of CPS is ever evolving, with new technologies emerging on a continual basis. At a higher level, however, CPS has two main classes of technology: physical technologies, which cover physical hardware, materials and their use, and cyber technologies, which cover software, communication and networking. Horvath & Gerritsen (2012) have suggested a third class, synergic technologies, which covers technologies that couple physical and cyber technologies, further blurring the lines between the physical and cyber worlds.

Even though this more holistic view is accurate for the real-world DE, it is large and complex, and for this reason this thesis will consider the DE from a viewpoint where the focus is on data communication systems and their components.

Digital environments also come in many shapes and sizes, with the DE encompassing them all. Smaller digital environments can be local to a physical location such as a university or business, or even at the scope of a city. In any of these digital environments there are different interaction and behaviour dynamics, which can have a different effect on the next-level-up digital environment than in their local digital environment (Chayko, 2017).

Within each digital environment and the DE, evolution and growth can be observed in smaller local areas, driven by direct interactions and behaviours, which can also propagate up to the entire digital environment. A simple example of this would be the growth of network infrastructure in a small city, allowing more of its residents to be connected with one another. The effect of this also propagates to the country level as more cities become interconnected, with people across a larger geographic area able to communicate and interact with one another (Frömming et al., 2017).

Besides the evolution of digital environments in terms of infrastructure and size, as more people and larger volumes of information become connected, growth in interaction, collaboration and collective knowledge occurs. Information, knowledge and ideas can propagate, morph and grow to become greater than the sum of the initial inputs. Further manifestations of evolution in the DE can stretch to social and economic change due to greater connectivity and access to information and services (Chayko, 2017).

Although the holistic CPS view of the DE is more accurate, Khosrow-Pour’s (2017) definition of the DE is more suitable for the scope and purposes of this thesis.

Changes in multiple areas of technological and social development have caused the blurring of the lines that divide what is part of the DE and what is not. These changes include developments in hardware and software, as well as social developments. The next sections will take a more focused look at the factors that have created the DE as we know it today.

2.2 Hardware and software changes

In the early days of the internet and computer networks as we know them today, there was, in general, a distinction between the design of hardware and software. Hardware was designed by engineers, focusing on the electronics and manufacturing of chips and transistors to give computer hardware greater resources that could be used for computation. Software engineers then designed software as best they could to make the most use of the resources available from the hardware (Teich, 2012; Jain et al., 2018).

However, as time went by, a level of co-design (designing the hardware and software in conjunction so as to achieve greater synergy and output), starting at the microprocessor level, became more prevalent (Haubelt et al., 2002). Due to the complexity of modern computer systems and their constituent electronics, co-design is now an integral part of modern system implementations, allowing for faster delivery, more optimised performance and fewer flaws caused by misalignment between hardware and software design (Lima et al., 2015).

The parallel design and development of hardware and software led to the creation of the vast interconnected networks and computer systems of today. To a large extent, this is due to the following advancements: computer and network power, virtualisation of operating systems, advanced local network and internet-wide searching, proliferation of both broadband and wireless connectivity, and the convergence of the above technologies, among others (Skoudis, 2009; Kuhrmann et al., 2019).

Each of the above will be described in more detail below.

2.2.1 Computer and network power

Since the development of the first computers and integrated circuits, scientists and engineers have been pushing to increase the number of micro-components that can be placed on a single board. In 1965, Gordon Moore predicted that the number of micro-components on a chip would double each year (this was later revised to every two years), thus creating faster, more powerful computers (Loeffler, 2018; Moore, 1965). The so-called ‘Moore’s Law’ has held true for decades, setting a goal for the industry that has led to improvements in materials, processes and manufacturing (Teich, 2012; Guo et al., 2017).

Along with the improved development of micro-chips came the ability to create and run larger and larger computer networks, allowing for the communication and transfer of information between those on the networks. Robert Metcalfe, one of the inventors of Ethernet technology, hypothesised that the value of any given network was proportional to the square of the number of users on the network, and that the network’s utility would therefore grow rapidly as more users joined. Ultimately, he saw the value in having more and more users connected together to share information (Briscoe, Odlyzko & Tilly, 2006; Patnam & El Taeib, 2016).

Together, the realisation of both Moore’s and Metcalfe’s laws has created a world in which powerful micro-chips drive computers and network hardware, which in turn support extremely large world-wide networks with a growing number of users joining and accessing them to share information. This has played a pivotal role in the growth of the digital environment by allowing more and more people to be connected with each other and to have access to information resources.
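The two laws can be expressed compactly. The following Python sketch is purely illustrative (the doubling period, starting transistor count and user count are example values, not empirical data):

```python
# Illustrative only: Moore's law as a doubling every two years, and
# Metcalfe's law valuing a network by its number of possible pairwise links.
def moore_transistors(base, years, doubling_period=2):
    """Transistor count after `years`, doubling every `doubling_period` years."""
    return base * 2 ** (years / doubling_period)

def metcalfe_value(users):
    """Network value proportional to possible pairwise connections, ~ n^2."""
    return users * (users - 1) / 2

# Starting from 2,300 transistors (the Intel 4004), ten years of doubling:
decade_later = moore_transistors(2300, years=10)
# A network of 1,000 users has roughly half a million possible pairwise links:
network_value = metcalfe_value(1000)
```

The contrast in growth shapes is the point: component counts grow geometrically in time, while network value grows quadratically in the number of users.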

2.2.2 Virtualisation of operating systems

Over time, with the growth of the DE and information sharing networks, a need arose for decreased downtime and risk to the servers that comprised the networks. With a higher demand for information and communication with others around the world, unavailability due to hardware or software failures became a problem. The development of virtual machines (VMs), which allow for multiple isolated instances of operating systems to be run on a single computer, helped to curb the issues around server maintenance, failures and security (Chen, Du, Chen & Wang, 2018).

VMs allow for multiple servers to be run on a single consolidated hardware platform, where hardware failures can be overcome by deploying an image of the affected server to new hardware almost instantaneously without needing manual configuration or setup. This simplified maintenance and security, as affected VMs could be sand-boxed and restored to a known stable state if a failure occurred or malicious code was detected (Majeed, 2017).

Ultimately, virtualisation has provided the ability for robust always-up networks and servers to exist, on which information can be shared and people can be connected, creating a digital environment with less fluctuation and instability.

2.2.3 Advanced local and internet-wide searching

The huge growth of computer networks, their supporting hardware and the volume of information stored created its own problem: how to effectively find information. With larger and larger storage media created more cheaply due to the increase in micro-component density, computers and networks have the ability to store terabytes of information, which makes it extremely difficult to find specific information (Liu et al., 2017).

To solve this, improved software algorithms and techniques were developed and gave rise to search engines such as Google (the most well-known and widely used search engine), since the plethora of information available on the internet had little value if users were unable to find what they were looking for. Google’s PageRank algorithm is a well-known method used to determine the importance and relevance of website pages when searching (Page, 1997). The algorithms and techniques used by search engines have also been applied to smaller internal networks and to individual desktop computers (Di Sciascio, Sabol & Veas, 2017).
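As a sketch of the idea behind PageRank, the following Python snippet runs a simple power iteration over a made-up four-page link graph. The graph, damping factor and iteration count are illustrative; this is not Google's production algorithm.

```python
# A minimal power-iteration sketch of PageRank. The link graph below is a
# made-up four-page example: each key links to the pages in its list.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Each page keeps a small baseline rank and shares the rest
        # of its current rank equally among the pages it links to.
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new[target] += share
        rank = new
    return rank

ranks = pagerank(links)
```

Page C, which is linked to by three of the four pages, accumulates the highest rank, illustrating how link structure alone is turned into a relevance score.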

The development of advanced search capabilities allowed for greater usability and value of networks and, ultimately, the DE.

2.2.4 Broadband and wireless connectivity proliferation

As the size of networks and information stores that make up the DE grew larger and different types of media were stored and shared, a need for greater internet speeds became apparent. Slow internet speeds were adequate when only text and low-quality images were being served; however, as technology improved and created high-quality audio and video, greater internet speed was required to interact with these items in a reasonable time.

Fibre optic technology allowed for larger, higher-speed backbones to be created, ultimately filtering down to broadband connections on business and personal fronts. Interconnectivity through satellite and microwave towers allowed disjoint networks to interact at high speed, and the introduction of widespread mobile wireless connectivity for cellphones and laptops increased the distances at which networks could be joined or interacted with, as no physical infrastructure was needed to reach the end connection terminal (Hintze et al., 2017).

Further use of Wi-Fi and other wireless connectivity technologies has also opened the path for more rural and underdeveloped areas to gain access to information, mainly through cellphone use (forecasts put the number of mobile internet users at 4.68 billion in 2019). This has increased the ability of people, on a mass scale, to access networks and valuable information (Kumar, Kim & Helmy, 2013).

2.2.5 Convergence of technologies

The ever-increasing development and use of micro-chips has led to a situation where the distinction between devices or platforms for telecommunication, computation, television and radio has become difficult to make. The rise of broadband and wireless connectivity has also allowed for the provision of these services over the Internet, and devices such as laptops, cellphones and even household appliances such as fridges have these functionalities integrated into them (Lee, 2018). This convergence of technology has created an environment for heterogeneous devices and communication, growing the DE greatly, with mixed media and platforms for interaction (Gordon, 2003; Lee & Zo, 2016).

This section discussed the hardware and software changes that have contributed to the development of the DE as it is today. However, these are not the only changes that were factors in the DE’s evolution. Social changes throughout the world, coupled with technology, have also played an important role. The next section takes a closer look at the social changes that impacted the DE.

2.3 Social changes

Even though technological factors have a major influence on the evolution of the DE, there are also social factors that have had a large impact. This section will discuss social changes that contributed to the current DE.

As technology has developed and improved, and communication and sharing have been made easier, the world around us has become a smaller place. People can share information and views with others from all around the world with varying cultures, beliefs and circumstances. This globalisation is, in part, due to the extent to which technology has allowed the DE to evolve. However, there were changes that occurred concurrently in different countries on technological, social and economic fronts, which have created cross-border online communities and information sharing. These factors, which will be discussed below, did not only have an influence on the shaping of the DE, but were also a result thereof.

2.3.1 Technology development in different locales

Traditionally, the United States was the hub for computer technology and development. However, over time, as networks grew and knowledge was shared, the industry was internationalised. Experts developed in various fields, and with economic and social factors at play, Asian countries such as China and Japan became deeply involved in the growth of the digital landscape, the former mostly in a hardware and manufacturing role and the latter in a software and innovation role (Skoudis, 2009; Saganowski, 2015).

Currently, specialists exist all around the world, and companies that focus on various parts of the technological production chain are dispersed geographically. The result is a richer technological culture globally, where the DE crosses borders and cultures.

2.3.2 Expanding online communities

With greater interconnectivity world-wide, people from different backgrounds and with different beliefs can communicate and share information. In this process, online communities form where people share information regarding their lives (as on social media sites), their opinions and thoughts (on blogs or forums) and their commercial likes and dislikes (on e-commerce sites such as Amazon) (Bruckman et al., 1999; Jain, Mohan & Sinha, 2017).

Now, more than ever, people can create small or large communities around common interests or beliefs with those from the opposite side of the globe, thus creating larger networks and information stores. The digital environment then becomes a place not only of communication and information but of community and belonging.

In modern cities around the world, the previously mentioned technological and social changes have created a richer and more meaningful DE and have led to the integration of technology and networks into everyday life. The next section looks at the concept of digital cities and smart cities as a result of the DE.

2.4 Digital and smart cities

The integration of technology and information into everyday life and into the infrastructure of the cities we live in has caused large changes in behaviour and in how people go about their daily lives.

Conceptually, these changes brought about the digital city and the smart city, which, at face value, may appear to be similar. However, there are distinct differences between the two, with one being the precursor to the other. As networks and interconnectivity grew, and the ability to make information available to a wide audience through them became apparent, cities and their departments realised that they could make use of this information sharing and gathering mechanism to improve how they interact with their residents (Linturi, Koivunen & Sulkanen, 2000; Aráuz, 2018).

This gave birth to the digital city, one where community networks were established, allowing citizens to gain access to information and interact with one another. Information and services provided by the digital city can include references to important events; maps and directories for businesses and stores; communication interfaces for text, voice and video communication; and can even take in data from sensors around the city for weather or other situational events (Ishida, 2017).

The next step in the evolution of the digital city was to make it more than an information hub by allowing for interaction and smart behaviour, giving rise to the smart city. In smart cities, transportation, homes and infrastructure are connected to the internet, and citizens are not only able to see information about these entities but also to interact with them. With the higher prevalence of IoT (Internet of Things) devices, sensors and actuators can be used to take in information and perform actions as needed. Citizens can book transportation, buy products, turn appliances in their homes on and off, adjust electricity consumption and much more via the internet. The smart city steps outside of the confines of being purely digital to include the physical (Ishida, 2017).

It appears that the development of smart cities may not necessarily be a fully planned and orchestrated process, with much of the change occurring organically as technological and social factors advance and become more widely available.

With the digital and physical world coming together in smart cities due to the technological and social changes that have occurred, the DE has bloomed to become an indispensable place for finding information, communication with others and understanding both the physical and digital world around us. The modern DE has transcended the traditional boundaries of what is physical and what is digital, becoming ubiquitous in almost all aspects of everyday life.

2.5 Conclusion

Chapter 2 discussed the DE, what it encompasses and how changes in technology and social interaction throughout the world have shaped it. In doing so, a clearer understanding of digital environments, and by extension, of the DE was developed, which will serve to guide the research around building an effective model in this thesis.

Digital environments have been illustrated to evolve over time through network growth, increased data creation and exchange, and a growing number of interactions between the parties involved in the environment, regardless of physical location.

The information presented in this chapter is of great relevance to the following secondary research question:

SRQ1: What is the current understanding of what the Digital Environment is, and what does it encompass?

The chapter has clearly defined the various components of digital environments and, in particular, those of the DE, discussing how it began, what it has grown to be, and how changes in the modern world have caused it to evolve. Chapter 2 addresses SRQ1 by laying a foundation regarding the definition of the digital environment for the remainder of this thesis. Chapter 8 will provide further detail and a modern representation of the DE, which will serve as the basis for the thesis’ proposed model.

Now that an understanding of the DE has been created, the next of the literature review chapters, Chapter 3, will be focused on the topic of intelligent agents, multi-agent systems and how they can evolve. Chapter 3 will serve as a basis for the technical foundation of building a model designed to predict and simulate events, exploring the underlying fields that will be used to implement such a model and the eventual prototype.


3 Intelligent agents

Chapter 2 concentrated on the DE, providing an understanding of all it encompasses, how both social and technological changes have shaped it thus far, and what could possibly occur in the future. A precise definition for the purposes of this thesis was derived and is the basis for the evolutionary model that further chapters will define.

Chapter 3 will serve as a literature review of an important background topic: intelligent agents. In this chapter, the various types of intelligent agents will be discussed, including their uses. Thereafter, the concept of multi-agent systems will be covered to develop an understanding of how multiple intelligent agents can operate in the same environment. Finally, the topic of evolution with regard to intelligent agents will be addressed, taking a look at how this can be used in the DEEv-MoS model defined later in this thesis, in Chapter 8.

3.1 Intelligent agents, their types and applications

‘Agent’ (which comes from the Latin agere, meaning to do), in the general sense, is a term used to refer to something that acts. In the computer world, this is viewed in a more specialised sense, where an agent is not software or a program that simply does something; instead, it is a program that acts in an autonomous manner while adhering to some form of rules and goals.

Wooldridge (2009) provided a more applicable definition for these agents that are expected to act with some intelligence based on what they perceive in their given environment: an agent is anything that can perceive and act upon its environment using sensors and actuators respectively. To build on this, a rational agent is an agent that, based on its percepts (any perceptual input, with a ‘percept sequence’ being the history of all the agent’s percepts), uses its actuators to act correctly (attempting to achieve the best outcome or best expected outcome) in the given environment, based on its goals or performance measures (Russell & Norvig, 2010).

Performance measures are used to quantify how well an agent has done in selecting an action based on its percepts, percept sequence and other knowledge of an environment. In acting rationally, an agent is maximising its expected success of that action in achieving its goals and meeting its performance measures, as it cannot always know the definite outcome of an action in any given state (Beckert, 2004; Alvarado et al., 2017).

Performance measures are important in assisting an agent to act in the best way possible given the information it has available, as, in reality, an agent will seldom, if ever, have full knowledge of the outcomes of its actions in all states. Therefore, it is best for it to act rationally and thereby maximise its expected performance, rather than rely on omniscience (knowing the actual outcome for each action) (Russell & Norvig, 2010).

Before further agent designs and types can be discussed, it is important to first consider the task environments in which the intelligent agents will exist, as this determines what percepts, actions and knowledge the agents have and, ultimately, how performance measures can be defined.

3.1.1 The task environment

To describe the task environment of any given agent, the PEAS description is used, where each letter of the acronym represents a different area that needs to be defined (Russell & Norvig, 2010; Weiss, 2013). The four areas of the PEAS description are as follows:

 Performance (measure): what is the performance measure to which the agent must aspire?
 Environment: what is the environment in which the agent will exist? What will it face in this environment?
 Actuators: what actuators does the agent possess to act on the environment? How can it make use of its actuators?
 Sensors: what sensors does the agent possess to perceive the environment? What metrics or information can the sensors obtain from the environment?
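To make the four areas concrete, the well-known automated-taxi example from Russell and Norvig (2010) can be captured as a simple data structure. The Python representation below is purely an illustrative sketch; the field names and example values follow the textbook exercise rather than any implementation in this thesis.

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """A PEAS description of a task environment."""
    performance: list  # performance measures the agent must aspire to
    environment: list  # what the agent will face in its environment
    actuators: list    # means of acting on the environment
    sensors: list      # means of perceiving the environment

# Illustrative PEAS description for an automated taxi driver,
# following the well-known textbook example (Russell & Norvig, 2010).
taxi = PEAS(
    performance=["safe", "fast", "legal", "comfortable trip"],
    environment=["roads", "other traffic", "pedestrians", "customers"],
    actuators=["steering", "accelerator", "brake", "signal", "horn"],
    sensors=["cameras", "sonar", "speedometer", "GPS", "odometer"],
)
```

Writing the description down in this structured form forces each of the four areas to be considered explicitly before any agent program is chosen.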

The next step before moving on to agent design is determining the properties of the task environment. The PEAS description provides useful information in determining what will be at the disposal of the agent, how it will operate and what its goal is. The properties of the task environment provide a deeper understanding of the task environment so that it can be categorised, making it easier to choose an appropriate agent design (Russell & Norvig, 2010; Wooldridge, 2009). The dimensions that are evaluated as part of the task environment properties are as follows:

 Observable: do the agent’s sensors give it access to the full state of the environment at all points in time? If so, the environment is fully observable and the agent does not need to keep track of the environment state internally. Otherwise, if the environment state is not accessible to the agent in its entirety, the environment is partially observable. An environment is unobservable if the agent has no sensors at all.
 Agents: how many different agent types exist in the environment, and what is the population of each? If it is only the agent being designed, then it is a single-agent environment. If other agents exist (agents acting to maximise their own performance measures, the values of which are also affected by one another), then the environment is classed as a multi-agent environment. Multi-agent environments or systems can be competitive, cooperative or partially one or the other. This will be covered in greater depth in section 3.2 of this chapter.
 Deterministic: how do the agent’s actions affect the environment? If the action executed by the agent and the current state of the environment completely determine the next state, then the environment is deemed deterministic. If there is uncertainty around outcomes, then, depending on whether or not there are probabilities attached to the outcomes, the environment is stochastic or non-deterministic, respectively.
 Episodic: do the actions of an agent in an episode affect all further states? If the next episode does not depend on the actions taken in previous episodes, then the environment is episodic. However, if all previous actions affect the environment state, then the environment is sequential.
 Static: does the environment change while the agent processes its percepts to determine an action to take? If so, the environment is dynamic and continually changes; in this situation, the agent deliberating over what to do counts as inaction until it acts. In static environments, the environment waits for the agent to deliberate and act before changing. Semi-dynamic environments are environments in which the environment does not change while waiting for the agent; however, the agent’s performance score can change.
 Discrete: how is time handled in the environment with regard to state and the agent’s percepts and actions? Discrete environments have well-defined finite states, whereas continuous environments do not; they move smoothly through a range of states or values over time.
 Known: this dimension does not really apply to the properties of the task environment but, rather, to the agent’s knowledge of the environment. In a known environment, the outcomes for all actions are known by the agent, whereas in an unknown environment, they are not, and the agent will need to learn them as it goes so as to make good, rational decisions.
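As a hedged illustration of how these dimensions classify concrete environments, the two examples below follow the standard textbook analysis of chess with a clock and taxi driving (Russell & Norvig, 2010); the Python encoding itself is only a sketch, not part of any implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TaskEnvironmentProperties:
    observable: str      # "fully" | "partially" | "un"
    agents: str          # "single" | "multi"
    deterministic: bool  # False covers stochastic/non-deterministic
    episodic: bool       # False means sequential
    static: str          # "static" | "dynamic" | "semi-dynamic"
    discrete: bool       # False means continuous

# Classifications following the textbook analysis (Russell & Norvig, 2010):
chess_with_clock = TaskEnvironmentProperties(
    observable="fully", agents="multi", deterministic=True,
    episodic=False, static="semi-dynamic", discrete=True)

taxi_driving = TaskEnvironmentProperties(
    observable="partially", agents="multi", deterministic=False,
    episodic=False, static="dynamic", discrete=False)
```

Recording the classification in one place makes it easy to compare environments and, as discussed next, to choose an appropriate agent program for each.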

After defining and evaluating the above dimensions regarding the task environment for an agent, a clearer understanding of how the environment in which the agent will exist functions is created. This directly affects the next step of the intelligent agent design process, as an understanding of what information is available and what constraints the agent has is formed, allowing for an informed decision to be made on the type of agent program to make use of.

The next subsection will focus specifically on the different agent program types and their structure, strengths and ideal applications.

3.1.2 Types of agent programs

When considering an intelligent agent’s design on a higher level, it is made up of two key components:

 Architecture: the physical hardware that the agent will run on, including the sensors and actuators, as well as the computing device. These make it possible for the agent to function in a practical manner, moving beyond the purely theoretical design of the agent (Wooldridge, 2009; Albrecht & Stone, 2018).
 Program: the software that will run on the architecture to ultimately implement the mapping of percepts to actions (the agent function) (Wooldridge, 2009).

This section focuses on the agent program portion of the intelligent agent’s design. It defines four basic program types, which cover the majority of agent program designs. It also discusses how they can be altered to become learning agents capable of choosing better actions through a process of improving the components of their agent programs.

The four types of agent programs that will be covered in more detail below are (Russell & Norvig, 2010; Alvarado et al., 2017):

 simple reflex agents;
 model-based reflex agents;
 goal-based agents; and
 utility-based agents.

The general assumption that will be used is that the same skeleton is used for each of the agent program types, whereby input from the sensor is used as the current percept and an action is returned to be performed by the actuators.

The first type of agent program that will be considered is the simple reflex agent, which is the most basic, and, true to its name, the simplest of the agent program types.

3.1.2.1 Simple reflex agent

The most basic of the agent program types is the simple reflex agent. This agent program only considers the current percept at any point in time when deciding on an action to perform. It does not consider the percept history or any knowledge of the environment and makes use of a very simple mapping of percept to action, called a condition-action rule (Russell & Norvig, 2010; Albrecht & Stone, 2018).

The condition-action rules that govern the behaviour of this type of agent program make use of simple logical structures, such as nested if statements or switch statements. This ultimately limits the intelligence of these agents, as without a fully observable environment, they struggle to perform well and are of little use. Figure 3.1 below shows the structure of a simple reflex agent.

Figure 3.1 Simple reflex agent program structure schematic (Russell & Norvig, 2010).
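The condition-action mapping can be sketched in a few lines of code. The example below uses the classic two-square vacuum world from Russell and Norvig (2010) as an illustrative setting; the function name and action strings are assumptions made for the sketch.

```python
# Illustrative simple reflex agent for the two-location vacuum world.
# Only the current percept is consulted; there is no state or history.

def simple_reflex_vacuum_agent(percept):
    """Map the current percept directly to an action via condition-action rules.

    percept is a (location, status) pair, e.g. ("A", "Dirty").
    """
    location, status = percept
    if status == "Dirty":      # rule 1: if the current square is dirty, clean it
        return "Suck"
    elif location == "A":      # rule 2: otherwise move to the other square
        return "Right"
    else:
        return "Left"

print(simple_reflex_vacuum_agent(("A", "Dirty")))  # -> Suck
```

Note how the nested if statement is the entire agent: with no record of what it has already seen, the agent will shuttle between the two squares forever even once both are clean, which is exactly the limitation discussed above.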

The section above makes it clear that, for intelligent agents, simplicity keeps the mechanics straightforward but limits their usefulness. The next section considers a more advanced agent program design, which keeps track of its environment.

3.1.2.2 Model-based reflex agent

The previous section made it clear that only considering the current percept and not having any knowledge or understanding of the agent’s environment leads to a simple agent with limited use, especially if the environment is not fully observable. It is intuitive then, in the design of a smarter agent, to consider maintaining some form of internal state dependent on the agent’s percept history and not only the current percept. In doing so, the agent can keep track of some of the current state’s unobservable aspects (Russell & Norvig, 2010; Garro et al., 2018).

For the agent to maintain an accurate and useful internal state, it requires knowledge of two important aspects of the environment:

 the evolution of the environment, independent of the agent and its actions; and
 the effect that the agent’s actions have on the environment.

Together, knowledge of the above two aspects forms a model of the environment, which helps the agent to make better, more intelligent decisions based on its percepts. This type of agent program is, therefore, aptly named a model-based reflex agent, as it makes use of the internal model it maintains to make better decisions. However, it still acts in a very reflex-based manner, with simple condition-action rules driving its actions (Russell & Norvig, 2010). It ultimately has a better understanding of the environment and the current state but still makes use of simple percept-to-action mapping. Figure 3.2 illustrates the model-based reflex agent below.

Figure 3.2 The model-based reflex agent program structure schematic (Russell & Norvig, 2010).
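Extending the vacuum-world sketch from the previous section illustrates the difference an internal state makes. The class below is a minimal illustrative sketch, not a prescribed design: the model here simply remembers the last known status of each square.

```python
class ModelBasedReflexVacuumAgent:
    """Sketch of a model-based reflex agent in the two-square vacuum world.
    The internal state tracks both squares, so the agent can decide to stop
    (NoOp) once its model says everything is clean."""

    def __init__(self):
        # Internal model of the environment state, built from the percept history.
        self.model = {"A": "Unknown", "B": "Unknown"}

    def agent_program(self, percept):
        location, status = percept
        self.model[location] = status      # update the model from the current percept
        other = "B" if location == "A" else "A"
        # Condition-action rules, now consulting the internal model:
        if status == "Dirty":
            return "Suck"
        if self.model[other] != "Clean":   # the other square may still be dirty
            return "Right" if location == "A" else "Left"
        return "NoOp"                      # model says both squares are clean
```

Feeding the agent the percept sequence ("A", "Dirty"), ("A", "Clean"), ("B", "Clean") yields Suck, Right and then NoOp; the final NoOp is something a simple reflex agent could never choose, since it keeps no record of the other square.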

The model-based reflex agent program design improved upon that of the simple reflex agent. However, it is still of limited intelligence and use due to the lack of a more advanced decision-making process based on its understanding of its environment. The next section considers an approach where the condition-action rule is replaced with another method for deciding on the best action to take.

3.1.2.3 Goal-based agent

Taking into consideration the previous two agent program designs, it is apparent that having a better understanding of the environment is not sufficient for an agent to act intelligently. A more sophisticated method by which the agent makes decisions, i.e. a more sophisticated method for mapping percepts to actions, is needed.

Most intelligent agents are created with achieving some type of goal in mind, whether it be to reach a specific destination or to acquire some item. Making decisions based on achieving this goal makes more sense than taking an action purely due to the percepts and knowledge that the agent has (Russell & Norvig, 2010; Garro et al., 2018). Goal-based agents still maintain an internal model of the environment to understand the current state as best as possible. However, instead of making use of a simple condition-action rule set, they possess goal information, which describes desirable situations for the agent to be in.

When the agent makes use of goal information to drive its decision making, it involves another aspect of the environment model not previously used: consideration of the future.

The consideration of the future is generally expressed in two ways:

 What will achieve the agent’s goal?
 What will happen if the agent performs various actions?

By considering these aspects, the agent can make decisions that bring it closer to its goal and that it understands the effect of (to some extent). The goal-based agent is represented in Figure 3.3.

Figure 3.3 Goal-based agent program structure schematic (Russell & Norvig, 2010).
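The two questions above amount to searching the agent's model of 'what my actions do' for a sequence of actions that reaches the goal. The sketch below illustrates this with a breadth-first search over a tiny, made-up corridor world; the function names and the toy environment are assumptions for illustration only.

```python
from collections import deque

def goal_based_action(state, goal, transition_model):
    """Illustrative goal-based agent core: search the model of the agent's
    actions for a sequence reaching the goal, and return its first action.

    transition_model(state) -> {action: next_state}.
    """
    frontier = deque([(state, [])])   # (state, plan to reach it)
    visited = {state}
    while frontier:
        current, plan = frontier.popleft()
        if current == goal:
            return plan[0] if plan else None   # None: already at the goal
        for action, nxt in transition_model(current).items():
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None  # goal unreachable from this state

# Tiny illustrative corridor world: locations 0..3, goal at location 3.
def corridor(loc):
    moves = {}
    if loc < 3:
        moves["Right"] = loc + 1
    if loc > 0:
        moves["Left"] = loc - 1
    return moves

print(goal_based_action(0, 3, corridor))  # -> Right
```

The key difference from the reflex designs is visible in the structure: the action is chosen by projecting future states with the model, not by matching the current percept against a fixed rule.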

In this section, a more intelligent and flexible agent program design was discussed, which makes use of goals to drive the agent’s behaviour. However, relying on goal information alone is not ideal, as it may lead to inefficient behaviour of the agent. The next section considers how goal information can be improved upon in order to achieve more intelligent behaviour.

3.1.2.4 Utility-based agent

Making use of goals to drive intelligent behaviour in agents was a vast improvement over the reflex-based agents. However, achieving goals is not always as simple as a binary distinction of ‘achieved’/‘not achieved’, as there are many more degrees of comparison. This is where the concept of utility comes in, where utility represents the degree of ‘happiness’ or ‘achievement’ (Garro et al., 2018).

Utility-based agents, once again, still maintain an internal model of their environment but make use of a utility function instead of using only goals to decide on what action to perform. The agent’s utility function is an internalisation of the performance measure, allowing it to score various sequences of environment states and determine the sequence that is most desirable (has the best utility) (Russell & Norvig, 2010; Albrecht & Stone, 2018). This allows for the agent to act in a rational manner and in such a way as to maximise the expected utility of various action outcomes. Figure 3.4 below depicts the utility-based agent program design.

Figure 3.4 Utility-based agent program structure schematic (Russell & Norvig, 2010).
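Maximising expected utility can be stated compactly: for each action, weight the utility of each possible outcome state by its probability, and pick the action with the highest sum. The sketch below illustrates this; the 'shortcut versus detour' scenario and all numbers are invented for the example.

```python
def best_action(state, actions, outcome_distribution, utility):
    """Illustrative utility-based choice: pick the action that maximises
    expected utility over its possible outcome states.

    outcome_distribution(state, action) -> list of (probability, next_state).
    """
    def expected_utility(action):
        return sum(p * utility(s) for p, s in outcome_distribution(state, action))
    return max(actions, key=expected_utility)

# Illustrative example: a risky shortcut versus a safe detour.
def outcomes(state, action):
    if action == "shortcut":
        return [(0.7, "arrived_early"), (0.3, "stuck")]
    return [(1.0, "arrived_on_time")]

scores = {"arrived_early": 10, "arrived_on_time": 6, "stuck": -5}
print(best_action("start", ["shortcut", "detour"], outcomes, scores.get))  # -> detour
```

Here the shortcut has expected utility 0.7 × 10 + 0.3 × (−5) = 5.5, while the detour is worth a certain 6, so the rational choice is the detour even though the shortcut's best case scores higher. This is exactly the graded comparison that a binary goal test cannot express.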

In this section and the preceding three sections, agent program types have been defined based on rules, knowledge and algorithmic measures of ‘happiness’. However, depending on the environment in which an agent operates and the properties thereof, these agents may fall short in acting rationally and achieving their goals (Qiu & Li, 2016). The next section considers a general structure for creating an agent capable of learning and adapting to its environment as it changes.

3.1.2.5 Learning agent

Explicitly designing and creating agents that can act rationally when their task environment is unknown is exceptionally difficult and ultimately ineffective. Alan Turing faced this exact problem and proposed building learning machines capable of operating in an environment with little to no knowledge of it, but with the ability to learn and become even more competent than their initial programming allowed (Russell & Norvig, 2010; Alvarado et al., 2017).

Learning agents are generally made up of four distinct components (Garro et al., 2018):

 the critic, which is responsible for providing feedback to the learning element on how the agent is doing in relation to a fixed performance standard;
 the learning element, which is responsible for making improvements to the performance element based on the critic’s feedback and deriving knowledge from it;
 the problem generator, which is responsible for recommending actions to the performance element that may lead to new experiences that can improve the understanding of the environment and the achievement of utility; and
 the performance element, which is responsible for taking in percepts and deciding on actions to take based on the input from the learning element and problem generator.

Each component serves a specialised function, with the performance element encapsulating the entirety of what was previously considered the agent program in the earlier designs. Together, the above four components create an agent that is capable of taking in percepts, maintaining an internal model of the environment, measuring its own success against a performance standard, adapting its decision making based on this, and exploring alternative solutions in order to learn more about the environment and, ultimately, how to achieve its goals (Shoham & Leyton-Brown, 2008; Qiu & Li, 2016). A general learning agent program design is represented below, in Figure 3.5.

Figure 3.5 Learning agent program structure schematic (Russell & Norvig, 2010).
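The interaction between the four components can be sketched in code. The class below is a deliberately minimal illustration: the threshold-based performance element, the critic's fixed standard and the update rule are all invented for the sketch, and the problem generator is stubbed out.

```python
class LearningAgent:
    """Minimal sketch of the four-component learning agent design.
    The threshold update rule and performance standard are illustrative only."""

    def __init__(self):
        self.threshold = 0.5            # knowledge used by the performance element

    def performance_element(self, percept):
        # Decide on an action from the percept and current knowledge.
        return "act" if percept > self.threshold else "wait"

    def critic(self, percept, action):
        # Fixed performance standard (illustrative): acting on strong
        # percepts (> 0.6) is rewarded, acting on weak ones is penalised.
        if action == "act":
            return 1.0 if percept > 0.6 else -1.0
        return 0.0

    def learning_element(self, feedback):
        # Improve the performance element's knowledge from the critic's feedback.
        if feedback < 0:
            self.threshold += 0.05      # become more cautious
        elif feedback > 0:
            self.threshold -= 0.01      # become slightly bolder

    def problem_generator(self):
        # Would suggest exploratory actions to gain new experience; stubbed here.
        return None

    def step(self, percept):
        action = self.performance_element(percept)
        self.learning_element(self.critic(percept, action))
        return action
```

Even in this toy form, the loop shows the essential feedback structure: the critic scores the chosen action against the fixed standard, and the learning element uses that score to modify the knowledge the performance element will use next time.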

The sub-section above considered agent program types and outlined the most common types that are used in the creation of intelligent rational agents depending on the environment and architecture that the agents must function within (Russell & Norvig, 2010; Alvarado et al., 2017). The choice of agent program type to make use of depends heavily on the expected task environment and the agent’s goals. While it would seem obvious to always implement the ‘most intelligent’ agent program, in reality, a simpler solution is often more time- and cost-effective.

Section 3.1 provided a definition of intelligent agents, what it is to act rationally, what the task environment of an agent is, how to describe its properties and what impact these have on the agent’s ability to operate. The various types of agent program design were described, highlighting how they function and the information they need to do so. Up to this point, only a single agent has been considered. However, when describing the task environment, a property that needs to be defined is the number of agents in an environment and whether they are competitive or cooperative. When dealing with multiple agents in an environment, the dynamics of the environment change and the manner in which a single agent acts must change as well, in order to take this into account.

In the following section, section 3.2, multi-agent systems are explored, taking into account cooperative and competitive agents and how these affect the design of individual agents as well as the system as a whole.

3.2 Multi-agent systems

In Section 3.1, intelligent agents were introduced as software programs capable of autonomous action, which follow rules and try to achieve a goal. This is a very broad description, as agents differ widely depending on the agent program design and physical architecture.

Previously, agents were only described in a stand-alone context, with the number of agents only being considered as part of the task environment’s properties. The number of agents that exist in any given environment, along with their goal alignment, is, however, an extremely important part of intelligent agent design and implementation. There are a multitude of environments for which a single agent is ideal or at least sufficient. However, there are as many (if not more) environments that require multiple agents or, by nature of the environment, will contain multiple agents over which the agent design has no control (Wooldridge, 2009; Victor et al., 2018).

When considering multi-agent systems (MASs), they can be categorised into two high-level groups:

 cooperative (also known as collaborative); and
 competitive.

The base distinction between these two categories of MASs lies in the alignment of the goals of the agents that exist in the system. In cooperative MASs, agents have goals that, even though they may not be the same, contribute to solving a common problem or task. In this scenario, agents may collaborate to achieve this goal or act on the environment in a way that helps to support the overall goal (Brazier et al., 1999; Halinka et al., 2015).

Competitive MASs work inversely to cooperative MASs in that the agents that constitute the system do not share a common goal and are, instead, competing for resources. The goals of individual agents in this type of MAS may or may not be in direct competition with those of the other agents and, whether they directly (or perhaps intentionally) or indirectly compete within the environment, the design of these agents must take this into consideration (Shoham & Leyton-Brown, 2008; Lu & Liu, 2019).

For each of the above-mentioned MAS categories, specific factors need to be considered when designing the agents that will operate in them. The following two subsections, 3.2.1 and 3.2.2, will take a more in-depth look at each category, considering various scenarios, design considerations and techniques for creating effective agents.

3.2.1 Cooperative multi-agent systems

When considering the use of MASs to solve various problems or accomplish tasks, it often seems obvious to make use of a central controller to instruct the agents in performing their actions and achieving the overall objective. In this case, the various agents could be viewed as being merely sensors and actuators used by the central controller to carry out its design goal. This creates a scenario where the agents are not really intelligent agents, as they lack the autonomy to act on their own and pursue individual goals that contribute to the larger overall goal (Wooldridge, 2009; Patel & Mehta, 2019). This type of system has its strengths and applications; however, it is limited by the constraints of the central controller: whether it is physically or logically feasible (as in large distributed systems) and whether it is powerful enough to compute the actions of all endpoints.

A more common approach to use in scenarios where there are distributed resources is to make use of intelligent agents in a cooperative manner, whereby the computational and physical resources are distributed and the agents can act autonomously, taking into account their own goals, the environment state and the other agents when choosing to act (Shoham & Leyton-Brown, 2008; Victor et al., 2018). Realistically, in these scenarios, each agent or node has limited computational resources, communication ability/bandwidth and power supply. Therefore, it is most effective for each individual to be focussed on a specific task or set of tasks that align with those of the other agents in order to accomplish the goal (Wooldridge, 2009; Lu & Liu, 2019). In doing so, the distributed algorithm that ultimately controls the MAS, as a whole, needs to be carefully designed to function locally on a single agent, while still taking into consideration global properties or states.

Two important classes of high-level distributed algorithms are (Victor et al., 2018):

 distributed constraint satisfaction; and
 distributed optimisation.

These two types of distributed algorithm function in very similar fashions, differing only slightly from one another in how optimal the resultant solutions are. Each one will be described in further detail below, beginning with distributed constraint satisfaction.

3.2.1.1 Distributed constraint satisfaction

When making use of an MAS in a cooperative manner, it is important that each of the individual agents have a task to carry out or a goal to accomplish (or both). Ultimately, in splitting up the achievement of the greater goal of the MAS into smaller tasks and goals, a level of simplicity is gained, whereby each agent is primarily concerned with its own objectives and not those of the other agents (at least in the case where the objectives do not clash) (Wooldridge, 2009; Alvarado et al., 2017).

When translating this into an algorithm employed by all agents, it is useful to consider the problem as a whole in terms of it being a constraint satisfaction problem (CSP). CSPs are problems where there exists a set of variables (with each variable existing in some domain) and constraints on the values that the variables may possess simultaneously. In solving a CSP, a constraint satisfaction algorithm is created, which attempts to assign values to the various variables involved in such a way that all constraints are met. In doing so, it finds a solution or determines that no solution exists (Weiss, 2013; El-Taweel & Farag, 2018).

In a cooperative MAS, the CSP becomes a distributed CSP in which each variable in the problem belongs to a different agent. This allows for a solution to the problem to be found not by employing a central controller to determine the value of each variable but, rather, through the agents choosing values for their given variables with autonomy. In the process of choosing values for their given variables, the distributed CSP algorithm makes the agents perform computations local to themselves and their current state, while also communicating with neighbouring agents, attempting to find a solution to the problem and to do so quickly (Weiss, 2013; Cui et al., 2017).
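A distributed CSP can be illustrated with graph colouring, a standard example in this literature: each agent owns one variable (its colour) and must only differ from its neighbours. The synchronous, round-based scheme below is a deliberate simplification for clarity; real distributed CSP algorithms such as asynchronous backtracking are considerably more involved, and this greedy variant is not guaranteed to terminate with a solution on every graph.

```python
# Illustrative distributed CSP sketch: graph colouring where each agent
# owns one variable and consults only its neighbours' current values
# (standing in for the messages it would receive from them).

def distributed_colouring(neighbours, colours, rounds=20):
    """neighbours: {agent: [neighbouring agents]}; colours: allowed values."""
    assignment = {a: colours[0] for a in neighbours}   # each agent's initial choice
    order = sorted(neighbours)                         # simple fixed agent ordering
    for _ in range(rounds):
        changed = False
        for agent in order:
            taken = {assignment[n] for n in neighbours[agent]}  # neighbours' values
            if assignment[agent] in taken:             # local constraint violated
                for c in colours:
                    if c not in taken:
                        assignment[agent] = c          # purely local computation
                        changed = True
                        break
        if not changed:                                # quiescence: no agent moved
            break
    return assignment

# Three mutually adjacent agents must all choose different colours.
graph = {"x1": ["x2", "x3"], "x2": ["x1", "x3"], "x3": ["x1", "x2"]}
result = distributed_colouring(graph, ["red", "green", "blue"])
```

Each agent changes only its own variable and looks only at its neighbours, which is the defining property of the distributed formulation: no central controller ever sees the whole assignment.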

There are numerous well-known existing algorithm types for this purpose, including:

 Domain pruning algorithms: inconsistent values are eliminated from the local domains of nodes through nodes communicating with their neighbours. Each sub-domain prunes out local values iteratively until no further eliminations can occur.
 Heuristic search algorithms: exhaustive search algorithms in which an ordering of nodes is performed; thereafter, for each node, each value in its domain is checked for consistency. This achieves distribution only in the sense of sequential execution of value comparison by each node (or agent).

The details of each algorithm type are outside the scope of this chapter, which serves merely to describe MASs and their possible configurations, not to detail specific algorithms for implementation.

Distributed CSPs allow for an MAS to find a solution to a given problem if one exists. However, there is no guarantee of the optimality of the solution found. When finding optimal solutions is of great importance to the MAS, distributed optimisation algorithms are needed. The following sub-section describes this type of cooperative agent algorithm design (Wooldridge, 2009; Alvarado et al., 2017).

3.2.1.2 Distributed optimisation

Distributed CSPs were described above, where agents, along with an algorithm, are used to meet global constraints on a set of variables in a distributed fashion so that no central controller is needed.

To find optimal solutions, the agents need to act autonomously when attempting to optimise a global objective function. The biggest obstacle in finding an optimal solution is the number of variables, environment states, or paths that need to be considered, as in order to guarantee the optimal solution, all possibilities must be considered (Weiss, 2013; El-Taweel & Farag, 2018). For some MASs this is feasible as there is a finite and manageable number of states. However, many MASs have too many possibilities to explore, and this is where approaches such as dynamic programming (which is generally well suited to path planning problems) can be applied (Wooldridge, 2009; Cui et al., 2017).

Dynamic programming approaches make use of heuristics to help them deal with unfeasibly large numbers of possibilities, allowing them to function effectively with a smaller number of agents. Asynchronous dynamic programming and learning real-time A* (LRTA*)1 do this by enumerating and calculating from each agent in the MAS, with the latter using a shared table of computed values that all agents can access, allowing them to learn from each other and, ultimately, converge on an optimal solution much faster (Shoham & Leyton-Brown, 2008; Victor et al., 2018).
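The core LRTA* update can be sketched for a single agent; in the multi-agent variant described above, the learned table H would simply be shared between agents. The tiny line-world environment and all names below are assumptions made for illustration.

```python
# Illustrative single-agent LRTA* sketch. At each step the agent moves to
# the neighbour with the lowest estimated total cost, and records that
# estimate in a table H of learned cost-to-goal values.

def lrta_star(start, goal, neighbours, h, max_steps=100):
    """neighbours(s) -> list of (step_cost, next_state);
    h(s) -> initial (admissible) cost-to-goal estimate."""
    H = {}                                     # learned cost-to-goal table
    state, steps = start, 0
    while state != goal and steps < max_steps:
        # f(n) = step cost + learned (or initial) estimate for the neighbour.
        options = [(c + H.get(n, h(n)), n) for c, n in neighbours(state)]
        best_f, best_n = min(options)
        H[state] = best_f                      # learning step: update the estimate
        state, steps = best_n, steps + 1       # move to the most promising neighbour
    return state, steps, H

# Tiny illustrative line world: states 0..4, unit step costs, goal at 4.
def line(s):
    return [(1, n) for n in (s - 1, s + 1) if 0 <= n <= 4]

final, steps, H = lrta_star(0, 4, line, h=lambda s: abs(4 - s))
print(final, steps)  # reaches state 4 in 4 steps with this exact heuristic
```

With a poor initial heuristic the agent may wander, but the table entries only ever increase towards the true costs, which is what drives the convergence mentioned above; sharing H lets every agent benefit from every other agent's wandering.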

Contract nets form a distributed optimisation algorithm class that makes use of contracts, bidding and negotiations to optimally distribute contracts between a number of agents in an MAS. In doing so, an optimal sequence of contracts can be achieved based on a cost function used to determine which agent is allocated a task. For this to work, a negotiation scheme is needed that consists of a bidding rule (how agents can make offers), a market clearing rule (how outcomes are determined from the offers), and an information dissemination rule (what information is made available to agents during negotiation) (Wooldridge, 2009; El-Taweel & Farag, 2018). All of these elements make contract nets particularly suited to MASs where resources are finite or scarce and thus need

1 A* is a search algorithm popular for graph traversal and pathfinding problems. It is a 'smarter' search algorithm in which heuristics are used to approximate the value of outcomes, allowing better decisions to be made: it weighs the cost of a possible next node against the estimated cost of reaching the end destination from that node, ultimately picking the option with the lowest combined cost.

to be used wisely. Winning an auction here is not about personal gain for a particular agent; rather, through the assignment of winning bids, a near-optimal overall solution can be found.
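The contract-net scheme described above can be sketched in a few lines. The agent names, tasks and cost functions below are invented for illustration; the bidding rule is "bid your estimated cost", the market-clearing rule is "lowest bid wins", and all bids are visible to the auctioneer.

```python
# A toy contract-net allocation under simple assumed rules:
# bidding rule = bid estimated cost; market clearing = lowest bid wins.

def announce_and_award(task, agents):
    """Collect bids for one task and award it to the cheapest bidder."""
    bids = {name: cost_fn(task) for name, cost_fn in agents.items()}
    winner = min(bids, key=bids.get)
    return winner, bids

# Each agent's cost function reflects its suitability for a task type.
agents = {
    "hauler": lambda task: 2 if task == "transport" else 9,
    "scout":  lambda task: 8 if task == "transport" else 3,
}

allocation = {}
for task in ["transport", "survey"]:
    winner, _ = announce_and_award(task, agents)
    allocation[task] = winner
# Each task goes to the agent best suited to it, approaching the
# cost-minimising assignment without any central planner computing it.
```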

Many other distributed optimisation techniques exist; this section is not meant to be an exhaustive list and description thereof but, rather, serves to provide a high-level understanding of the purpose and use of these techniques (Shoham & Leyton-Brown, 2008; Cui et al., 2017).

MASs can be designed in many different ways and, in an ideal scenario, are purely cooperative environments in which all agents work towards a shared higher goal. In reality, however, an MAS will most likely contain agents that do not share the same goals and may even have goals in direct opposition to those of other agents. Competitive MASs are discussed further in the following section.

3.2.2 Competitive multi-agent systems

In most realistic MAS environments, there will be a number of agents whose goals are not aligned with the objectives and goals of the other agents. These agents may be in direct competition with each other, or may simply be going about their business, attempting to achieve their own goals. However, this distinction does not change the situation in which agents are in competition with one another (Weiss, 2013). This is especially true where limited resources are available in the environment that agents need in order to complete their given tasks.

To describe this situation and the types of agents involved, the term 'self-interested agents' is used; the definition ultimately boils down to the notion that each individual agent has its own idea of which environment states in the MAS it likes. The states that it likes are not necessarily states in which other agents are worse off; they may include favourable situations for the other agents. Regardless of the outcomes or implications of these liked states, the agent will act in a manner intended to achieve or bring about the states it likes (Wooldridge, 2009; El-Taweel & Farag, 2018).

A popular method of getting agents to act in a way that favours their liked states is to make use of utility theory. The subsection below describes utility theory and how it is applied in MASs.

3.2.2.1 Utility theory

When describing the situation of self-interested agents in the section above, it was said that self-interested agents are agents that have environmental states that they favour and that will act in such a way as to bring about these states. Utility theory is the theoretical approach for attempting to quantify the degree of preference an agent has for certain environmental states over other alternatives (Shoham & Leyton-Brown, 2008). In doing so, it is believed that the agent's beliefs/desires can be modelled and an understanding can be created of how an agent will act when there is uncertainty about the outcomes of its actions in bringing about its liked states (Weiss, 2013).

The main component that an agent possesses in utility theory is its 'utility function', which is simply a mapping from the various states of the agent's environment to a quantification of the agent's 'happiness' in each state. If the agent faces uncertainty, then its utility is the expected utility, taking into account the probability distribution over states (Wooldridge, 2009; Alvarado et al., 2017). When designing agents for competitive MASs, it is important to plan and design the utility function of any given agent to also take into account the other agents in the environment. Adding penalties or incentives for states that include certain agents or agent types allows the designed agent to learn to favour or avoid those states.
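A utility function and its expected-utility extension can be sketched as follows. The states, utility values and probabilities below are invented for illustration, including a penalised state shared with a rival agent, as suggested above for competitive MAS design.

```python
# A hedged sketch of an agent's utility function with expected utility
# under uncertainty. States and numbers are illustrative assumptions;
# a real MAS would derive them from its environment and design goals.

# Utility function: mapping from environment states to 'happiness'.
# A penalty is built into the state shared with a rival agent.
utility = {
    "has_resource": 10.0,
    "has_resource_with_rival": 4.0,  # penalised state
    "no_resource": 0.0,
}

def expected_utility(outcome_probs):
    """Expected utility over a probability distribution of states."""
    assert abs(sum(outcome_probs.values()) - 1.0) < 1e-9
    return sum(p * utility[s] for s, p in outcome_probs.items())

# Action A is risky, action B is safe; the agent picks the higher EU.
eu_a = expected_utility({"has_resource": 0.5, "no_resource": 0.5})  # 5.0
eu_b = expected_utility({"has_resource_with_rival": 1.0})           # 4.0
best_action = "A" if eu_a > eu_b else "B"
```

With these invented numbers the risky action wins; shifting the rival penalty or the probabilities changes the preference, which is exactly the lever a designer uses to shape agent behaviour.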

The idea that an agent should act purely on the basis of an algorithm that quantifies the 'happiness' of the agent in a given state, or the expected 'happiness' when uncertainty exists, can seem questionable. For this reason, von Neumann and Morgenstern proposed grounding utility in a more basic concept: preferences (Shoham & Leyton-Brown, 2008; Victor et al., 2018). Their theory (von Neumann-Morgenstern utility functions) provides theorems and proofs for the effectiveness of agents having preferences for certain states over others, and introduces the concept of lotteries, which are simply random selections of one state from a set, based on specified probabilities. The von Neumann-Morgenstern utility function theorems and proofs are very in-depth and lengthy and are, therefore, excluded from this thesis. However, they do serve as a basis for the use of preferences with regard to utility (Shoham & Leyton-Brown, 2008; Cui et al., 2017).

3.2.2.2 Game theory

Another popular method for modelling the interactions of intelligent agents is game theory. At its core, game theory is the mathematical study of optimising agents (Zhang & Shang, 2016).

Game theory first took shape in the 1700s in relation to two-person games such as cards and chess. Minmax² mixed-strategy solutions were developed to optimise game play in a theoretical manner (Newton, 2018). Over the years game theory enjoyed further development and application; however, it was not until the 1950s that extensive development took place (Arzhakov, 2018). At this stage game theory was starting to be applied not only to competitive games but also to philosophy and politics.

Game theory grew to include many types of games, each with their own strategies and solutions, including (Arzhakov, 2018):

 Cooperative/non-cooperative: cooperative games are characterised by players being able to form binding alliances (for example through contract law), working together for a common goal. Non-cooperative games, on the other hand, either do not allow alliances or require that any alliances be self-enforcing.

2 Minmax is a strategic rule for minimising the possible loss in a worst-case outcome and is widely used in AI, statistics, philosophy and game theory.

 Simultaneous/sequential: in simultaneous games, competing players are able to move or take action at the same time, whereas in sequential games players take turns to act.
 Symmetric/asymmetric: in symmetric games the outcomes of strategies and actions are the same for all players; everyone is on an equal footing. In asymmetric games, however, outcomes may differ per player; players stand on unequal footing.
 Zero-sum/non-zero-sum: zero-sum games are characterised by the total benefit for all players always summing to zero; the gain of one player is at the equal expense of another. Conversely, in non-zero-sum games the total benefit to all players can be greater or less than zero.
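The zero-sum and symmetric distinctions can be checked mechanically once a two-player game is written as a pair of payoff matrices. The helper functions and the Matching Pennies example below are a standard illustration, not taken from the thesis.

```python
# Small helpers that classify a two-player game, given as payoff
# matrices, as zero-sum and/or symmetric. Example game is illustrative.

def is_zero_sum(payoff_a, payoff_b):
    """Zero-sum: the two players' payoffs always sum to zero."""
    return all(a + b == 0
               for row_a, row_b in zip(payoff_a, payoff_b)
               for a, b in zip(row_a, row_b))

def is_symmetric(payoff_a, payoff_b):
    """Symmetric: B's payoff matrix is the transpose of A's."""
    n = len(payoff_a)
    return all(payoff_b[j][i] == payoff_a[i][j]
               for i in range(n) for j in range(n))

# Matching Pennies: strictly competitive, hence zero-sum, but the two
# players' roles differ, so it is not symmetric.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
print(is_zero_sum(A, B))   # True
```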

To enhance the effectiveness of the study of game theory, game representations are used which mathematically describe the game. A game representation generally describes the players of the game, the information available to each player at each decision point, the actions available to each player at each decision point, and the payoffs for each outcome (Newton, 2018).

A payoff matrix is a method devised to visually represent the outcomes of decisions in a game. At any point in a game, the matrix is generally of size m × n, where player A has m possible moves and player B has n possible moves, providing the outcomes for all possible combinations of player actions (Newton, 2018).
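The minmax rule mentioned in the footnote can be applied directly to such a payoff matrix. The 3 × 3 payoffs below are invented for illustration: rows are player A's moves, columns are player B's replies, and A picks the row whose worst-case outcome is best.

```python
# A minimal maximin (minmax) sketch over an illustrative m x n payoff
# matrix for player A; entries are A's payoffs for each move pair.

payoff = [  # payoff[i][j]: A plays move i, B replies with move j
    [3, -2, 1],
    [0,  1, 2],
    [-4, 5, 0],
]

def maximin_move(matrix):
    """Return (best row index, guaranteed payoff) under worst-case play."""
    worst_per_row = [min(row) for row in matrix]  # B's best reply per row
    best_row = max(range(len(matrix)), key=lambda i: worst_per_row[i])
    return best_row, worst_per_row[best_row]

move, value = maximin_move(payoff)
# Move 1 guarantees a payoff of at least 0 whatever B plays, even though
# other rows offer higher best-case payoffs.
```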

The topic of multi-agent systems is broad and, on its own, can be covered in depth through many textbook volumes, focussing in more detail on topics such as game theory, game representations, communication, learning, resource allocation and more.

The purpose of this section is to give a broad overview of what MASs are and to develop an understanding of what they are composed of, the different types of MASs and their use in the real world. Thus far in Chapter 3, the topics of intelligent agents and multi-agent systems have been covered, providing definitions of both and exploring their properties and applications. In doing so, it has been shown how static designs and representations of intelligent agents and MASs can be achieved. What is yet to be discussed is how either of these entities can or will change over time and how to design them in such a manner (Shoham & Leyton-Brown, 2008; Alvarado et al., 2017).

Evolution is a well-publicised topic in the natural world, where the evolution and change of biological entities over time is studied, modelled and understood, but it is less widely understood for digital entities. Do digital entities evolve, and, if so, how is this achieved? Appendix A provides further information on the topic of evolution in a digital context, focussing specifically on the evolution of intelligent agents.

3.3 Conclusion

Chapter 3 focussed on the topic of intelligent agents, describing the components of agents, the various basic design types and their uses. The task environment that agents operate in was defined, and the properties used to describe task environments were stated, showing how the different properties affect the agents that operate within the environment.

Thereafter, multi-agent systems were defined, considering the two main types of MAS: cooperative multi-agent systems and competitive multi-agent systems. The specific factors that need to be considered for each were covered, along with strategies for designing singular agents that are able to excel in the given MAS environments.

Lastly, the topic of evolution was looked at in the context of evolutionary computation, applying natural evolutionary theories to computational models. It was discussed that evolutionary computation strategies and the resultant algorithms are well suited for problem solving in many domains, and that genetic algorithms, a simple and widely used type of evolutionary algorithm, have strengths in search, optimisation and learning problems.

The information presented in this chapter is of relevance to the following secondary research questions:

SRQ2: Can an MAS be used to simulate the various components of a digital environment?

SRQ4: Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?

The chapter has clearly defined intelligent agents, the types of agents, their task environments, and considerations that need to be made when dealing with an MAS. It has been shown that MASs can be created with many differing agent types and other environmental properties, allowing for an analogous mapping of components of a digital environment into the MAS. This clearly answers SRQ2, creating a basis from which the digital environment MAS can be designed for the purposes of this thesis.

The chapter also covered evolution in the context of evolutionary computation, showing how evolutionary principles can be applied to real world problems across various domains. Evolutionary algorithms allow for the encoding of a problem space or entities into a representation that allows for the evolutionary process to be applied in changing a population over generations. Heuristics can be applied, along with the fitness function of algorithms such as GAs, to control how the evolution of the population occurs. SRQ4 is, therefore, answered with regard to AI for evolution, with the machine learning (ML) perspective to be covered in the next chapter.

Intelligent agents, their task environments, and how they can coexist in a multi-agent system have been covered in this chapter, with some reference made to learning. Learning can be achieved by learning agents using various approaches and heuristics, and some evolutionary approaches, like genetic algorithms, can also be optimised for the application of learning within a given problem space. The next chapter will take an in-depth look at a sub-field of AI, called machine learning (ML), that is currently receiving a lot of attention worldwide and is being applied to solve many problems that were previously unsolvable.


4 Machine learning

Since the dawn of modern computing, people have considered the possibility of creating intelligent machines capable of solving complex problems. The idea of intelligence in the context of machines has been pondered for well over a hundred years, with inventors of the time trying to conceptualise how it could be achieved. Many intellectuals and academics theorised on how mathematical principles could be applied for the purpose of creating intelligent³ machines, devising theorems and models that seemed to hold promise. The biggest issue at the time was the lack of technological advancement to make these theories a reality, and so they remained simply theories for decades.

With the advent of modern computers and the ability to program them, people were finally able to apply computational approaches to solving problems that were previously too difficult or complex. These problems generally involved heavy mathematical computation and evaluation based on defined rule sets, which would take humans exponentially longer to do by hand but could now be solved in a matter of seconds or minutes (Shai & Shai, 2014). This is how the field of artificial intelligence came about, starting with the use of computers to solve labour-intensive, often repetitive tasks at which humans were inefficient, owing to the stamina required and the intellectually demanding nature of the work.

However, even though solving intellectually challenging problems grounded in rule sets became straightforward through the use of computers, tasks that humans found relatively easy or simple proved very challenging for computers. This was mostly due to the abstract nature of these tasks and the difficulty of defining clear rule sets for them. Such tasks include identifying objects or sounds, recognising people, understanding language, classifying images and many more (Pahwa & Agarwal, 2019; Smola & Vishwanathan, 2008).
The lack of formal rules for solving these problems led researchers to seek another way, which resulted in the creation of knowledge base approaches, where AI systems had pre-existing knowledge of the problem and various factors regarding it built in or hard coded. This required defining the knowledge in a formal language that could be interpreted by the computer system as structured data with a clear hierarchy, and it would be very specific to a given problem (Shai & Shai, 2014). This approach was effective in problem spaces with well-defined rules and finite states. However, it failed drastically in more abstract scenarios where more and more knowledge had to be hard coded, and systems remained ultimately limited to the knowledge they possessed; they could not discover and create new knowledge from experience. It was out of this need for AI systems to be able to acquire new knowledge and learn from experience to solve problems or perform tasks that machine learning was born (Goodfellow et al., 2016).

3 Intelligence in this context can be defined as the ability to make decisions based on input information and other pre-existing knowledge.

Machine learning (ML) is the ability of an AI system to extract patterns from raw data in order to build new knowledge that it did not previously have hard coded into it (Shai & Shai, 2014). In doing so, ML allows a computer, with regard to a particular problem, to make subjective decisions based on the knowledge it has developed, thus addressing the problem that early AI and knowledge base systems had. In a later section of this chapter, ML models will be looked at in greater detail, describing the different types, their function and their use. A definition of 'learning' in the context of ML is presented in section 4.1.

Symbolic reasoning, also known in practice as rules engines, expert systems or knowledge graphs, was the dominant AI paradigm up until the 1980s. It is based on the use of primitive operations and mathematical logic (first-order logic; sometimes higher orders) to perform reasoning (Tenenbaum et al., 2018). This approach has been successfully applied to solve open mathematics questions and to perform large-scale automation of financial and tax calculations, among others (Landy et al., 2014). The approach is limited by its reliance on a set of hard-coded rules that are defined by humans and are therefore limited by human understanding of the problem. In contrast, ML is designed to discover rules through correlation of input and output data, ultimately developing knowledge.

Even though ML approaches address the problems of knowledge acquisition and world description that knowledge base approaches suffer from, they are still limited, like most computer and AI systems, by the representation used for the data fed to them (Smola & Vishwanathan, 2008; Ray, 2019). ML models are not able to automatically understand abstract raw data fed into them and how best to make use of it. Rather, they require a structured representation of the data in which each piece of important information regarding the problem is included.
Each individual piece of important information required for solving the problem is called a feature, and ultimately ML models work by learning how to correlate a given set of features with outcomes (Goodfellow et al., 2016). For many ML tasks the selection of the correct features can be simple, owing to how obviously important certain features are to a given domain. However, there are also many tasks for which feature selection is not clear, such as object recognition in images, due to the difficulty of describing a specific object with selected features (especially when taking into account variations in the object, lighting and surroundings) (Pahwa & Agarwal, 2019; Smola & Vishwanathan, 2008).

To solve this problem, researchers realised they could use ML to enable machines to discover the representations themselves, creating a new approach called representation learning, in which more effective representations are discovered than researchers could hand-craft. The representations learnt through ML are generally of a better quality than those that can be defined by humans, and are learnt in a considerably shorter time, ranging from seconds to minutes for simple tasks and hours to possibly months for complex tasks. One example of a representation learning algorithm is the autoencoder, which encodes the raw data into representations, and then decodes these representations back into data, while preserving as much information as possible (Goodfellow et al., 2016). Even though representation learning methods make it easy to derive effective data representations for

use as features, there still exists a problem in the factors of variation that can affect the resultant output, dependent on context. Factors of variation refer to the abstractions or concepts that humans make use of in making sense of information in a given context; they are difficult to quantify and, therefore, to create representations for. These factors include concepts such as lighting, age, gender and season, which in themselves do not seem particularly important, but consider how they would affect the task of recognising a person or object in an image (Ray, 2019; Shai & Shai, 2014). Extracting these abstract features from data essentially requires human-like understanding of the task and data in order to disentangle the factors of variation that are unimportant to the context of the task. This problem with representation learning therefore limited the 'intelligence' of the ML models, and they subsequently struggled with certain types of tasks.

This weakness in representation learning methods gave rise to the concept of deep learning, which is currently a topic enjoying great attention and research. Deep learning techniques are being applied to many difficult problems that other ML methods were previously unable to address. Deep learning addresses the problem of representing abstract concepts by reducing complex representations into smaller, simpler representations. It is then able to learn complex concepts by constructing them out of simple concepts. The topic of deep learning is explained in more detail later in this chapter.

4.1 Machine learning algorithms

The beginning of this chapter gave a high-level overview of the field of machine learning: how artificial intelligence came about, how AI methods were adapted to solve various problems that could not easily be tackled before, and some of the stumbling blocks of these methods that led to the development of the branch of AI that is ML (Shai & Shai, 2014).
At a high level, ML was described as a branch of AI focused on giving computers the ability to learn from processed data sets to find solutions to problems or perform tasks of which they did not necessarily have full knowledge hard coded into them. ML techniques allow computer systems to acquire their own knowledge of the world (Goodfellow et al., 2016). Even though this summary of ML is technically correct, it does not give enough detail about the techniques that are used, how learning occurs, or the terms used to describe ML problems and what these encompass. This subsection will provide richer detail on the field of ML, what constitutes an ML algorithm, what types of algorithms there are, and what applications they have (Sharma & Nandal, 2019; Smola & Vishwanathan, 2008).

To begin with, a good technical definition of learning in the context of computers is given by Mitchell (1997): "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

In order to understand this fully, it is necessary to consider each component of this definition and describe it in more detail.

4.1.1 The experience

When considering the experience that a machine learning algorithm is allowed to have during the learning process, it is important to understand which of the three categories of experience applies: 1. supervised learning, 2. unsupervised learning, or 3. reinforcement learning.

In order to determine which of the three categories of experience the algorithm will have, the dataset it is exposed to must be taken into consideration. In a supervised learning environment, the learning algorithm experiences a dataset that not only contains a number of data points and features for each one, but also has a label associated with each data point (Ray, 2019; Smola & Vishwanathan, 2008). The label ultimately instructs the algorithm what to do and how to associate input data with output results, as the label tells the algorithm what the result for that data point is. For this reason, this type of approach is referred to as 'supervised learning', as the algorithm has a teacher of sorts in the form of the data labels (Shai & Shai, 2014).

Conversely, in an unsupervised learning environment, the learning algorithm experiences a dataset made up of data points and their features, but is not given any instruction on how to map these to an output. Unsupervised learning algorithms do not have labels associated with each data point in the dataset, and thus need to learn the structure of the dataset and the properties of that structure in order to solve a problem (Goodfellow et al., 2016).

In a reinforcement learning environment, learning does not occur through the experience of pre-existing datasets, but rather through experiencing outcomes based on actions and scenarios. This approach reinforces how good an action is in a given situation based on a score for its result: good results lead to actions receiving good scores, while bad results lead to bad scores. The algorithm learns, from the scoring of actions and the situations in which they occurred, whether a given action was valuable and whether or not to choose it in future (Case, 2019).

The experience that the learning algorithm has greatly affects what is possible and how the algorithm will learn.
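The supervised/unsupervised distinction can be made concrete with a tiny sketch. The two-feature dataset, the labels and the 1-nearest-neighbour rule below are all invented for illustration: in the supervised case each point carries a label to learn from, whereas in the unsupervised case only the points themselves exist.

```python
# A minimal illustration of supervised versus unsupervised experience,
# using an invented two-feature dataset.

# Supervised experience: (features, label) pairs - the labels act as
# the 'teacher' described above.
labelled = [((1.0, 1.0), "small"), ((1.2, 0.9), "small"),
            ((8.0, 9.0), "large"), ((9.1, 8.5), "large")]

# Unsupervised experience: the same points with no labels; any
# structure must be inferred from the points alone.
unlabelled = [features for features, _ in labelled]

def predict_1nn(x):
    """Label a new point with the label of its nearest training point."""
    dist = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = min(labelled, key=lambda pair: dist(pair[0], x))
    return nearest[1]

print(predict_1nn((1.1, 1.1)))  # "small": closest to the first cluster
```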
Later in this chapter, supervised and unsupervised learning will be covered in further detail. The experience dataset that the algorithm is exposed to is often referred to as the 'training set' and is composed of a number of examples that allow the algorithm to learn how to produce outputs. Learning algorithms are designed to solve some problem or answer a question; this is referred to as the 'task'. Next, the task of the learning algorithm will be covered.

4.1.2 The task

In the context of machine learning, a task is the resultant action that the program must take based on its learning process. The learning is not itself the task but is, rather, the 'how' of performing that task. For example, a learning algorithm designed to classify whether or not an image contains a car makes use of learning to identify a car in an image, but ultimately the resultant task it must perform is to classify the image (Goodfellow et al., 2016). Learning can therefore be seen as the process that gives the program the ability to perform the given task, as without the learning process the program would not know how to carry the task out (Shai & Shai, 2014).

Tasks in ML generally involve the processing of an example dataset/data point which contains a number of features that quantify properties of the data or problem at hand (Shi et al., 2017; Smola & Vishwanathan, 2008). Depending on the features available and the output expected, learning algorithms can be classed based on the type of task they are solving. Some common learning tasks are:

 Clustering: clustering tasks involve the organisation of data based on the characteristics of its entities. Entities that have similarities are grouped together, while other groups can be viewed as dissimilar. The clustering of entities into groups can be performed based on a number of metrics regarding the properties of the entities and the representations thereof (Ray, 2019).
 Regression: regression tasks involve the prediction of some numeric value based on a given input (usually a time series input). Regression tasks often include the prediction of a value over time, for instance financial market prices, average temperatures, sales forecasts, etc. (Pahwa & Agarwal, 2019).
 Classification: in classification tasks, the aim is to assign categories to a given input. There are usually multiple categories and the best-fit category is chosen as the result (e.g. categorising an object as a car, boat or airplane). However, there are variations in which an input can be assigned multiple categories (e.g. categorising an object based on properties like green, round, large). For complex tasks, more modern approaches to object recognition such as deep learning are used due to the complexity of representation (Ray, 2019).
 Anomaly detection: anomaly detection tasks are ultimately specialised classification tasks where the goal is to determine which inputs in a set are unusual. A simple example would be to flag all the images in a set that contain a car instead of a human (when the set generally consists of pictures of people). A common use for this is fraud detection based on transaction history, which is widely used by banks and online retailers (Shi et al., 2017).
 Translation: translation tasks involve the processing of input characters of one type or language and converting them into another type or language.
This classically involves translation of natural language but may also be applied to programming languages or other symbol-sequence-based languages (Shi et al., 2017).
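The clustering task listed above can be sketched with a minimal k-means loop. The data, the value of k and the fixed random seed are illustrative assumptions; real clustering would normally use a library implementation and higher-dimensional features.

```python
# A short k-means sketch for the clustering task, standard library only.
import random

def kmeans(points, k, iters=20, seed=0):
    """Group 2-D points into k clusters by nearest-centroid assignment."""
    random.seed(seed)
    centroids = random.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: (p[0] - centroids[i][0]) ** 2
                                            + (p[1] - centroids[i][1]) ** 2)
            clusters[idx].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return clusters

data = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.0),   # group near the origin
        (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]   # group near (5, 5)
groups = kmeans(data, k=2)
```

With two well-separated groups like these, the algorithm recovers the 3/3 split without ever being told any labels, which is exactly the unsupervised character of clustering.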

There are numerous other learning tasks and variations of the above-listed tasks, with input restrictions or alterations to how results are presented. For a learning algorithm to learn to perform a task well, it needs to be able to measure, in a quantifiable manner, how well it is performing. This is where the performance measure comes into play.

4.1.3 The performance measure

In order to understand how well an algorithm is learning the features and outputs for a given problem, an objective measure of its performance must be defined. The measure needs to be quantitative and is generally specific to the task that the algorithm is attempting to perform. This is called the performance measure of the learning algorithm and can be defined in many ways, depending on the purpose of the program (Ray, 2019; Smola & Vishwanathan, 2008).

But how is the performance of an algorithm measured before it is deployed to actually carry out its given task? The method generally used is to set aside a segment of the input data used for learning and test the results on it. The algorithm makes use of the available features of the input data and determines a result, which can then be compared with the real result (label) to determine how well it performs (Goodfellow et al., 2016). Thereafter, depending on the type of task, one or many measures can be used to quantify the performance. Most notably, accuracy and error rate are often used, where accuracy measures the proportion of outputs that the algorithm produced correctly on the test set, and error rate measures the proportion of outputs that the algorithm produced incorrectly (Nielsen, 2015). Confusion matrices are a method used to measure the performance of classification tasks, comparing the actual members of classes with the predicted members in a matrix.
This makes it simpler to understand the distribution of false positives and false negatives for the model. There are many other performance measures in use; however, these three form the basis of how most algorithms are measured.

When considering the ability of a learning algorithm to effectively perform a task based on the input data it receives, it is important to understand the applicability of the algorithm when presented with new and different data points as input. If an algorithm could only reliably produce a result for inputs it had previously seen, it would be of little use in the real world for a problem with a high number of features, and wide ranges for those features (Pahwa & Agarwal, 2019; Smola & Vishwanathan, 2008). Generalisation is the ability of an algorithm to perform well even when exposed to inputs it has not seen before. This is an important property for a learning algorithm to possess, and to perform well in this regard it needs a low value for a performance measure called the generalisation error. The generalisation error is simply the expected error value for a new input, and is measured against the test set (which is separate from the training set) to determine the algorithm's performance (Goodfellow et al., 2016).
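The three measures described above can be computed in a few lines. The binary "spam"/"ham" test set below is invented for illustration.

```python
# Accuracy, error rate and a confusion matrix for an invented binary
# classification test set.

actual    = ["spam", "spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham",  "ham", "spam", "spam", "ham"]

correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)   # proportion correct: 4/6
error_rate = 1 - accuracy          # proportion incorrect: 2/6

# Confusion matrix: rows are actual classes, columns predicted classes.
labels = ["spam", "ham"]
confusion = [[sum(1 for a, p in zip(actual, predicted) if a == r and p == c)
              for c in labels] for r in labels]
# confusion == [[2, 1],   # actual spam: 2 predicted spam, 1 predicted ham
#               [1, 2]]   # actual ham:  1 predicted spam, 2 predicted ham
```

The off-diagonal cells are exactly the false negatives and false positives that the accuracy figure alone hides.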

How well a learning algorithm will perform in a real-life situation is conceptualised by two important challenges: overfitting and underfitting. These challenges relate directly to the training error and test error of the given algorithm, where a sample of each set is chosen and parameters selected to reduce the errors (Nielsen, 2015). Underfitting occurs when the training error for the algorithm cannot be reduced sufficiently, while overfitting occurs when there is too large a gap between the training error and the test error.

Whether an algorithm is more likely to underfit or overfit can be controlled by changing the algorithm's capacity. Capacity represents how well an algorithm is able to fit a variety of functions that can be selected as solutions. The higher the algorithm's capacity, the more likely it is to overfit, due to it learning specific properties of the training set that do not translate well to the test set. Low-capacity algorithms, on the other hand, may struggle to fit the training set at all, as there are not enough properties that can be effectively used for learning (Goodfellow et al., 2016). There are various schemes that can be used to address overfitting and underfitting, most of which involve statistical methods for generating or collecting the training and testing sets so as to achieve a better distribution across the problem space. Figure 4.1 illustrates the relationship between training error, testing error and capacity, indicating where overfitting and underfitting occur.

Figure 4.1 The relationship between error and capacity, depicting the underfitting and overfitting regimes (Goodfellow et al., 2016).
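The relationship between capacity, underfitting and overfitting can be demonstrated with a small sketch in which polynomial degree plays the role of capacity. The dataset, degrees and noise level below are illustrative choices, not taken from the cited sources.

```python
# Capacity versus error: fitting noisy samples of a sine curve with
# polynomials of increasing degree (degree = capacity).
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 20)
x_test = np.linspace(0, 1, 100)
true_fn = lambda x: np.sin(2 * np.pi * x)
y_train = true_fn(x_train) + 0.2 * rng.standard_normal(x_train.size)
y_test = true_fn(x_test)

def mse(y, y_hat):
    """Mean squared error between targets and predictions."""
    return float(np.mean((y - y_hat) ** 2))

train_errs, test_errs = {}, {}
for degree in (1, 3, 9):  # low, moderate and high capacity
    coeffs = np.polyfit(x_train, y_train, degree)
    train_errs[degree] = mse(y_train, np.polyval(coeffs, x_train))
    test_errs[degree] = mse(y_test, np.polyval(coeffs, x_test))
    print(degree, round(train_errs[degree], 4), round(test_errs[degree], 4))

# Degree 1 underfits: its training error cannot be reduced sufficiently.
# Higher degrees drive the training error down, but the train/test gap
# tends to widen, which is the overfitting regime of Figure 4.1.
```

Training error can only fall as capacity grows, since each polynomial class contains the smaller ones; the test error is what eventually turns upward.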

As was shown earlier in this chapter, there are many different types of ML algorithms that can be used, depending on the task being tackled. One of these approaches is deep learning, which, due to its current popularity and widespread use, will be covered separately later in this chapter.

4.2 Neural networks

For hundreds of years mankind has been fascinated by its own intelligence and its ability to solve problems that other species on the planet cannot. Scientists and philosophers spent many thousands of hours pondering human intelligence, concerned with where it stemmed from and what made humans, as a species, more advanced than other animal life. Besides pure intellectual problem-solving abilities, other qualities such as empathy, morals and ethics were deemed to be traits that set humans apart. These traits are what separated 'civilised' human beings from the 'savage' animals in the wild, allowing people to build large communities and move to the top of the food chain (Nielsen, 2015). Through human intellect, physical constraints in the world could be overcome, removing the limits that biology placed on human bodies. Humans could move objects much heavier than they could carry, remain warm despite a lack of insulation, and grow crops and rear livestock to ensure a food supply.

It was eventually found that the human brain was the key piece of the puzzle, being larger relative to body size than that of other species, and with sections that are far more developed. The more developed sections of the human brain would, in more modern times, be found to be those responsible for cognitive ability, emotions, memory and more (Yiu, 2019). In recent decades the use of magnetic resonance imaging (MRI) technology has given humans greater insight into the brain through three-dimensional scans that can be analysed and studied. MRI scans coupled with electroencephalography, commonly known as EEG, provided deep insight into how the human brain functions. EEG makes use of numerous electrodes spread over the skull to measure brain activity by measuring the electrical brain waves that occur within.
It was found that the brain is a dense biological electrical network, with billions of paths existing between the cells in the brain (Goodfellow et al., 2016). By measuring the fluctuations in voltage created by activity within the brain, EEGs could paint a picture of which parts of the brain were used for particular tasks, or when certain emotions occur. Thus, humans gained a greater understanding of the brain and how it functions.

The specific cells that give the brain its abilities, known as neurons, exist in the nervous system and are responsible for transmitting information to other cells in the body. Neurons send signals to other nerve cells, glands and muscles by generating an electrical charge in the cell body, which is then sent down the neuron's axon towards other cells, while incoming signals are received through its numerous dendrites (branches that have nerve endings) (Yiu, 2019). The junction across which a signal passes from one neuron to the next is known as a synapse. The brain contains tens of billions of neurons, firing signals across synapses as needed to perform functions in the body.

Even though the base function of the brain from a mechanical point of view is now understood, how the brain performs certain functions that typify human intelligence is not. One of the main functions of the brain that is not fully understood is learning.

In the meantime, people had begun creating computer systems designed to act intelligently, processing certain inputs to perform an action or provide an output. Computers could be made to perform many tasks that humans found difficult, in an autonomous and efficient manner. However, as previously stated in this thesis, it was extremely difficult to create computer systems that could perform simple tasks that humans found easy. These tasks generally required some basic learned knowledge of the problem space, which did not always consist of hard rules that could be programmed into the computer system (Goodfellow et al., 2016).

In an effort to get machines to perform many of these types of tasks, researchers began attempting to mimic how it was believed the human brain functioned. They developed theories and algorithms based on the interconnected structure of the brain and its neurons, in the hope that this would result in some level of learning ability. This gave rise to neural networks as a family of ML algorithms, which were not focused purely on mathematics and optimisation, but built around mimicking the function of the human brain. The results were promising initially, but it was obvious that the hardware of the time could not keep up. This caused stagnation for many decades, until modern hardware was able to provide the computational power and paradigms needed.
Modern neural networks are a popular field of study and are widely used in many ML-based applications to give computers the ability to learn about a problem and how to solve it without being explicitly programmed to do so, and without being given pre-existing knowledge of the problem space (Nielsen, 2015).

How do neural networks actually mimic the human brain, its neurons and its synapses? It would appear infeasible to attempt to build an exact replica of the brain neuron-by-neuron, as there are simply too many and the function of each group in the brain is difficult to define. For this reason, it makes more sense to view modern neural networks as small subsets of the brain focused on a particular task. Modern neural networks do not attempt to fully model a real brain in order to create a digital human brain (even though that is a subject of much interest), but rather attempt to recreate the manner in which it functions, especially with regard to learning (Goodfellow et al., 2016). A neural network is made up of a number of key parts. These are:

• inputs,
• neurons,
• connections, and
• outputs.

These four base components can be used to create neural networks of varying size and function. A number of layers of neurons can be used in a neural network, giving it greater representational ability. Neuron layers will be discussed further later in this chapter. Figure 4.2 illustrates the high-level structure of neural networks.

Figure 4.2 General shallow neural network structure, showing inputs, neurons, connections and outputs (Nielsen, 2015).

When referring to each of the four components listed above, they can be conceptualised as follows (Goodfellow et al., 2016):

• Inputs: These are the inputs that the neural network receives about the problem, and can come from various types of sensors. A single input from a sensor may be further broken down into numerous other inputs for use in the neural network.
• Neurons: These represent the base units used in the brain, and are the components of neural networks that perform computations on inputs to create an output.
• Connections: These are the connections between different neurons in the neural network and are, in general, the output from one neuron used as an input to another.
• Outputs: These are the resultant outputs from neurons performing computation on a set of inputs. The final output of the neural network is the solution to the problem or the predicted answer.
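The four components listed above can be tied together in a minimal sketch: inputs flow along weighted connections into neurons, and the final neuron's result is the network's output. The inputs, weights and biases below are arbitrary illustrative values, and the simple threshold activation is just one possible choice of computation.

```python
# A tiny network built from the four components: inputs, neurons,
# connections (one neuron's output feeding another) and an output.

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of its inputs plus a bias, thresholded."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0

inputs = [0.5, 0.1, 0.9]  # inputs arriving from sensors/features

# Two hidden neurons compute on the inputs; their outputs travel along
# connections into a final neuron, whose result is the network's output.
hidden = [
    neuron(inputs, [0.4, -0.2, 0.6], 0.0),
    neuron(inputs, [-0.5, 0.3, 0.1], 0.2),
]
output = neuron(hidden, [1.0, -1.0], -0.1)
print(output)  # -> 0
```

Note how the connections here are literally the hidden neurons' outputs being reused as the next neuron's inputs, as described in the list above.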

As with the real human brain, the power of the neural network lies in its neurons and the interconnected network they form. Neurons are the key component, so it is important to understand what they are in modern neural networks and what they do. As with any algorithm, different versions of neural networks and, in particular, neurons have been created over time. Sections 4.2.1 and 4.2.2 define two of the most well-known and commonly used neurons in modern neural networks: perceptrons and sigmoid neurons. First, perceptrons will be discussed in the next section.

4.2.1 Perceptrons

Perceptrons are the result of early work in the field of ML and neural networks, attempting to model neurons from the human brain. Perceptrons were created by Frank Rosenblatt in the 1950s. Rosenblatt was a New York scientist who focused on replicating the biological brain's function, resulting in the first practical neural network that was able to identify shapes (Nielsen, 2015). Perceptrons were created to simulate the functions that neurons in the human brain perform, so as to allow a computer to learn how to perform a task. Perceptrons are simple computational units in a program that take inputs, perform calculations and return an output. The idea was that connecting numerous perceptrons in a network could solve problems that were previously too complex for computer systems (Brownlee, 2019). The base structure of a perceptron consists of four main components (Goodfellow et al., 2016):

 inputs,  weights,  a threshold value, and  an output.

A perceptron takes in a number of binary inputs, each of which can represent a different aspect of a problem, or different features of a single entity. The number of inputs for any perceptron can vary vastly, depending on what the problem is, how many layers exist, and how many other perceptrons exist in the neural network (Nielsen, 2015). Weights are a set of real number values that represent how significant the various inputs are to the resultant output. This allows certain inputs to have a greater individual bearing on the output of the perceptron, or a group of less significant inputs to outweigh the impact of a single more significant input. A threshold value, a real number, is used to determine the binary output of the perceptron. Based on the value of a weighted sum of the inputs, the threshold value determines what the output should be.

The output of the perceptron is the final answer based on the inputs, their weightings, their sum, and the threshold value. The output is a binary value, which essentially indicates true or false (Goodfellow et al., 2016). Figure 4.3 depicts the high-level structure of a perceptron, showing inputs, the body (where computation occurs) and the output.

Figure 4.3 High-level perceptron structure (Nielsen, 2015).

The weighted sum of the inputs used is ∑_j w_j x_j, where w_j is the weight and x_j the input for each of the j inputs. The result of this sum is then compared to the threshold value to determine the output value of 0 or 1. The equation below illustrates the overall working of the perceptron calculations:

    output = 0 if ∑_j w_j x_j ≤ threshold
    output = 1 if ∑_j w_j x_j > threshold

To control the functioning of a perceptron there are two main adjustments that can be made (assuming the same input set) to achieve a different output. Firstly, the weightings of the inputs can be adjusted to make certain inputs more significant and others less significant; with the same input set these changes can result in a change in output (Goodfellow et al., 2016). Secondly, the threshold value can be changed so that the output is affected differently by the weighted input sum. For example, lowering the threshold value makes an output of 1 (true) more likely, and raising the threshold value makes an output of 0 (false) more likely.
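The perceptron rule can be sketched directly in code. The NAND weights and threshold below are one well-known choice for demonstrating that a perceptron can compute a logical function; they are illustrative values, not taken from the text.

```python
# A perceptron: output 1 only when the weighted sum of the inputs
# exceeds the threshold, otherwise output 0.

def perceptron(inputs, weights, threshold):
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0

# With weights (-2, -2) and threshold -3, the perceptron computes NAND:
# only the input (1, 1) drives the weighted sum (-4) below the threshold.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, perceptron([x1, x2], [-2, -2], -3))
# (0,0)->1, (0,1)->1, (1,0)->1, (1,1)->0
```

Because NAND is a universal logic gate, this small example hints at why networks of perceptrons can, in principle, compute any logical function.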

A single perceptron is able to provide an answer to a very simple question. However, the true power of perceptrons lies in their use in a network made up of multiple perceptrons in multiple layers. A layer of perceptrons in a neural network comprises all perceptrons at the same depth from the initial inputs. In Figure 4.2 the first column of neurons from the left would be considered the first layer, the second column the second layer, and so on (Goodfellow et al., 2016). In a layered perceptron network, perceptrons of a particular layer make use of the outputs from perceptrons in the previous layer as their inputs. This allows a more complex problem to be broken down into smaller parts which then each affect the end result. Adjusting the weights and thresholds of one layer can vastly affect the end output of the neural network (Goodfellow et al., 2016).

Another concept, known as bias, was introduced to simplify the weighted input equation and make vector calculation possible. Bias can be seen as a real value that indicates how easily a perceptron will output 1 or 0. By making use of vectors and bias, the equation for a perceptron becomes (Nielsen, 2015):

    output = 0 if w · x + b ≤ 0
    output = 1 if w · x + b > 0

This makes use of the dot product of the weight and input vectors and the bias, b, simplifying the equation and making it easier to control how likely the neuron is to fire. Besides being used to weigh up input information to come to a conclusion, perceptrons can also be used to perform basic logical functions such as AND, OR and NAND. This can be achieved by altering the weights and threshold values of the perceptrons. Perceptrons can therefore be viewed as computational units that can be used to perform base logical operations. As previously stated, it is also apparent that small adjustments to the weights and thresholds can cause completely different outputs.
This is not desirable, as learning problems often require that slight changes to these variables induce only a small change in the output (Goodfellow et al., 2016). For this reason a new type of artificial neuron was developed that responds less extremely to small changes in weight and threshold. This newer artificial neuron is called the 'sigmoid neuron' and is described in the next section.

4.2.2 Sigmoid neurons

To learn effectively, in the case of a neural network, an algorithm must exist that can slightly adjust the input weights and biases of neurons in the network so as to fine-tune the network and teach it to better solve the given problem. These small changes to the weights and bias should not, however, result in dramatic changes in the output of the neuron. This is where perceptrons fell short, as they reacted too extremely to changes in the weight values.

Sigmoid neurons were created to solve this issue with perceptrons, and as a result make for neural networks better adapted to learning. Just as with perceptrons, sigmoid neurons take in a number of inputs on which some computation is performed to create an output. Sigmoid neurons are similar in structure to perceptrons and, therefore, contain weights and a bias. There are, however, some differences between the two types of artificial neuron (Aggarwal, 2018). The three main differences between sigmoid neurons and perceptrons are as follows (Goodfellow et al., 2016):

• Inputs are not restricted to 0 and 1, but can be real values between 0 and 1.
• The calculation performed by a sigmoid neuron is different to that of a perceptron (this will be detailed shortly).
• The output of a sigmoid neuron is not restricted to 0 and 1, but can be a real value between 0 and 1.

The output of a sigmoid neuron differs from that of a perceptron, where a perceptron's output is calculated using:

    output = 0 if w · x + b ≤ 0
    output = 1 if w · x + b > 0

A sigmoid neuron's output is instead calculated as σ(w · x + b), where σ, called the sigmoid function, is defined as:

    σ(z) = 1 / (1 + e^(−z)),  with z = w · x + b = ∑_j w_j x_j + b

This makes apparent where the differences between the two artificial neurons lie, with the sigmoid function causing the sigmoid neuron to act very differently to the perceptron. The difference in behaviour is not obvious when z = w · x + b is a large positive or a large negative number: a large positive z results in σ(z) being approximately 1, and a very negative z results in σ(z) being approximately 0, which is very similar to how the perceptron functions. Only when z lies in the range between these extremes is the difference apparent, with σ(z) being a real number between 0 and 1 (Goodfellow et al., 2016). This can be visualised on a graph showing the shape of the sigmoid function, as seen in Figure 4.4.
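The saturating behaviour just described can be checked with a few lines of code; the sample z values are arbitrary illustrative choices.

```python
# The sigmoid function: saturates near 0 and 1 for very negative and very
# positive z, and takes smooth intermediate values in between.
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

print(round(sigmoid(-10), 4))  # 0.0  -- like a perceptron outputting 0
print(round(sigmoid(0), 4))    # 0.5  -- where the two neurons differ most
print(round(sigmoid(10), 4))   # 1.0  -- like a perceptron outputting 1
```

At the extremes the sigmoid neuron mimics the perceptron's hard 0/1 decision; in between it provides the graded output that makes gradual learning possible.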


Figure 4.4 Sigmoid function shape, showing how the z value affects the output value for the sigmoid neuron (Nielsen, 2015).

This is in contrast to the shape of the perceptron's function, which is a step function, as seen in Figure 4.5. The smoothing of the perceptron's step function into the sigmoid function is what allows sigmoid neurons to learn better than perceptrons. The result of this smoothing is that incremental changes to the weights and bias result in only a small change in the output. This is conceptually represented in Figure 4.6.


Figure 4.5 Perceptron function shape; a step function (Nielsen, 2015).

Figure 4.6 The sigmoid function’s smoothing of the step function allows for small changes in weight to translate into small changes in the output (Nielsen, 2015).
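The smoothness depicted in Figure 4.6 can be stated more precisely: because the sigmoid is a smooth function, a small change in each weight and in the bias produces a small, approximately linear change in the output. This is the standard first-order relationship given by Nielsen (2015):

```latex
\Delta\,\text{output} \;\approx\; \sum_{j} \frac{\partial\,\text{output}}{\partial w_j}\,\Delta w_j
\;+\; \frac{\partial\,\text{output}}{\partial b}\,\Delta b
```

No such linear approximation exists at the perceptron's step, where an arbitrarily small change in w or b can flip the output from 0 to 1.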

One consideration that must be made when using sigmoid neurons is how to interpret the output results. Take, for instance, the problem of recognising a particular digit using a neuron: with a perceptron, the output will be either 0 or 1, representing false or true respectively. A perceptron therefore gives a definitive yes or no answer to the problem (whether the answer is correct or not) (Aggarwal, 2018). In the same example application, when making use of a sigmoid neuron, the output value of the sigmoid function could be any real number between 0 and 1. So how is an output of, say, 0.68 interpreted with regard to the problem of recognising the digit? With real-numbered results, a convention is needed that determines whether an output value should effectively count as true or false. This can be as simple as a convention stating that an output of 0.6 or higher indicates that the particular digit was detected, and anything lower indicates that it was not.

With continued research and ever-evolving architectures, it can become very difficult to keep track of the various types of neural networks used in modern ML, how they differ from each other and what acronym is used to refer to each of them. The architectures span from simple perceptrons (P), as discussed in section 4.2.1, to complex neural networks such as deep convolutional neural networks (DCNN). Van Veen & Leijnen (2019) have composed a collection of the modern forms of neural networks, with short explanations of each, in their "Neural Network Zoo", which is illustrated in Figure 4.7.

4.3 Deep learning

Deep learning, as a field, has gone through many transformations and iterations over its history to become what it is today. This includes going by various names during different time periods, which is why it appears to be a new field.
However, it has been around since the 1940s, and has only been known by the name 'deep learning' for just over a decade, starting in 2006 (Goodfellow et al., 2016). Previously, the study of what is today deep learning was known as cybernetics, connectionism and neural networks. However, the focus of the field has always been the same: to understand how human intelligence works and to apply these principles to machines, so as to create programs capable of solving problems that current AI is otherwise unable to solve (Nielsen, 2015). Deep learning, however, takes this a bit further, focusing not so much on the neuroscience that previous iterations did, but rather on the use of multiple levels of composition for learning. These deep learning techniques may even be applied in frameworks that do not draw inspiration from neuroscience.

The resurgence in the popularity of deep learning came about due to the work of Geoffrey Hinton, who, in 2006, made use of a deep belief network (a type of neural network) that allowed for improved generalisation through a greedy layer-wise pretraining approach for unsupervised learning. This enabled researchers to make use of deeper neural networks than was possible before, and ultimately led to the name 'deep learning' becoming popular.


Figure 4.7 The Neural Network Zoo, representing the progression and various architectures of modern neural networks (Van Veen & Leijnen, 2019).


There were a few factors that contributed to the resurgence of deep learning in the mid-to-late 2000s, which include:

• increasing dataset sizes;
• increasing model sizes; and
• increasing accuracy and impact.

To understand how each of these contributed to deep learning as we know it today, each will be discussed below.

4.3.1 Increasing dataset size

One of the most important aspects of building a successful ML algorithm is the dataset used to train the model. Without the input training data, the algorithm cannot learn how to translate inputs into outputs and ultimately perform its given task (Nielsen, 2015). When early ML and deep learning work was being conducted in the 1950s, experts in the field were required to hand-craft datasets for the algorithms to make use of. At the time, this was an art in itself and could take months or even years to complete. In later decades, however, with computer systems being used worldwide and digital information creation and storage having become more commonplace, it became easier for researchers to collect information in a standardised manner and manipulate it as needed (Foote, 2019; Goodfellow et al., 2016). In doing so, the time needed to create training datasets was drastically cut down, and the task became easier and easier as the proliferation of computer systems and the Internet continued. Standard benchmark datasets were developed over the decades and have grown in size, quality and subject coverage, making it easy for researchers to build ML and deep learning models leveraging these existing, well-known datasets (Nielsen, 2015).

The increase in available training data was a big factor; however, the evolution of learning models also played a large role. The increase in the size of learning models, and its effect, is discussed next.

4.3.2 Increasing model size

Comparable to the limited availability of datasets in the early days of ML was the lack of computational resources available to carry out the training of models and the processing of datasets (Nielsen, 2015; Heller, 2020).
As computational power, network speeds and storage capability grew in computer systems, it allowed for neural networks to be made that were more complex than before, contained more neurons and had greater numbers of connections between neurons. In studying animal and human brains and their associated intelligence, it is apparent that the perceived intelligence of an entity relates strongly to the number of neurons that its brain has and how well connected the neurons are (Goodfellow et al., 2016).

Until the 2000s, neural networks still had a relatively small number of neurons, due to the hardware needed to effectively run such programs. But with increasing CPU power and the availability of GPUs for general-purpose use, the number of neurons that a neural network could effectively have exploded in size, allowing more complex and intelligent models to be made (Foote, 2019; Nielsen, 2015).

The last contributing factor to the rise in popularity of deep learning methods in recent times relates to the improved accuracy of models, and their use in a wider range of real-world applications. This is explained in the following section.

4.3.3 Increased accuracy and impact

As research in the various fields of ML improved and gained more attention, so did deep learning, with researchers improving the ability of models to handle greater volumes of higher-quality data. Learning techniques also began to be applied to more and more real-world problems that held real value to businesses and individuals alike (Foote, 2019; Nielsen, 2015). With successful applications in image recognition and speech recognition, more effort was put into improving these methods even further, leading to algorithms capable of performing complex recognition or translation tasks. At the same time, technology companies began to invest in deep learning to build products and services that they could offer, or to improve upon what they already had. These include companies such as Google, Facebook, Amazon, Apple, Microsoft and many more (Goodfellow et al., 2016; Heller, 2020). Due to this increased popularity, software suites were created to make the task of designing, building and testing learning algorithms easier, improving the accessibility of the technology and leading to greater uptake. Well-known libraries include TensorFlow, Pylearn2, MXNet and more (Heller, 2020).
The evolution of the field of deep learning over the past 60 years has been extraordinary, aided by breakthroughs in science, statistics, hardware manufacturing and networking to become one of the most in-demand technologies in the business and academic worlds. The applications are endless and can help greatly to improve our understanding of the world, simplify difficult tasks, and ultimately change the manner in which many research fields operate.

4.4 Extreme learning machines

In section 4.1, machine learning was discussed in detail, describing where it began, what types of ML approaches exist and how they are used in the modern world. Further attention was given to neural networks, a specific type of ML approach created in an attempt to mimic how the human brain learns (or at least how it is thought to learn).

As with any popular technology, advancements are constantly occurring, and neural networks are no exception. Comparing early NN architectures such as perceptrons to more recent architectures such as convolutional neural networks (CNNs), a lot has changed and improved to enhance the learning capabilities

of NNs. The Neural Network Zoo (Van Veen & Leijnen, 2019), as discussed earlier in this chapter, provides good, broad coverage of the known NN architecture types.

The different types of NN architectures were created with the intent of improving learning capabilities, improving applicability to a problem domain, or accommodating a new theory of how the human brain is structured to learn. One of the more recent NN architectures, which has received a lot of attention and enjoyed success in the tasks of classification, regression, feature learning, clustering, compression and sparse approximation, is a special type of FFNN known as the 'extreme learning machine' (ELM) (Huang, Zhu & Siew, 2006).

Before covering further detail regarding ELMs, their architecture and their applications, it is important to understand FFNNs. This will highlight the differences between ELMs and other FFNNs.

Figure 4.8 General architecture of feed-forward neural networks (Gupta, 2017).

The remainder of Chapter 4 will first discuss FFNNs before going on to detail ELM. Thereafter, the chapter will conclude by discussing the relevance of ELM to the research questions stated in Chapter 1 of this thesis.

4.4.1 Feed-forward neural networks

Feed-forward neural networks are a type of NN in which there are layers of 'hidden' nodes between the input and output layers. Hidden layers are only connected to other hidden layers, input layers or output layers, having no direct connection to the outside world, hence the name 'hidden'.

In FFNNs, all the nodes in a given layer are fully connected to all nodes in the next layer, constantly feeding information forward in the NN, hence the name. They are supervised NNs in which pairs of inputs and outputs are provided to the network in order for it to learn how

the inputs and outputs are related (Abhigna et al., 2017). Figure 4.8 depicts the basic architecture of FFNNs, indicating input, hidden and output layers.

FFNNs are among the most popular NNs in use, with applications in classification, clustering, regression and more. This type of NN is often viewed as one of the simplest in existence and is often regarded as the first type of NN created. Over the years, adjustments to the types of nodes, the number of layers, the activation functions, and the learning techniques employed with FFNNs have kept them relevant and of practical use for solving problems in the real world (Kumar & Suresh, 2016).

Figure 4.9 Single-hidden layer feed-forward neural network (Kumar & Suresh, 2016).


Figure 4.10 Multiple-hidden layer feed-forward neural network (Ertuğrul, Tekin & Kaya, 2017).

When considering FFNNs, they can generally be divided into two classes (Ertuğrul, Tekin & Kaya, 2017):

• single-hidden layer feed-forward neural networks (SLFN), and
• multiple-hidden layer feed-forward neural networks (MLFN).

The main difference between the two classes of FFNN is simply the number of hidden layers, with a single hidden layer in SLFNs and multiple hidden layers in MLFNs, as compared in Figure 4.9 and Figure 4.10. The impact of the number of hidden layers depends on the type of nodes they consist of and on the activation functions, but can be simplified as follows: MLFNs allow NNs to handle larger datasets and deal with discontinuity better; however, the addition of extra hidden layers slows down the training process and may cause the NN to settle on local minima (Benny & Soumya, 2015).

The learning algorithm generally employed by FFNNs is known as back-propagation. In this learning algorithm, the error of the network is calculated by comparing the expected output to the predicted value, and this error is then propagated back into the network one layer at a time. As the error is back-propagated, the weights of each node are updated in accordance with their contribution to the error (Gupta, 2017).

The back-propagation process is then repeated for each of the training set examples provided; one complete pass through the examples is referred to as an epoch. An FFNN may be trained for

tens or even thousands of epochs, with the goal of reducing the overall error of the network as much as possible (Kumar, 2019).
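The forward pass, error calculation and layer-by-layer weight updates described above can be sketched for a single-hidden-layer network. The layer sizes, learning rate, epoch count and dataset below are arbitrary illustrative choices, and the derivative expressions assume sigmoid activations with a mean-squared-error loss.

```python
# A compact back-propagation sketch: a single-hidden-layer network trained
# for many epochs on a tiny invented dataset.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))              # 50 examples, 3 input features
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0  # invented binary targets

sigmoid = lambda z: 1 / (1 + np.exp(-z))

W1 = rng.standard_normal((3, 4)) * 0.5  # input -> hidden weights
b1 = np.zeros(4)
W2 = rng.standard_normal((4, 1)) * 0.5  # hidden -> output weights
b2 = np.zeros(1)
lr = 0.5

for epoch in range(200):  # each pass over the whole training set is one epoch
    # Forward pass: information flows from inputs through the hidden layer.
    h = sigmoid(X @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)
    loss = float(np.mean((y - y_hat) ** 2))
    if epoch == 0:
        first_loss = loss

    # Backward pass: the output error is propagated back one layer at a
    # time, and each weight is updated according to its contribution.
    d_out = (y_hat - y) * y_hat * (1 - y_hat)
    d_hid = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out) / len(X)
    b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * (X.T @ d_hid) / len(X)
    b1 -= lr * d_hid.mean(axis=0)

print(first_loss > loss)  # the error shrinks over the epochs
```

Note that even though the error travels backwards during training, prediction itself (the forward pass) still flows in a single direction.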

It is important to discern the difference between back-propagation and feed-back loops within an NN. Back-propagation is simply a method used in training an NN, whereby the weightings of nodes in the network are updated to improve performance. In predictive use, data still flows in a single direction: forward (Kumar, 2019).

Feed-back loops in an NN are an architectural difference in data use in the network, whereby nodes in a given layer may feed their resultant output back to themselves as well as forward. This results in the nodes receiving input from the previous layer for the current time step as well as from themselves from the previous time step. This type of architecture is seen in all different types of recursive (recurrent) neural networks (RNNs) (Van Veen & Leijnen, 2019).
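The distinction can be illustrated with a minimal sketch of a feed-back loop (the layer sizes and the tanh activation below are assumptions for illustration): at each time step, the hidden nodes receive the current input together with their own output from the previous time step.

```python
import numpy as np

# A minimal sketch of a feed-back loop: the hidden state h from the
# previous time step is fed back in alongside the current input x.
rng = np.random.default_rng(1)
W_in = rng.normal(size=(3, 4))   # input -> hidden weights
W_rec = rng.normal(size=(4, 4))  # hidden -> hidden (the feed-back loop)

h = np.zeros(4)                  # hidden output from the "previous" step
for x in rng.normal(size=(5, 3)):        # a sequence of 5 input vectors
    h = np.tanh(x @ W_in + h @ W_rec)    # previous h feeds back in

print(h.shape)  # the hidden state retains its shape at every step
```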

FFNNs have been shown to be powerful yet simple neural networks which can be applied to many common predictive problems in the modern world. They do, however, possess shortcomings, and this led to the development of a new class of SLFN, the extreme learning machine.

The following section discusses ELM in further detail, highlighting how it differs from regular SLFNs and what advantages it provides.

4.4.2 Extreme learning machine

FFNNs are versatile supervised learning networks capable of solving an array of problems. Their simplicity is one of their strengths, relying on activation functions and learning techniques to optimise their output instead of complex architecture.

Single-hidden layer feed-forward neural networks are a class of FFNN where only a single hidden layer exists between the input and output layers. This is a commonly used architecture, as it has been shown that simply increasing the number of nodes in the layer boosts performance over the addition of more hidden layers, and that additional layers should only be added if the dataset is very large or if the error remains too high (Svozil, Kvasnicka & Pospichal, 1997).

When building SLFNs, the parameters of the FFNN, including weights and biases, need to be tuned for the gradient-descent-based learning approach they employ. Gradient descent approaches have, however, been shown to be much slower and have a tendency to converge on local minima. To improve on these outcomes, large numbers of training iterations may be required, which also slows down the development of the end solution (Benny & Soumya, 2015).

Huang, Zhu and Siew (2006) discovered that by making use of randomly chosen input weights and biases, SLFNs could be produced that performed well in learning observations from data, and did so much faster than previous SLFNs could. They named this new type of SLFN ‘extreme learning machine’ due to its extreme speed, ability to minimise training error and good generalisation performance.

ELM makes use of the fact that, as long as the activation functions used in hidden layers are infinitely differentiable, the input weights and hidden layer biases can be assigned randomly. The resultant learning algorithm outperforms the traditional back-propagation method used in the majority of FFNNs in both training time and minimising error (Sönmez et al., 2018).

For an SLFN employing ELM, a network in which the hidden layer contains N hidden nodes is capable of learning N distinct observations exactly. The result is that ELM implementations may require more hidden nodes than networks making use of back-propagation (Zhang, Zhao & Wang, 2015).
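The ELM training scheme described above can be sketched as follows. The toy regression task, the number of hidden nodes and the tanh activation are illustrative assumptions; the essential point is that only the output weights are computed, here via the Moore-Penrose pseudoinverse, with no back-propagation involved:

```python
import numpy as np

# A sketch of ELM training: input weights and hidden biases are assigned
# randomly, and only the output weights (beta) are solved for.
rng = np.random.default_rng(42)

def elm_train(X, y, n_hidden=50):
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # hidden layer outputs
    beta = np.linalg.pinv(H) @ y                 # pseudoinverse solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression task (an assumption for illustration): learn sin(x).
X = np.linspace(0, np.pi, 200).reshape(-1, 1)
y = np.sin(X).ravel()
W, b, beta = elm_train(X, y)
err = np.max(np.abs(elm_predict(X, W, b, beta) - y))
print(err)  # the training error is small despite no iterative training
```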

ELM is a relatively new learning approach in the world of neural networks, with further research being carried out yearly. Huang and Chen (2008) created an incremental version of ELM (I-ELM) in which random nodes were added to the hidden layer incrementally, locking the weights of existing nodes when new nodes are added. This allows for greater flexibility in creating SLFNs in which the architecture is not predefined as in ELM and helps to minimise error.

Zhang, Zhao and Wang (2015) proposed a new variant of ELM in which normal equations are used to reduce the size of the SVD matrices used by the learning algorithm. The results, D-ELM and CG-ELM, are able to handle much larger data sets than ELM due to the reduction in matrix size.

Li and Zhang (2016) created fast sequential ELM (FS-ELM) in order to improve on the speed performance of existing sequential ELM implementations and online sequential ELM. This is achieved by superimposing part of the training data as an orthogonal matrix while calculating the random weights and biases.

Further extensions have been made to ELM in recent years, such as incorporating swarm modelling into the randomisation of weights (DECABC-ELM), extending ELM for use in MLFN (X-ELM), and introducing a weighted version of ELM in which instance distribution is focused on (WELM) (Jiaramaneepinit & Nuthong, 2018; Mao, Li & Moa, 2018; Yu et al., 2019).

4.5 Conclusion

In Chapter 4, the field of machine learning was discussed, providing a brief background and history of the field and how it evolved. The process of machine learning was defined; the components of an ML problem were described; and different types of ML algorithm tasks were covered, along with examples of their application.

Next, deep learning, a sub-field of ML, was covered. The history of deep learning was presented and the main contributing factors to its resurgence were listed and discussed, showing that a combination of technological and research improvements led to its current ability and popularity.

Lastly, a more detailed description of a type of neural network, the feed-forward neural network, was given. The architecture of FFNNs was detailed, along with the learning techniques employed and the classes of FFNN: single-hidden layer feed-forward neural network (SLFN) and multiple-hidden layer feed-forward neural network (MLFN).

The difference between SLFN and MLFN was discussed, along with their strengths and considerations that need to be made when choosing either. A specialised version of SLFN designed to be more performant while still maintaining accuracy, the extreme learning machine, was detailed. The ELM learning algorithm was contrasted with the traditional FFNN back-propagation learning algorithm.

Further developments regarding ELM were discussed, citing particular discoveries made in recent years.

The information presented in this chapter is of relevance to the following secondary research question:

SRQ4: Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?

SRQ5: Can ML, in particular ELM, be used to predict and drive changes in a digital environment to accurately reflect its evolution?

The chapter has clearly demonstrated the ability and adaptability of modern machine learning techniques. Given training datasets of sufficient size that contain sufficient features, learning algorithms are capable of performing a wide array of tasks that are extremely difficult for humans to carry out, such as optimisation and prediction. Learning algorithms such as deep learning are also becoming more proficient at tasks that humans find easy but that are difficult to define rules for, such as object recognition.

ML approaches can be used to predict the evolutionary change of a digital environment based on existing historical data relating to the environment's components and actors. ML approaches are capable of predicting possible future events through the use of statistical models that find patterns in data which are normally difficult or impossible for humans to identify. The greater the quality and quantity of data used by the ML approach, the more accurate it can be in predicting a future outcome. SRQ4 is therefore addressed sufficiently with regard to machine learning.

FFNN architectures have been shown to be versatile and accurate methods of supervising an ML implementation to predict future values. FFNNs do, however, suffer from slow training times and problems finding global minima in solution spaces. ELM was created to address these problems, creating an effective and efficient NN. Applying ELM to predicting changes in a digital environment is therefore intuitive due to its large improvement in performance over traditional FFNNs. Further detail regarding the use of ML to predict and drive change in a digital environment will be covered in chapters 10 to 12. This chapter does show the viability of using ELM as the ML approach over older NN architectures, addressing SRQ5.


5 Programming paradigms in software engineering

In the field of Software Engineering, many different paradigms have been developed over time, each fit for addressing problems of a specific type or for realising solutions at a particular performance or scale. The lessons learned have led to design principles that are seen as common practice for most situations. However, particular scenarios require an approach that breaks 'the rules'.

Chapter 5 will examine programming paradigms, exploring what a paradigm consists of, describing some of the main paradigms used in modern software engineering, outlining their strengths and uses, and identifying the scenarios for which each could be the best fit.

Firstly, programming paradigms will be defined along with their relationship to programming languages.

5.1 Programming paradigm

When considering developing software to solve a given problem, often the programming language used is one of the first details decided upon. Different programming languages have different strengths and intricacies that make them more or less suited for certain types of problems. However, what is not usually appreciated is that it is programming concepts that are the important part of solving a problem, and programming languages are ultimately defined by the concepts they make use of or allow (Abelson & Sussman, 1985).

A paradigm refers to a way of doing something, but is not an indication of a concrete thing. In relation to programming then, a programming paradigm is a way of programming and not a language itself (Gabrielli & Martini, 2010). A paradigm is defined by the concepts that it makes use of, each of which addresses a certain aspect of solving a problem through code, whether it be the structure of entities, how different entities relate to each other, or how data is represented, managed and made use of (Krishnamurthi & Fisler, 2019; Van Roy, 2009). Based on the concepts that form a given paradigm, the paradigm may be considered the best option for solving a specific problem.


Figure 5.1 Relationship between programming languages, paradigms and concepts (Van Roy, 2009).

The relationship between paradigm and language stems from the need for a language to implement the programmatic solving of a problem; a language must therefore make use of one or more paradigms in doing so. When considering paradigms, it is important to understand that they are defined by the concepts they encapsulate (Van Roy, 2009). Figure 5.1 above illustrates the relationship between programming languages, paradigms and concepts, making it clear that, in the end, programming languages are the realisation of a set of concepts, shaping what the language is capable of and best suited for.

Most programming languages have been built to accommodate more than one paradigm (generally two, sometimes three, but rarely more), which gives them greater flexibility in use, as varying programming problems require different concepts to solve. Making a language capable of supporting more than a single paradigm therefore makes sense. Every programming language, however, will have a main paradigm that it is built around, and this is evident when looking at the mainstream languages in wide use in industry (Van Roy, 2009). Ideally these languages should be designed to support even more paradigms, as the more concepts that can be incorporated into a language, the more programming problems can be addressed without the need to make use of another language. However, there could be more difficulty in maintaining these multi-paradigm code bases compared to single-paradigm code bases (Krishnamurthi & Fisler, 2019; Tucker & Noonan, 2007).


Figure 5.2 A taxonomy of programming paradigms, grouped by key properties, and showing concepts that differ from one to another (Van Roy, 2009).

Programming paradigms are defined by the concepts they incorporate, and due to this, many of the different paradigm types are not too far removed from one another, with the difference sometimes being the addition or removal of a single concept. Figure 5.2 depicts a taxonomy of programming paradigms, illustrating the concepts that are added to make other paradigms and grouping them based on properties that they share. This by no means covers paradigms, concepts or languages in an exhaustive or comprehensive manner; it merely serves to give a broad overview of the paradigm landscape and to illustrate how paradigms that on the surface appear to be extremely different are only one concept apart (Tucker & Noonan, 2007). Often within a family of paradigms the only difference is the addition of a new concept that was included to address a particular problem, and it is ultimately the concepts that define the paradigm. However, there are key properties of paradigms that set them apart from others. The two key properties are (Abelson & Sussman, 1985):

 to what level they support state; and
 whether or not they possess observable nondeterminism.

These two properties greatly affect the type of problems the paradigm is suited for and, by deduction, the languages that can be used to program a solution. Below, each property is described individually, covering what implications they have and what considerations then need to be made.

5.1.1 Observable nondeterminism

When considering the determinism of a programming paradigm, this refers to whether or not the execution of a program is determined fully by its specification. Programs that, based on the specification, are not allowed to alter the execution through some type of choice are referred to as being deterministic. If the program, however, has the ability to choose what to do next during execution, through the part of the run-time system called the scheduler, then the program is referred to as being nondeterministic (Van Roy, 2009; Tucker & Noonan, 2007).

The observability of the determinism/nondeterminism depends on whether or not the end user is able to see that the execution of the program gives different results even though its configuration is the same; this is generally undesirable (Gabrielli & Martini, 2010).

It is therefore recommended that observable nondeterminism only be supported if it is explicitly needed for its expressive abilities.
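Observable nondeterminism can be illustrated with a small sketch (an assumed example): two threads write to a shared queue, and the interleaving of their results is decided by the run-time scheduler rather than by the program's specification, so it may differ between runs even though the configuration is identical.

```python
import threading
import queue

# Two threads push to a shared queue; the order in which their items
# interleave is chosen by the scheduler, not by the program text.
q = queue.Queue()

def worker(name, n):
    for _ in range(n):
        q.put(name)

threads = [threading.Thread(target=worker, args=(name, 3))
           for name in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

order = [q.get() for _ in range(6)]
print(order)  # the interleaving of "A"s and "B"s may differ per run
```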


5.1.2 State

The ability of a paradigm to support state is the other key property by which programming paradigms are distinguished and, depending on the level to which state is supported, indicates how expressive the paradigm can be (Gabrielli & Martini, 2010).

Maintaining state refers to whether sequences of values (or information, if considering this at a high level) can be stored in time. Paradigms that allow for maintaining state give their languages the ability to store information and manipulate it or reuse it at a later point in time (Abelson & Sussman, 1985; Krishnamurthi & Fisler, 2019).

Measuring the expressiveness of a paradigm can be done using three axes, each of which relates to a different aspect of expressiveness: is the state named or unnamed, is it sequential or concurrent, and is it deterministic or nondeterministic (Van Roy, 2009)? Depending on the combinations of these expressive aspects, different uses for the representative paradigms are available. Figure 5.3 illustrates some of the main combinations of the expressivity aspects and the given paradigm families they represent, arranging them from least to most expressive vertically.

Figure 5.3 Levels of support for state with main paradigm families arranged by expressiveness (Van Roy, 2009).

It is apparent from these observations of programming languages, paradigms and concepts that the most important part of the whole equation is the programming concepts themselves. What are these concepts, and how do they change the manner in which programming problems can be tackled? When choosing a paradigm to make use of, it is important to pick one that has the right number of concepts, as having too many concepts makes reasoning complicated, while having too few makes the programs themselves complicated (Tucker & Noonan, 2007).


The following section will examine programming concepts more closely, describing what they are, what the most important concepts are and how they affect problem solving and programming.

5.2 Programming concepts

Programming concepts, as stated in the previous section, are the most important part of the programming language/paradigm/concept landscape. Concepts are the building blocks of paradigms and languages, and determine what these are capable of.

There are numerous programming concepts that, when combined in different combinations, result in the paradigms and languages that implement them being able to solve various types of problems (Gabrielli & Martini, 2010; Krishnamurthi & Fisler, 2019). Some programming concept combinations can be extremely useful, while others are not well suited to each other. The creation of powerful paradigms is dependent on making use of concepts that are not only useful individually but complement each other.

Even though there are many different concepts that can be made use of, there are four main concepts that ultimately determine the utility of the programming paradigm and languages (Van Roy, 2009). The four main concepts are:

 independence,
 records,
 lexically scoped closures, and
 named state.

Each of the above will be defined in further detail below, covering what the concept is, what implications it has and what problems it is best suited for.

The first concept that will be discussed is independence.

5.2.1 Independence

The concept of independence relates to the ability to design and build a program in such a way that there are a number of independent parts to it (Gabrielli & Martini, 2010). Each of the independent parts of the program can then execute individually and without necessarily waiting for other parts to finish first.

This concept, also known as concurrency, is the opposite of sequential execution, where various instructions are executed in sequence to achieve a result and each must wait for the previous instruction to finish before starting (Abelson & Sussman, 1985). Ultimately, sequential programs are ordered in time, whereas concurrent programs are not, and the various parts of the program do not interact; the parts are independent of one another.


Often the concepts of parallelism and concurrency are confused. However, they are not the same (Tucker & Noonan, 2007):

 With parallelism, parts of a program are run simultaneously by making use of multiple physical processors. Parallelism is therefore seen as a hardware concept.
 With concurrency, parts of a program can run independently of one another without following a given order; these parts do not necessarily run at the same time. Concurrency is a software concept.
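The software nature of concurrency can be illustrated with a brief sketch (an assumed example using Python's asyncio): two tasks interleave their steps on a single thread, running independently of one another without ever running simultaneously.

```python
import asyncio

# Two tasks interleave on a single thread: concurrency without
# parallelism. Each await hands control back to the scheduler.
log = []

async def task(name):
    for i in range(2):
        log.append(f"{name}{i}")
        await asyncio.sleep(0)   # yield control to the other task

async def main():
    await asyncio.gather(task("A"), task("B"))

asyncio.run(main())
print(log)  # steps of A and B interleave on one thread
```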

When considering concurrency in the computing world, there are three levels of concurrency that occur (Gabrielli & Martini, 2010):

 distributed system,
 operating system, and
 activities inside one process.

With distributed systems, consider each computer to be its own independent part of a program, able to execute in any order in relation to the other computers (parts of the program) in a computing network. This is a commonly found setup to which all networks conform, meaning that the Internet as a whole is structured in this way too. Each computer is able to act independently of the other computers in the network. If this were not the case, one can imagine how complicated the scheduling in large networks would be, and how much time would be wasted (Van Roy, 2009).

Operating system concurrency comes in the form of processes that the operating system manages. Each process is independent of other processes and is normally related to a specific application that is running on the operating system at that point in time. The main area of independence for processes is their memory; each process has its own memory. The operating system is responsible for scheduling the execution of processes and the allocation and use of their memory; even though the physical memory of the computer is shared, the logical memory assignment to processes is independent as they do not share this with other processes (Krishnamurthi & Fisler, 2019; Tucker & Noonan, 2007).

Within a process there is another level of concurrency at play: the concurrency of various activities within a single given process. At this next level down, each process can have multiple activities that need to be executed. These activities are independent of each other in their execution and are generally referred to as threads. Threads have shared memory, as they make use of the same memory space as one another, the memory allocation being for the process (Gabrielli & Martini, 2010). Therefore, on a thread level, concurrency is purely in terms of execution. The main differentiator between processes and threads is how the resource allocation in the system is carried out: processes make use of a type of concurrency known as competitive concurrency, whereas threads make use of cooperative concurrency (Abelson & Sussman, 1985). The difference between the two is that processes compete with one another for resources to execute their application and will try to consume as much of the available resources as possible, while threads collaborate with each other, as they form part of the same application, and therefore share resources. Due to the greedy nature of processes, it is the operating system's job to allocate resources in a balanced and fair manner so that all applications that are running get a chance to execute instructions.

When it comes to considering paradigms based on concurrency, there are two main types used (Tucker & Noonan, 2007):

 Shared-state concurrency: in this form of concurrency, control structures, called monitors, are used to manage concurrent access by threads to shared memory or data.
 Message-passing concurrency: in this form of concurrency, messages are sent between multiple agents, each of which runs in its own thread. The messaging can be handled synchronously or asynchronously.

Shared-state concurrency paradigms are some of the most popular and widely adopted paradigms, and are implemented in widely used languages such as C# and Java. Message-passing concurrency paradigms are not nearly as mainstream, but are implemented in languages such as Erlang, which are of high value in specific industries.
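A minimal sketch of message-passing concurrency follows (an illustrative construction, not tied to Erlang or any particular monitor implementation): an agent runs in its own thread and communicates with the main program only through messages placed on queues, never through directly shared state.

```python
import threading
import queue

# An agent thread that communicates purely via messages on queues.
inbox = queue.Queue()
outbox = queue.Queue()

def doubler_agent():
    while True:
        msg = inbox.get()
        if msg is None:          # a conventional stop message
            break
        outbox.put(msg * 2)      # reply by sending a message back

agent = threading.Thread(target=doubler_agent)
agent.start()
for n in (1, 2, 3):
    inbox.put(n)                 # asynchronous sends
inbox.put(None)
agent.join()

results = [outbox.get() for _ in range(3)]
print(results)  # [2, 4, 6]
```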

Concurrency can be applied to computers, programs, processes or threads, specifying the independence each of these has in executing separately from others of its kind and what memory or data can be accessed. The data that any program has access to is very important to its execution, and the ability to store the data in a form that can be easily manipulated makes solving programming problems so much easier. Records provide the ability to store data, and this vital concept is explained in the next section.

5.2.2 Records

To manage data within programs, a construct is required in which references to each data item need to be maintained, allowing access to each of the data items in an indexed manner. The construct mentioned here is known as a data structure, and it comes in many forms, including arrays, lists, strings, etc. (Gabrielli & Martini, 2010).

Each of these data structures is created from the foundation building block known as a record. Records are a simple type of data structure that attributes data to an entry within a structure, where the record may have multiple different data items as part of it, which are referred to as members (Klein, 2016).


As mentioned above, data structures such as arrays, lists and even strings are the result of manipulating records in a structure, where each atomic part of the data structure is actually a record (i.e. each item in an array, or each character in a string).

The use of records and the subsequent data structures that are built from them have given programs the ability to solve problems that were difficult to tackle previously, creating simple ways to manipulate data and store it for later use in the execution of the program (Klein, 2016).

Almost all paradigms make use of the concept of records in some way or form. However, what really differentiates paradigms from a record point of view is the manner in which records are handled, what can be done with them and what other concepts they can be combined with (Krishnamurthi & Fisler, 2019; Van Roy, 2009).
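As a brief illustration (using Python's dataclass construct as an assumed example), a record attributes named data items, its members, to an entry in a structure, and larger data structures are then built from such records:

```python
from dataclasses import dataclass

# A simple record: a structure whose members are named data items.
@dataclass
class Point:
    x: float
    y: float

# Larger data structures are built from records: here, a list of them.
path = [Point(0.0, 0.0), Point(1.0, 2.0)]
print(path[1].y)  # 2.0
```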

A very important concept that, when combined with records, allows for programming in a component-based manner is that of lexically scoped closures, which is discussed next.

5.2.3 Lexically scoped closures

Closures are a very important concept in the world of programming, combining procedures with their external references (Gabrielli & Martini, 2010). Closures allow for the packaging of computational work at any point in the program so that it can then be transferred to another point and be executed there as if it had been executed where it was initially created.

This is an extremely powerful concept that, when combined with other concepts such as records (as mentioned in the previous section), allows for important capabilities for the programming language such as instantiation and genericity, as well as component-based programming.

Most modern programming languages make use of closures that can be assigned to a named state and used at a later point of execution in the program. This includes using closures in implementing the following constructs (Krishnamurthi & Fisler, 2019):

 procedures,
 functions,
 objects,
 classes, and
 software components.

For example, as mentioned previously, closures can be used to implement instantiation and genericity, which, in relation to object-oriented programming, allows for the definition of the closure to create a class, with a reference to the closure used at a later point to create an instance of the class, or an object. Many programming languages make use of closures in this manner in order to achieve functionality that is central to the operation of the language and programs built with it. However, most modern languages hide the intricacies of closures from the end user (Van Roy, 2009).

Component-based programming, which was mentioned in the records concept section, is a type of programming in which a program is broken down into smaller units, known as components. Each component can be seen as a function, which is instantiated through the use of a record and is referred to as a module. Some modules may be dependent on others, and a new module can be created by passing its dependent modules as parameters to a function.
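The instantiation capability described above can be sketched with a small (illustrative) closure: the outer function packages computational work together with its external references, and each call produces an independent instance that can be executed elsewhere.

```python
# A lexically scoped closure used for instantiation: each call to
# make_counter produces an independent "instance" with its own state.
def make_counter(start):
    count = start                # state captured by the closure

    def increment():
        nonlocal count           # refers to the enclosing scope
        count += 1
        return count

    return increment             # the closure can be executed elsewhere

c1 = make_counter(0)
c2 = make_counter(100)
print(c1(), c1(), c2())  # 1 2 101 - the two instances are independent
```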

Following on from the concepts of closures and records, it has been shown that it is important in programming languages to be able to store data inside a program that will be used in its execution and can ultimately change the results of the execution. This is abstracted into the concept of named state (which was briefly touched on earlier in this chapter), which is defined in the next section.

5.2.4 Named state

Previously, when discussing named state, it was described as a property that causes the execution of a program not to have the same result every time it is run. This is true at a high level, but it was not explained why this occurs (Gabrielli & Martini, 2010).

When considering a program that performs mathematical functions in sequence, every run of the program will give the same result (assuming the same starting point or value). Such programs, however, do not need to keep track of the program and how it changes over time (Van Roy, 2009). State is an abstract notion of time: it keeps track of a sequence of values in time using one name. This matters because systems have many inputs and data that change as execution of the program occurs, while other factors change too (clock cycles and random numbers) (Klein, 2016).

Named states are a way in which to keep track of the state of a program at any given point in time, and the change in the named state of a program alters the behaviour of the program as it progresses further in its execution (Klein, 2016). This relates to the concept of records, as the named state of a program needs to be stored in a record of some form so that it can be accessed and manipulated.

The use of the concept of named state allows for another important property of programs: modularity, whereby parts of a program or system can be changed without needing to change all of the other parts (Gabrielli & Martini, 2010; Krishnamurthi & Fisler, 2019). Without named state, modularity is not achievable, because individual modules would not be able to change their behaviour, and thus all other modules would need to be altered if a change were needed.


Modularity is a big advantage that comes with using named state. However, disadvantages can be introduced if named state is not used correctly. The key to avoiding this is not to give all parts of a program named state, but rather to limit it to specific parts, making it easier to define code sections that operate in a purely functional manner: only taking an input state and producing an output state (Klein, 2016). This configuration is known as a state transformer, and is shown in Figure 5.4 below.

Figure 5.4 The state transformer configuration for a program (Van Roy, 2009).
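A state transformer can be sketched as a purely functional section of code (the names below are assumptions for illustration): it reads no hidden named state, and simply maps an input state to an output state.

```python
# A state transformer: a purely functional mapping from an input state
# to an output state; the stateful part of the program is confined here.
def transformer(state):
    # No hidden named state is read or written; the input is not mutated.
    return {**state, "counter": state["counter"] + 1}

s0 = {"counter": 0}
s1 = transformer(s0)
s2 = transformer(s1)
print(s0["counter"], s2["counter"])  # 0 2 - the original state is unchanged
```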

The four main concepts covered in the previous sections lay the foundation for the majority of the main programming paradigms in use today. Some of these paradigm families cover an array of paradigms that modern software engineers use on a daily basis. But what is it that differentiates the paradigm families from one another? How does this affect the capabilities they provide? And which languages implement them? The following section will explore some of the main paradigm families, describing notable paradigms in use in modern programming.

5.3 Data abstraction

The ability to store and manipulate data within a program is a key need for solving programming problems, allowing for state to be maintained and for complex operations to be performed (Liskov, 1987). The way in which data is made use of is, therefore, a very interesting topic and shapes the constructs that can be used in the programming language, as well as the style of programming to be used. Data abstractions allow the programmer control over the data in their applications and provide flexibility in how the data can be used (Gabrielli & Martini, 2010).

Data abstractions are used to control how data is accessed and what operations are permissible on the data, and to simplify the programs in which they are used. A data abstraction is defined as a way of organising the use of data structures so as to guarantee their correct use, by means of a set of definitive rules (Krishnamurthi & Fisler, 2019; Liskov, 1987).


When considering the structure of a data abstraction there are three main components that are made use of:

 The inside: this is where the data structures are stored.
 The outside: this is exposed to the end user, and the inside is hidden from it.
 An interface: used to control operations on the data, and interfaces between the inside and outside.

Figure 5.5 shows the components of a data abstraction and how they are related.

Figure 5.5 Structure of a data abstraction (Van Roy, 2009).

The advantages of constructing data abstractions in this manner are many, but the main three are as follows. First, operations on the data structures will always work correctly, as the interface defines exactly what operations can be carried out on the data. Second, large programs become easier to develop and maintain, as different programmers can be made responsible for different parts of the program and only need to be concerned with maintaining their part and the interfaces it makes use of (Van Roy, 2009). Lastly, the program is simplified, as the code can be organised into multiple abstractions and interfaces, which can be implemented independently; users need not understand how an abstraction works, as they only make use of its interface.

Data abstractions can come in many forms and, depending on how they are designed, can result in very different behaviours. There are two main axes along which data abstractions can be placed, which determine certain properties of the abstractions. The resultant combinations define very specific data abstractions that are implemented in particular programming languages to achieve a task (Liskov, 1987). The two axes are:


 State: whether the data abstraction makes use of named state or not (stateful vs stateless).
 Bundling: within the abstraction, are data and operations combined into one entity or are they separate? When bundled together, the result is known as an object or procedural data abstraction (PDA). When kept separate, the result is known as an abstract data type (ADT).

Figure 5.6 illustrates how data abstractions plot on these two axes and what the resultant entities are, two of which are common in many modern mainstream languages: ADTs without named state (e.g. integers in C# or Java), and objects with named state (objects in languages such as C# or Java).

Figure 5.6 Data abstractions plotted against state and bundling (Van Roy, 2009).

ADTs with named state, and objects without named state (the latter also known as declarative objects), have their uses in other programming paradigms but are more specialised for particular styles of programming, where they can achieve results that the other data abstractions cannot.

Besides the types of data abstractions that exist based on the axes on which they align with regard to state and bundling, there are two other very important principles related to data abstraction that are widely used in programming paradigms and languages (Van Roy, 2009):

 inheritance, and
 polymorphism.

The use of these principles can create programming environments that are capable of highly complex configurations to map a problem to a solution in code. Both inheritance and polymorphism are generally strongly associated with object-oriented programming, but they are used in other paradigms with great success; object-oriented programming simply supports these principles so strongly that it is often the first association that comes to mind.

In the next section, inheritance will be described, defining what this principle is and what it allows programming languages to do, and illustrating why it is so powerful.

5.3.1 Inheritance and composition

In programming, the repetition of code is a great source of errors and a drain on time, because any change must be made in every location where the code is repeated. Data abstractions suffer from this same generic problem, especially because most data abstractions share many common properties (Liskov, 1987).

To solve this problem, it would be ideal to be able to define code, objects and data abstractions incrementally, where the parts common to each can be inherited from an existing entity instead of being recreated. Inheritance implements this by allowing definitions of entities to inherit properties from other existing definitions, where the incremental abstract definition is known as a class.

Inheritance has a major drawback, however, which can compound the very problem it is designed to solve: by extending one definition with another, an additional interface is created that also needs to be maintained. If inheritance is not limited, it can complicate a program and increase the chance of introducing new errors with each addition or change of code.

It is therefore recommended to use inheritance as sparingly as possible when designing a program, and, where it is required, to apply the substitution principle. The substitution principle states that when creating a class that inherits from another class, any procedure that works on objects of the parent class must also work on objects of the inheriting class. This boils down to a simple statement: inheritance should not break anything (Van Roy, 2009).
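The substitution principle can be sketched with a small hypothetical Python example (the class and function names are invented for illustration): a procedure written against the parent class must keep working when handed an instance of the child.

```python
class Shape:
    def area(self):
        raise NotImplementedError

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width, self.height = width, height

    def area(self):
        return self.width * self.height

def describe(shape):
    # written against Shape; must work unchanged for any subclass
    return f"area={shape.area()}"

print(describe(Rectangle(3, 4)))  # area=12
```

Any further subclass of Shape that honours the area() contract can be passed to describe() without modification, which is precisely the "inheritance should not break anything" requirement.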

If inheritance is to be limited as much as possible, then what other option is available to reduce code repetition and create modular, easy-to-maintain code? The answer lies in another principle that is often mentioned together with inheritance: composition, which provides another way to create relationships between different classes and objects.

Composition is an approach in which an attribute of one object refers to another object, so that the first object is composed with the second. A single object can be composed of multiple other objects, and because no inheritance takes place, no surplus interfaces are created that need to be managed (Liskov, 1987).


With composition, each object’s class can be maintained and changed independently of each other, as any procedures on an object that composes another will be handled by the root object’s interfaces. This is a much more desirable situation when it comes to maintaining code and reducing repetition. Figure 5.7 illustrates the high-level difference between inheritance and composition.

Figure 5.7 Inheritance and composition from a high-level principle view (Van Roy, 2009).
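The same difference can be sketched in a hypothetical Python example (the class names are illustrative only): the inherited car takes on the parent's interface as well as its behaviour, while the composed car simply holds a reference to another object.

```python
# Inheritance: CarByInheritance IS-A Vehicle
class Vehicle:
    def move(self):
        return "moving"

class CarByInheritance(Vehicle):
    pass  # inherits move(), but also inherits Vehicle's interface to maintain

# Composition: CarByComposition HAS-AN Engine
class Engine:
    def start(self):
        return "engine started"

class CarByComposition:
    def __init__(self):
        self.engine = Engine()  # composed object; no extra inherited interface

    def start(self):
        return self.engine.start()

print(CarByInheritance().move())   # moving
print(CarByComposition().start())  # engine started
```

Changing Engine later only requires that CarByComposition's own interface still be honoured, whereas changing Vehicle can ripple into every class that inherits from it.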

The second important principle of data abstraction, polymorphism, also allows for simpler code implementation for a program and a reduction in code and repetition. Polymorphism is discussed next.

5.3.2 Polymorphism

To effectively reduce errors and the repetition of similar functions that differ only in the argument types they can handle, the principle of polymorphism must be applied. In nature, polymorphism is the ability of an entity or organism to take on different forms. Polymorphism in computer programs is analogous to this but is focussed on the arguments to procedures (Liskov, 1987).

For a program to be considered polymorphic, it must be able to work with not only one type of data abstraction, but rather with any data abstraction so long as the interface matches that of the data abstraction the program was designed to make use of (Van Roy, 2009).

A simple example of this would be a function that performs addition. In the non-polymorphic sense, this function would strictly take in two arguments of a particular type (integers, for example) and perform an addition of the two using a definition of how integer addition is done. If the programmer now wanted to implement floating-point addition, they would need to create a new function that takes in floats instead, ultimately repeating code.

The polymorphic version of this function, however, would be able to take in any type that is supported by the language, creating a single function that can be used to perform addition for multiple types. The function itself doesn’t necessarily know how to perform addition for each type but calls the operators defined for the types passed to it.
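This can be sketched in Python (an illustrative example; the Vec class is hypothetical), where a single add function delegates to whatever addition operator the argument types define.

```python
def add(a, b):
    # Polymorphic: works for any types supporting '+', delegating to the
    # operator defined by the arguments' own classes rather than knowing
    # how each kind of addition is performed.
    return a + b

# A user-defined type participates simply by defining its own '+' operator.
class Vec:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        return Vec(self.x + other.x, self.y + other.y)

print(add(1, 2))        # 3
print(add(1.5, 2.5))    # 4.0
print(add("ab", "cd"))  # abcd

v = add(Vec(1, 2), Vec(3, 4))
print((v.x, v.y))       # (4, 6)
```

One function now covers integer, floating-point, string and vector addition, with each type supplying its own definition of the operation.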

The strong synergy between polymorphism, named state and inheritance is the foundation of the object-oriented programming paradigm, which is widely used in many of the mainstream languages. The next section will describe the object-oriented programming paradigm, its uses and its strengths.

5.4 Object-oriented programming

Object-oriented programming (OOP) is used extensively around the world in teaching programming concepts and design, as well as in practice, when building applications for business (Bugayenko, 2017).

As made apparent in the name of the paradigm, OOP is based on the use of objects which contain data and code. Data within an object comes in the form of attributes or properties, while code is in the form of functions or methods (Wisskirchen, 1996). Object-orientation allows for the creation of programs in which objects are created that are capable of storing state (in the form of data attributes or other objects) and performing operations on the data.

In OOP, objects often allow for manipulation of their attributes through the exposure of methods that perform the updates or changes and not by allowing direct access to the attributes (Wisskirchen, 1996). Due to this level of control over objects, OOP applications are used in cases where it is necessary to view, create, update and delete data, along with presenting this to users (Bugayenko, 2017).

OOP languages can be typified as having the following main principles (Bugayenko, 2017):

 abstraction,
 encapsulation,
 polymorphism, and
 inheritance.

Encapsulation is the application of data hiding through the use of classes, where, as mentioned above, other objects cannot directly access or alter data attributes, but must instead make use of publicly exposed methods that perform actions on the attributes (Martin, 2009). This adds a level of safety to the program and protects against unintended data corruption or changes during the course of an application’s execution.


Abstraction was also mentioned previously and helps programmers to maintain code more easily, while also keeping details around implementation of functions in an object hidden from outside objects (Martin, 2009). This is closely related to the encapsulation principle.

Polymorphism allows for objects to be more flexible in taking in argument types for procedures, reducing duplicate code and abstracting the function of specific operations from the parent function that calls them (Bugayenko, 2017).

Inheritance allows for the creation of relational hierarchies between classes, reducing duplication of common code and shortening development time (Wisskirchen, 1996). In practice, this reuse is achieved through both inheritance and composition, with each suited to particular scenarios.

Object-orientation as a paradigm enables languages that implement it to structure programs in a more modular, independent way, reducing the amount of code needed and simplifying the interaction of entities within the program. On the surface it appears that OOP simplifies program development while reducing errors and time needed; however, the paradigm is also criticised for not achieving the goals it was designed for (Martin, 2009). Another criticism is that, as a paradigm, it takes focus away from the actual computations or algorithms in applications, instead placing emphasis on data and object modelling (Bugayenko, 2017).

In recent years a new paradigm emerged that is based on OOP, but with some changes in the concepts and principles that give it great advantages for the development of programs in specific fields. The entity component system paradigm is very closely related to OOP and is thus still considered to be an object-oriented paradigm, only with a different flavour. The next section will discuss the entity component system paradigm, what its key principles are, how it differs from OOP and what it is best suited for.

5.5 Entity component systems

The video game development industry has been the catalyst for many innovations that have changed the face of modern computing, ranging from algorithms and efficient mathematical calculations to physical hardware in the form of GPUs, which give AI applications greater computational power to leverage than traditional CPUs (Martin, 2007).

Another recent development was that of the entity component system (ECS) paradigm, which was created to help game programmers to build video games in a more efficient manner that allowed for greater control over data and how entities within a game are constructed (Lord, 2012).

ECS designs, in comparison to OOP, favour composition over inheritance. Therefore, objects in an ECS program are composed of other objects, allowing for new entities to be built up of existing ones and for behaviour changes to be effected by adding or removing components (Bilas, 2002).


This approach is favourable in video game development due to the flexibility it provides for creating base components with specific properties that, when combined, create new entities taking on the attributes of their constituent components (Bilas, 2002). The other advantage of this approach is that it eliminates the complex inheritance hierarchies that would be present if inheritance were used, simplifying the code and making it easier to maintain (Martin, 2007).

The name of this paradigm alludes to the three important aspects it makes use of:

 Entities: these are the fundamental building blocks of the ECS paradigm. They represent a concrete object in the environment (in games, an in-game object such as a car or person), with each being its own instance and not containing any data or methods. An entity is ultimately a high-level container that has a unique identifier (Martin, 2007).
 Components: components are aspects that an entity can take on. These are granular definitions of a single aspect (e.g. black, metal, round). Components are used to label an entity as having a particular aspect and can be implemented using various types of data structures (Martin, 2007).
 Systems: systems are parts of a program that run globally over everything in a continuous manner, performing some action on each entity that has a component of the same aspect as the system. In games, typical systems include the rendering system, sound system, animation system, etc. (Martin, 2007)
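These three aspects can be sketched together in a deliberately minimal Python example (the component stores and system here are hypothetical, not taken from any cited engine): entities are bare identifiers, components are plain data keyed by entity ID, and a system acts on every entity that holds the components it cares about.

```python
import itertools

# Entity: nothing but a unique identifier.
_next_id = itertools.count(1)

def create_entity():
    return next(_next_id)

# Components: plain data, stored per aspect and keyed by entity ID.
positions = {}   # entity_id -> (x, y)
velocities = {}  # entity_id -> (dx, dy)

# System: runs over every entity that has BOTH relevant components.
def movement_system():
    for eid in positions.keys() & velocities.keys():
        x, y = positions[eid]
        dx, dy = velocities[eid]
        positions[eid] = (x + dx, y + dy)

player = create_entity()
positions[player] = (0, 0)
velocities[player] = (1, 2)

tree = create_entity()
positions[tree] = (5, 5)  # no velocity component, so the system skips it

movement_system()
print(positions[player])  # (1, 2)
print(positions[tree])    # (5, 5)
```

Note that the entities store nothing themselves; their behaviour emerges entirely from which component stores contain their ID.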

The power in this style of programming is that it allows programmers to create objects that are defined by their components and take on properties from them that are enacted or realised through the use of global systems that handle the execution of code for all entities that have a particular component or group of components (Lord, 2012).

It allows the code for each system to be created and maintained independently of the other systems, and for instructions to be executed for all entities with a particular aspect, meaning that this code does not need to be created for each entity (Martin, 2007). Imagine defining behaviour code for 1000 person entities individually, compared to defining it once in a single system that applies it to all of them.

ECS paradigms have advantages and drawbacks. The main drawback is that they make data hiding impossible, as the data for components and entities is visible to all systems in the program (Bilas, 2002). In some cases this isn’t necessarily a drawback; in security-focussed applications, however, it is unsuitable.

The main advantage of ECS is that it decouples data from the execution of functions on the data. This allows the systems to focus on the implementation of the code and how to execute instructions, and the components to focus only on storing data. This contrasts with OOP objects, where data and code are coupled in the object (Bilas, 2002).

ECS implementations create an environment where entities can easily be built up or changed by altering the components they have, allowing for a flexible approach to modelling objects and their interactions.

Before diving into the specifics of ECS design and its components, the next section will highlight the reasons that the data-driven approach used by ECSs is preferable to the object-driven approach used in OOP. The implications of each will be discussed, along with how effective development of systems is impacted.

5.5.1 Data-driven vs object-driven

In modern programming, the OOP paradigm is both popular and widely used, with programmers designing applications making use of classes and objects. Classes in OOP languages are generally designed to be used with inheritance, allowing new classes to be created by extending a base class, giving the new child class properties and methods from the inherited parent class.

In OOP, both the data and methods are bundled together inside the class. The idea was to create a central container for important code that programmers could maintain in one place, reducing code duplication. Data attributes are defined in the class and are hidden from outside view, except through the use of an interface (Martin, 2007).

One of the problems that became apparent with OOP is that, in large convoluted inheritance chains, specific code for a class would still need to be duplicated and changed in each child of the chain. The fragile base class problem is a well-known OOP architectural problem which illustrates the issues that inheritance can cause: modifications to a base class which appear to be safe may cause a derived class which inherits from that base class to malfunction, hence the ‘fragile’ base class (Mikhajlov & Sekerinski, 1997). This ultimately means that developers need to consider the effects on derived classes, and test them, whenever making changes to a base class. It also complicated programs and made them much harder to maintain, especially considering that each class in the chain would now also have interfaces that needed to be changed. Many OOP languages have since implemented partial solutions, such as making instance variables private to the defining class and requiring an explicit ‘override’ keyword in the subclass to override the base class’s methods; however, the careless use of inheritance without proper design can still lead to problems. Figure 5.8 depicts the class relationship hierarchy created by OOP.


Figure 5.8 Object-oriented inheritance hierarchy for game design (Lord, 2012).
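The fragile base class problem can be demonstrated with a small, hypothetical Python example (the Bag classes are invented for illustration): the derived class relies on the base class’s add_all() routing through add(), so a seemingly safe “optimisation” of the base class silently breaks the derived class.

```python
class Bag:
    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def add_all(self, items):
        # A seemingly safe change: originally "for i in items: self.add(i)",
        # later "optimised" to append directly without calling add().
        self.items.extend(items)

class CountingBag(Bag):
    def __init__(self):
        super().__init__()
        self.count = 0

    def add(self, item):
        # This override assumes every insertion goes through add().
        self.count += 1
        super().add(item)

bag = CountingBag()
bag.add_all([1, 2, 3])
print(len(bag.items))  # 3
print(bag.count)       # 0 -- the derived class's invariant is silently broken
```

No code in CountingBag changed, and the base class still passes its own tests, yet the subclass now misbehaves — exactly the maintenance hazard described above.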

Data in OOP classes also ultimately becomes a static lookup to a value defined before run time, reducing the ability to alter class attributes quickly and easily without changing code.

ECS approaches developed mainly in the game development industry as a solution to the problems noted above. Game designers needed to be able to change game objects, add new ones and alter game systems quickly in order to react to the needs for new content, fix issues, and balance game play (West, 2007).

Often after a game’s launch, the development team would need to make monthly additions or changes to the game’s content or mechanics, which didn’t give them a lot of time to reengineer large portions of the code. If class hierarchies or chains needed to be altered it would be a large undertaking, requiring a lot of time and many code changes in multiple places. Working in this manner created a lot of extra time pressure and gave more possibilities for errors to creep into the code base, as multiple changes in multiple locations were needed (Martin, 2007).

In the early 2000s, game developers began investigating a better way in which to go about developing games and maintaining them. From this arose the entity component system design that is so widely used today, which is also known as entity systems or component systems.


The ECS approach changed the focus of application development from being object-driven to being focussed on the data. This data-driven approach favours composition over inheritance, avoiding the complex mess that class inheritance hierarchies can become (West, 2007).

Composition allows for objects to be defined as a collection of properties, removing the method implementation code from the objects, which in ECSs are known as entities. Each property that an entity is composed of is a data point that can specify a value. These properties can include attributes such as size, location, colour and many more and are referred to as components (Martin, 2007).

Methods that act upon the entities are instead separated into systems, each of which affects an entity only if the entity has a component of a type related to the system, creating singular locations in the code base in which to maintain code and make changes or additions (Unity Technologies, 2019). These systems often include physics, rendering, etc., where entities composed with a physics component have physics-related methods and behaviours applied to them.

This separation of code from data simplifies the management of code. It also allows for much faster game development and rapid prototyping, as new entities can be defined through data alone, which, through the systems, also determines the entity’s behaviour, appearance, etc. Game designers can therefore make new game objects from existing components without writing any code, instead specifying the components and their values, which can be done through a simple value editor (Unity Technologies, 2019).

Figure 5.9 shows what an ECS program structure looks like, with data in components separated from the methods contained in systems.

Figure 5.9 Entity component system design structure (Unity Technologies, 2019).


This section has shown how ECS approaches hold large benefits over OOP when developing large programs that are expected to change frequently. By using a data-driven approach such as ECS, designers can focus on defining entities from existing components, while developers focus on maintaining method code in the game systems.

5.5.2 Entities

Entities, as stated previously in this chapter, are the high-level containers that represent objects; in game development these would be all types of game objects. In the case of the DE this would include the physical hardware, people and information.

In ECS designs, the data and methods are separated, with methods living in the different systems and the data existing in the various components. It is natural to draw parallels between OOP concepts and ECS concepts in order to understand a concept in a new paradigm by putting it in the context of an existing, known concept (Martin, 2007). Therefore, many people may try to compare ECS entities to classes in OOP, but this is a major mistake.

Classes are inherently an OOP concept, where data and methods are bundled together. This is as far removed from what an ECS entity is as possible. Firstly, data and methods are separated in ECS, and secondly, entities do not store anything at all. Data is instead stored in components in ECS designs. This will be detailed further in the next section (Unity Technologies, 2019).

Entities are merely high-level groupings of components that define a type of object in the given environment. For instance, a car could be an entity in a game, but it is merely a collection of components such as wheels, doors, an engine, etc. Entities can therefore be seen as labels for a collection of components, represented only by a name (Dickheiser, 2006). Each individual entity does, however, need to be uniquely identifiable among other entities so that when the ECS systems need to perform methods, they can be performed on the correct entities, individually.

For this reason, entities are normally represented using some form of globally unique identifier, or GUID, which is stored in a central database that all systems are able to access. The ECS database, as it is generally referred to, stores a directory of all entities along with lists of their components. Because ECSs do not make use of data hiding through interfaces, all entities and components can be seen and accessed by all systems (West, 2007).

For the sake of defining entities for the DE, entities will be considered to be uniquely identifiable lists of components (even though, in reality, the implementation of entities does not quite work this way; Chapter 10 will detail implementation specifics in more depth). Figure 5.10 shows the conceptual relationship between entities and components.


Figure 5.10 Entities are represented with numeric IDs and serve on a high-level as a list of components (Martin, 2007).

With entities being considered to be uniquely identifiable lists of components, it is difficult to define entity archetypes that can be reused to create multiple entities sharing the same set of components, with only the component values differing.

5.5.3 Components

Data is where all the power resides in the ECS approach, with entities and components being defined based on data, and systems acting on and manipulating data. With ECS approaches being data-driven, everything in the design is centred around the data: how to access it, how to change it, how to create new data, what the data represents, and more (Unity Technologies, 2019).

Components are the objects in ECS designs where the data is kept (not quite all data, as the data specifying how entities, components and systems interact resides in the database defining the relational schema), creating a situation where simply composing an entity from a collection of components indirectly gives the entity attributes and allows the systems to perform methods on it (Dickheiser, 2006).


The data attributes which would traditionally be associated with a class in OOP are in ECSs rather removed from a central container where methods also exist and are grouped into modular containers based on what aspect they are related to. These modular containers are known as components and are used to denote that a particular entity has specific aspects and defines the attributes of the aspects (Buttfield-Addison et al., 2019).

A component can therefore be thought of as consisting of two distinct parts (Lord, 2012):

 a classification label, and
 a data structure.

The classification label serves to give a name to the component and helps to tag entities in the application as having a particular aspect and attributes encapsulated by the component.

The data structure is the part of the component that actually stores the relevant aspect attributes. This can take the form of an array, list, struct or any other applicable data structure that allows for the storing of a number of attribute values that can be distinguished from one another. For instance, an array of values can be stored for a component where a predefined structure indicates that attribute1’s value is stored at array index 0, attribute2’s value at index 1, etc., allowing systems to easily look up particular attributes later (Dickheiser, 2006).

In practice, this is generally achieved through the use of individual tables for each component, where the columns of the table include an ID for each component and a number of columns representing the attributes (Martin, 2007). A row in this table would represent a unique component data instance with individual values for each attribute, which ultimately provide their composed entity with these attributes.

Bridging tables are used to map unique entities and their names/descriptions to component types and the individual component data instances. This setup greatly increases lookup performance and makes it simple to use a query language (QL) to fetch the component data of a specific entity, all components of a particular type, or all values of a specific component attribute (Dickheiser, 2006).
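This table-based layout can be sketched with SQLite (an illustrative example; the table and column names are invented, not taken from the cited sources): one table per component type, plus a bridging table mapping entities to their component instances, queried with ordinary SQL.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# One table per component type; each row is a component data instance.
cur.execute("CREATE TABLE position (id INTEGER PRIMARY KEY, x REAL, y REAL)")
cur.execute("CREATE TABLE health   (id INTEGER PRIMARY KEY, hp INTEGER)")

# Bridging table mapping entities to their component instances.
cur.execute("""CREATE TABLE entity_component (
    entity_id INTEGER, component_type TEXT, component_id INTEGER)""")

# Entity 1: a 'person' composed of a position and a health component.
cur.execute("INSERT INTO position VALUES (10, 0.0, 5.0)")
cur.execute("INSERT INTO health VALUES (20, 100)")
cur.executemany("INSERT INTO entity_component VALUES (?, ?, ?)",
                [(1, "position", 10), (1, "health", 20)])

# Fetch the position data for entity 1 via the bridging table.
row = cur.execute("""
    SELECT p.x, p.y FROM position p
    JOIN entity_component ec ON ec.component_id = p.id
    WHERE ec.entity_id = 1 AND ec.component_type = 'position'
""").fetchone()
print(row)  # (0.0, 5.0)
```

The same join pattern also answers the other common queries mentioned above, such as all entities holding a particular component type or all values of one attribute across components.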

Components are groupings of data for a given aspect of an entity. This makes it important, when defining components, to carefully collect the relevant attributes for the aspect and not to mix aspect attributes between components (Harkonen, 2019).

Therefore, the first step in defining the components for an ECS application is to clearly outline the unique aspects that entities need to be able to take on through their components. The aspects are used to explain what an entity is and how it interacts with the environment it exists in, forming the basis for all processing and method execution within the application (Bilas, 2002).


5.5.4 Systems

In the world of ECS, systems keep the code logic of how to handle data separate from the actual data, in modular sections dedicated to specific data through components.

In OOP, method code lives in the class along with the data, making it well suited to executing parts of code on specific subsets of the data in an instanced manner. However, when OOP needs to execute code in a global context, to operate on everything or to be callable from anywhere, it falls short (Buttfield-Addison et al., 2019).

The OOP approach can still be effective in scenarios where ECS may be better, as long as there are no deeply complex inheritance hierarchies with large numbers of object instances. As soon as the inheritance hierarchies become too convoluted, executing methods across large numbers of objects becomes very inefficient due to the multiple indirect references between classes (Martin, 2007).

The complex references and chains also make for a very inflexible program structure, where any big changes become impossible without rewriting a large part of the code base, or hacks are used to bandage the issue in the short term. Over time this causes the code base to bloat substantially and become extremely complex and fragile.

In OOP, methods are executed on a single object at a time, whereas in ECS, large blocks of data can be processed all at once. With ECS systems, highly flexible program structures can be created, where changes to logic are enacted only in the relevant system or sub-system, with no need to alter an entire chain of inherited classes as in OOP. If an entity needs to change, components can be added or removed; if a component needs to change, it can be updated, and only the systems concerned with that component need to be updated with it (Harkonen, 2019).

Systems in ECS each run independently on a continuous basis, performing actions on all entities with the relevant components. This results in an application where systems run their internal code against components one at a time, but on a continual basis (Unity Technologies, 2019). There is no need to worry about the execution flow for a specific object in numerous different scenarios, as the component will always be processed when the relevant system runs.

Systems in ECS generally focus on a specific aspect of the entities, though some components can be shared between systems, as more than one system may require that component’s data when executing its code. There is, however, normally a one-to-one relationship between systems and available components (Lord, 2012).

With ECS, there is no need to worry about two or more systems accessing a particular component at the same time, as the systems are made to run one after the other. Component data therefore shouldn’t suffer from race conditions or need locking: system A will run and handle the component data, and only when it is done will system B run and touch the same component data (Dickheiser, 2006). This is also a drawback of ECS with regard to concurrency, as although race conditions are eliminated, concurrent execution is eliminated with them. This can, however, be addressed by allowing systems to operate on a copy of the data instead of directly on the original (Unity Technologies, 2019).

Systems define the logical operation of the environment: how items are updated, added, removed or used. These systems could perform something as simple as updating entity positions on a coordinate system, or something as complex as calculating predicted trajectories of objects through the air (Harkonen, 2019).
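The sequential scheduling described above can be sketched as follows (a hypothetical Python example with invented system names): systems run strictly one after another each tick, so the shared component data needs no locks.

```python
# Component store: entity_id -> hit points (illustrative data).
health = {1: 10, 2: 2}

def damage_system():
    # Runs to completion before any other system touches the data.
    for eid in health:
        health[eid] -= 2

def cleanup_system():
    # Removes entities whose hit points have dropped to zero or below.
    for eid in [e for e, hp in health.items() if hp <= 0]:
        del health[eid]

def tick():
    # Fixed order: no two systems ever access the data concurrently,
    # so no locking is needed.
    for system in (damage_system, cleanup_system):
        system()

tick()
print(health)  # {1: 8} -- entity 2 reached 0 hp and was removed
```

Trading away concurrent execution for this guarantee is the drawback noted above; a copy-based variant would instead hand each system a snapshot of the data.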

5.6 Conclusion

Chapter 5 started by introducing the concept of programming paradigms, looking at what paradigms are, how they relate to programming languages and concepts, and what important concepts shape what a paradigm is suited for. Later, important programming concepts were discussed, along with their advantages and what they allowed programming languages to do.

The principle of data abstraction was defined, exploring the types of data abstraction and what they allowed programming languages to do with data.

Lastly, the paradigms of object-oriented programming (OOP) and entity component systems (ECS) were covered, with a discussion on what principles and concepts they implement and what advantages and uses they gain from these.

The AI and ML approaches presented in Chapters 3 and 4 are being used on a larger scale today than in the decades before. The techniques used require the processing of data to allow for decisions to be made regarding the targeted problem.

Traditionally, more established and widely used programming paradigms have been used to implement the various AI and ML solutions due to their wider acceptance and popularity, making it easier for developers to create applications.

An ECS approach, however, as detailed in Chapter 5, has major advantages when dealing with data-oriented problems, which most AI and ML problems are. This makes the use of ECS in the context of predicting the evolution of the DE very appealing and will offer benefits over OOP or other paradigms. This is covered in more detail in Chapter 8, where the ECS used in the proposed model is defined.

The information presented in this chapter is of relevance to the following secondary research questions:

SRQ4: Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?


This chapter has clearly demonstrated that modern programming paradigms can be used to achieve hugely varying results in computation. Some concepts have application in modelling entities within any environment due to the flexibility of data structures and how they can be manipulated. Entity component system architectures offer important advantages in the representation and handling of code, making them ideal for constructing representations of real-world objects whose properties are defined by the components that they are made of. SRQ4 is therefore addressed sufficiently with regard to machine learning.


6 Problem background

Part One of this thesis focused on the introduction to the problem, discussing at a high level the main issues that have created the need for this thesis and for research into various fields. The background to the problem provided in the introduction was condensed so as to give enough detail to indicate the need for a solution, while at the same time not delving so deeply into the problem domain as to detract from the objectives, research questions, and research methodology defined for the remainder of the thesis.

Part Two of the thesis described literature related to the problem as well as the solution, with each of the literature review chapters taking a deeper look at the related fields. Each chapter in Part Two described the related field in detail, defining what the field is, what important concepts it is built on, what modern, state-of-the-art work is being done in the field, and what relevance the information detailed in each chapter has to the research questions defined in Part One.

Chapter 6 serves to paint a picture of the problem that is the main reason for this thesis, providing further understanding as to why a solution is needed and how the world is currently impacted, and clarifying the main components of the problem that led to the creation of the model that is defined in Part Three.

By clarifying the problem in this chapter, the hope is that the need for the proposed solution in Chapter 7 is made clear, along with the reasons for the design choice for the model and its components.

The first part of the problem that will be addressed is the continually changing world. Section 6.1 will describe the changes the world has gone through that have given rise to its current state, as well as the implications thereof.

6.1 The growing world

The human species has been on earth for a relatively short time, but the human population is presently at an all-time high, with the majority of the growth occurring only in the past 100 years or so.

Compared to the earth, which scientists estimate to be between 4.4 and 4.5 billion years old, humans have existed for a relatively minuscule amount of time. Some estimates place the origin of man at around 50,000 B.C., which according to Roser, Ritchie & Ortiz-Ospina (2019) would make the human species a mere 52,000 years old.

The life expectancy of humans in earlier times was much shorter than it is now, with people at the age of 30 being considered old. Over the centuries, with the development of better hygiene and eating practices, and with advancements in farming and medical technology, human life expectancy across all continents rose. In the early 19th century, no nation across the globe had an average life expectancy higher than 40, but fast forward 150 years and the average life expectancy in more developed countries had increased to around 60 (Roser, Ritchie & Ortiz-Ospina, 2019).

Currently, the average life expectancy of humans is at its historic peak of 71 years, having almost doubled over the past two centuries. Further advances in medicine, the elimination of many diseases, and more reliable crop yields are the main contributing factors. This, paired with the positive birth-to-death ratio, has led to a situation where the human population on earth is in constant growth (Worldometers, 2019).

Figure 6.1 illustrates the human population plotted from 10,000 B.C. up until the present. It is very clear that population growth was fairly slow up until the 19th century, whereafter the population increased from 1 billion to 7.7 billion in a period of 200 years.

At the current rate of population growth, it is estimated by the United Nations that by 2057 the human population on earth will reach 10 billion.

The current birth rate is 2.3 times the death rate, resulting in an estimated yearly growth in population of 82 million people, or 1.1% per year (Worldometers, 2019). This rate is slowing and has already halved from its peak of 2.07% in the 1970s, with further slowing expected over the following decades to a near-complete stop in 2100 (at which time the population will have reached a size of 11.2 billion).

A large growth in population from the 1970s until present coincided with the birth of the Internet and more widespread use of computers over the decades. This has created a world in which more people than ever before have access to computers and the ability to create and share information, with the Internet facilitating the majority of communications, as well as housing vast amounts of information (Roser, Ritchie & Ortiz-Ospina, 2019).

More people in the world means more people capable of sharing ideas, collaborating, and creating and consuming information, placing the current rate of information creation at an all-time high.

As alluded to above, the growth in human population in modern times is not the only factor that created the DE of today. The simultaneous advancement and proliferation of technology played just as important a role.

Section 6.2 will consider the advancements in technology as a factor in the creation of the DE as it is today.


Figure 6.1 Human population growth over the past 12,000 years (Roser, Ritchie & Ortiz-Ospina, 2019).


6.2 Advancing technology

The creation of institutions to facilitate learning and further research into the understanding of the world is something that occurred long before the advent of universities and colleges as we know them today. As early as 387 B.C., in Ancient Greece, renowned scientists, inventors and philosophers formed schools that focused on higher learning. These institutions generally only served a small group of select students. However, the goal was the same as that of modern universities: to educate students and elevate their thinking so that they can better understand the world and thus improve it (DailyHistory.org, 2018).

Universities in their modern, mass-student form only really began to take shape around 1080, with many more being created in the medieval period around the 13th century. This was the beginning of modern universities as they are common worldwide today, with many of these institutions still in existence (Khan, 2018).

From their inception, places of higher learning strove to improve humans’ understanding of the world and to solve problems that the people of the time faced. It is no surprise then that universities are responsible for the development of many industries, technologies and methods used in the modern world.

Modern computers have their origins in simple calculating machines, invented to add numbers together more efficiently than humans could by hand. These early computers were mechanical machines that made use of gears, belts, shafts and punch cards to perform calculations.

As further research was done and a greater need for more powerful computational machines arose, the computer became an electronic machine capable of storing information for later use. These electronic computers made use of transistors and capacitors along with electric current to function. This is where the binary system used for information storage and manipulation was born. Modern computers are now many times more powerful and much smaller in size than those that were used by businesses in the 1960s, which took up entire rooms and even floors in office buildings (Zimmermann, 2017).

With improved circuit board production technology, small portable computers were made possible in the mid-1970s due to pioneering figures such as Bill Gates, Paul Allen, Steve Jobs and Steve Wozniak (Zimmermann, 2017). This brought about personal computers, making computer technology available to the general population to use at home, not only for business needs but also for entertainment.

Around the same period, at the Massachusetts Institute of Technology (MIT), J.C.R. Licklider proposed a concept for creating an interconnected network of computers that would allow people from around the world to share information and communicate. He later became the head of the (Defence) Advanced Research Projects Agency, known as (D)ARPA, a military agency focused on research and development (Leiner et al., 1997).


The Internet as it is today started off as the ARPANET (ARPA Network), a network and set of protocols that allowed researchers to share information over large geographic distances using computers. This allowed them to share data for modelling phenomena and simulating events without needing to physically ship storage disks or tapes, which took far too long.

Throughout the creation of the ARPANET, a number of university research teams were involved, including those from MIT, Stanford University and the University of California, Los Angeles (UCLA). The teams from the different institutions contributed by developing data transfer protocols, authentication protocols, and network control protocols, most of which form the foundation of protocols used today (such as the transmission control protocol, or TCP, which was specified in RFC 675 by Cerf, Dalal and Sunshine in December 1974) (Leiner et al., 1997).

Over time, more and more computers became connected to the ARPANET, which eventually grew to include not only academic institutions and government departments, but also businesses and individual households. The Internet was formed and gave people a way to communicate without needing to be near each other, while also providing a source of information.

The continual advancement in computer and networking technology, as well as in the protocols that run the Internet, has now created a world in which people of all cultures and languages can communicate and share information. This has allowed for collaboration on ideas, verification of facts and the broadening of people's horizons. The modern Internet has made the world a much smaller place than before, giving humans the ability to experience and learn about things that occur on the opposite side of the earth.

Figure 6.2 illustrates the growth in internet usage by the human population over the past 30 years.

Another effect of the changing technological climate is the change to social interactions among people. Section 6.3 will discuss this area in further detail.

6.3 Evolving interactions

Humans are gregarious by nature. They tend to seek out the company of others and interact in a social manner. This is one of the main factors that has led to the formation of communities throughout human history, and it has shaped the cities and towns that people live in today.

Seeking out companionship stemmed from the need for safety, given the predators that early mankind faced. It also made reproduction, and therefore the continued survival of the species, easier, allowing for the mixing of genes that could result in new generations with greater fitness, and therefore an increased likelihood of survival and growth of the population (Elshaikh, 2016).


Figure 6.2 The growth in internet usage by population percentage (Roser, Ritchie & Ortiz-Ospina, 2019).


Forming communities allowed humans to share tasks and responsibilities vital to survival, such as gathering food, creating shelter and protecting the community from threats (Wittrock, 2001). Members of the community would interact with one another on a daily basis, communicating needs, common interests and tasks that needed to be done.

Over the centuries, as communities grew larger and members began to specialise in particular tasks, it became important for each member to learn to communicate effectively to trade and procure goods or services from others. With larger communities it was not possible for each person to know all others in the community, and so smaller social groups were formed between people in closer geographic proximity to each other and who shared similar interests or challenges (Wittrock, 2001).

In the modern world, social skills are necessary for individuals to function in society effectively. Learning to communicate and take interest in others and their lives has become necessary to assimilate and be accepted. These skills allow people to interact by exchanging goods and services without knowing each other personally and help individuals find groups with similar interests (Elshaikh, 2016).

As technology advanced, it became possible to communicate with other people over greater geographical distances, starting with written letters and evolving to telegrams and telephone calls over time. The social circles that people kept thus became less limited by the need for close proximity, and people could share information and experiences with each other from very different parts of the world.

The eventual development and proliferation of the use of computer and network technology allowed for cross-culture and cross-continent interaction on a scale never before possible and with latency that was comparable to being in the direct company of the person with whom one is communicating.

Further development in computer technology leading to mobile telephones and, eventually, the smartphones of today has made it easier and cheaper than ever for people across the world to communicate not only via text but also over voice calls and recordings and even video.

Online or digital communities have since developed due to the availability of internet access and the ease with which it can be used by people through their smartphones or computers. People from around the world form these communities around common interests, sharing ideas, discussing topics of interest and making new friends, ultimately expanding their social circles and interactions (Elhishi et al., 2019).

6.4 Conclusion

Chapter 6 has explored the problem background of this thesis further, highlighting three main areas of development that have contributed to the current state of the DE. The DE has formed through the amalgamation of social, technological and population changes that have taken place over the past century, with the biggest changes occurring in the past 50 years.

The growing human population on earth has been a major factor, creating more people capable of contributing ideas and views to society. In the past 50 years alone, the surge in population growth has placed the population at around 7.7 billion people of various cultures and backgrounds.

Technological advancements in the form of computers and networks have created the infrastructure that the DE is seated in. The Internet has grown and become more accessible over the past few decades, allowing people from around the world to communicate and share information. This has grown knowledge bases and allowed people to experience cultures and events that they previously would not have been able to.

As the human population grew, the social dynamics of communities shifted from being based around geographical location, survival and necessity to being centred on common interests, backgrounds and ambitions. The social dynamic of the world has changed, with the Internet facilitating global communities of people who may never have met in person. The online personas of people have changed from being fully factual to being based on how they wish to present themselves to the wider world.

The information presented in this chapter is of relevance to the following secondary research questions:

SRQ1: What is the current understanding of what the Digital Environment is, and what does it encompass?

In Chapters 1 and 2, the problem background around the DE and what has led to its creation was covered briefly, with the focus rather being placed on defining what the DE is and what it is comprised of.

The DE was described as an amalgamation of the physical, digital and cyber spaces, with technology, information and social interactions being key elements thereof. This chapter has highlighted how the main historical factors and their evolution over time have impacted the present world and shaped the DE as it is experienced today.

This lays a basis for understanding what has created the DE and shows that it is a complex entity created through the development of separate entities that synergised well with one another. This directly addresses SRQ1, providing further understanding of how the DE formed.


7 DEEv-MoS

In the previous chapters, this thesis covered relevant background concerning related fields and the fundamentals of core concepts required to make the proposed model a reality. This chapter takes a look at the proposed model, Digital Environment Evolution Modelling and Simulation (DEEv-MoS), from a holistic standpoint, giving a higher-level understanding of the model and of its constituent components. Subsequent chapters will take a more in-depth look at the model's individual components and how they function together. Below is a brief summary of what was covered in the previous chapters.

Chapter 1 introduced the problem along with background information regarding the current state of the digital environment, which is continuously growing as the proliferation of computer technology and internet access increases. Due to the continual evolution of the digital environment it is difficult to understand where it is headed, with the addition of more devices, networks, people and information over time.

Due to the above, the main research question for this thesis was stated as:

Primary Research Question

RQ 1: Can an MAS, making use of machine learning and an entity component system design, effectively and accurately predict the evolution of a digital environment so as to provide assistance in understanding what the future of a digital landscape will look like?

This was split up into the following secondary research questions:

Secondary Research Questions

SRQ 1 What is the current understanding of what the Digital Environment is, and what does it encompass?

SRQ 2 Can an MAS be used to simulate the various components of a digital environment?

SRQ 3 Can an entity component system design be used effectively to represent a digital environment and its constituent entities?

SRQ 4 Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?

SRQ 5 Can ML, in particular extreme learning machines, be used to predict and drive changes in a digital environment to accurately reflect its evolution?

SRQ 6 How can existing AI algorithms be used to evolve entities in a digital environment in a manner analogous with how non-natural entities evolve?


Chapter 2 looked more closely at the DE in an attempt to define what it consists of and what factors in the modern world impact it as a whole, and hinted at what the future digital environment could look like based on what is currently being done.

For the digital environment to be modelled, technological factors such as computer hardware (storage space and computational power), network speeds and the impact of virtualisation need to be considered. From a cyber viewpoint, the information that is stored and transferred needs to be considered in terms of information type, size, storage location, and transmission medium. Lastly, from a social viewpoint, social networks, their growth and value need to be understood.

Chapter 3 focussed on the field of artificial intelligence, where the topics of intelligent agents, multi-agent systems and agent evolution were covered. The definition of intelligent agents was stated, providing a description and use for each of the main agent types, along with how to describe the task environment that agents operate in.

Multi-agent systems were defined as systems with multiple intelligent agents which could be of two different types: cooperative or competitive. Depending on the task environment and the type of MAS, different considerations need to be made by intelligent agent designers, especially in regard to agent goals and utility.

The evolution of intelligent agents was discussed, showing that inspiration has been drawn from biological evolution to achieve an equivalent of Darwinian evolution for digital entities, in the form of genetic algorithms. Making use of a population of agents along with a GA allows designers to solve certain optimisation and search problems, with each newly created offspring representing a different candidate solution to the problem.
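The population-plus-GA search described above can be sketched as follows. This is a minimal, hedged example using the classic OneMax objective (maximise the number of 1-bits in a genome); the operators and parameters are illustrative choices, not those of any system discussed in this thesis:

```python
import random

def one_max(genome):
    # Fitness: the number of 1-bits in the genome.
    return sum(genome)

def evolve(pop_size=20, genome_len=16, generations=200, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        # Tournament selection picks two reasonably fit parents.
        parents = [max(rng.sample(pop, 3), key=one_max) for _ in range(2)]
        # Single-point crossover produces one offspring: a new candidate solution.
        cut = rng.randrange(1, genome_len)
        child = parents[0][:cut] + parents[1][cut:]
        # Mutation flips each bit with a small probability.
        child = [bit ^ (rng.random() < 0.05) for bit in child]
        # The offspring replaces the weakest individual in the population.
        pop.sort(key=one_max)
        pop[0] = child
    return max(pop, key=one_max)
```

Each generation's offspring is itself a candidate solution, mirroring the description above of GA-driven optimisation and search.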

Chapter 4 took a look at the sub-field of artificial intelligence called machine learning, in which techniques are defined to give computer systems the ability to learn how to solve problems. Deep learning, which is a popular ML approach in recent times, was also discussed.

The three main components of an ML algorithm, the experience, the task and the performance measure, were discussed along with the different types of ML algorithms and what their uses are.

On a high level, ML algorithms can be split into supervised, unsupervised and reinforcement learning: the first type requires careful crafting of data sets for computers to learn from, the second type allows computer systems to discover and learn from unprocessed data themselves, and the last type makes use of learning from action outcomes to reinforce what are good or bad actions.

A popular ML approach that is currently widely used for complex problems, namely deep learning, was defined and shown to have origins dating back to the 1950s. This type of unsupervised ML algorithm makes use of a layered approach that is meant to simulate how human brains possibly function.

The remainder of Chapter 4 detailed feed-forward neural networks, a particular type of NN that has been used effectively for real-world problem solving for decades. Extreme learning machines, a relatively new type of SLFN, were discussed, highlighting how they differ from traditional SLFNs as well as their strengths.

In Chapter 5, programming paradigms were discussed, defining what makes up a paradigm and the relationship between programming paradigms, programming concepts and programming languages.

Important programming concepts were defined, showing how each of them impacts the paradigm in which it is used and what implications its use has in a programming language. Data abstraction is a method of defining data structure and how the data can be manipulated, which is a key element of the majority of programming languages.

Object-oriented programming, which is one of the most widely used paradigms, was defined and its concepts and uses discussed. An entity component system, which is a modern paradigm initially adopted for game development, was defined as being a data-driven paradigm and shown to have advantages over other traditional paradigms by favouring composition over inheritance.

Chapter 6 added further detail to the background of the problem that this thesis addresses. Real-world changes regarding technology, social interaction and population growth were described along with their impact.

The remainder of the thesis is dedicated to the definition of a new model for predicting and simulating the evolution of digital environments, named DEEv-MoS, along with the implementation of a proof of concept prototype, Digital Environment Evolution Predictor (DEEP), the findings from DEEP, and a critical evaluation of the thesis.

The DEEv-MoS model is designed to allow for the creation of a prototype to model and simulate the evolutionary behaviour and changes that occur in a digital environment. In doing so, the evolutionary behaviour and outcomes of the digital environment can be predicted, giving insight into the environment’s future landscape and possible outcomes.

The next section will look at the placement of the model from a deployment point of view.

7.1 Considering the DEEv-MoS model in the context of digital environments

The DEEv-MoS model is designed to be considered in the context of a digital environment, with its various components connected to the environment so that they can communicate with each other, as well as transmit information to and receive information from the environment itself.


The various components of the DEEv-MoS model will need to function independently, while being able to make use of information from the given digital environment as well as from each other. It is therefore intuitive to consider the model and its components as being modular, where each can be built independently and can be easily switched out for various other types of that component. This will make it possible for each component to be developed separately and to test multiple variants of a component in order to observe which performs the given task best.
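One hedged way to realise this modularity (the interface and class names below are hypothetical, invented purely for illustration) is to have every component implement a small common interface, so that variants of a component can be swapped in and out without changing the rest of the model:

```python
from abc import ABC, abstractmethod

class ModelComponent(ABC):
    """Hypothetical common interface for swappable model components."""

    @abstractmethod
    def receive(self, message: dict) -> None:
        """Accept information from another component or the environment."""

    @abstractmethod
    def step(self) -> dict:
        """Do one unit of work and report the resulting information."""

class SimpleConstraintsEngine(ModelComponent):
    # An illustrative variant; any other implementation of the same
    # interface could be switched in to test which performs best.
    def __init__(self):
        self.limits = {"max_entities": 1000}

    def receive(self, message):
        self.limits.update(message)

    def step(self):
        return dict(self.limits)

engine = SimpleConstraintsEngine()
engine.receive({"max_entities": 500})
```

Because the rest of the model only depends on the shared interface, multiple variants of a component can be developed and compared independently, as described above.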

The next section takes a look at the DEEv-MoS model from a holistic viewpoint, whereafter each component will be covered individually.

7.2 The DEEv-MoS model

The DEEv-MoS model can be viewed as a modular multi-component model, in which the model's various components are each responsible for a specific task in simulating and predicting the evolution of a given digital environment. Even though these components are designed to be independent of each other, they do need to work together, passing around information and results so that they can each function correctly. This will allow the workload of simulating the evolution of the given digital environment, as well as of generating the needed predictions, to be spread across the components.

Figure 7.1 The DEEv-MoS Model.


Figure 7.1 depicts the high-level architecture of the DEEv-MoS model, indicating the main components and the information flow between them in the model.

With this approach to visualising the model, it is intuitive to go about designing and planning each component individually, while taking into consideration that each needs to be able to communicate with a subset of the other components, as well as with the digital environment itself. For ease of creating a digital environment that can be shaped to the needs of this thesis, the digital environment will be considered a component of the model, whereas in the real world it would only serve as a reference for the model's components to acquire information from.

As seen above, in Figure 7.1, the main components of the model are the environment, the entity component system, the constraints engine, the predictive modelling engine, and the events engine. Each of these components serves a particular purpose in the task of simulating and predicting the evolution of the given digital environment, which will be discussed further in later sections.

The remainder of this chapter will be concerned with taking a closer look at the various components of the DEEv-MoS model. This will cover aspects of each component such as what role the component plays in the model, which components it communicates with and for what reason, and possibilities for the implementation of the component.

7.2.1 The environment component

The environment component is the main component of the DEEv-MoS model, as it represents the digital environment whose evolution the model will attempt to simulate and predict.

On a large scale, the DE can be defined as encompassing digital creations and spaces such as computer networks (especially the Internet itself), virtual worlds, large social environments such as social networking sites, physical devices ranging from personal computers to smartphones, and various other online communities centred around areas such as blogs and websites (Frömming, Köhn, Fox et al., 2017).

The environment component will consist mainly of entities and constraints that keep the entities within some given bounds, restricting their behaviour in various ways. This component is responsible for actually simulating the evolution of the modelled digital environment, making use of information from the constraints engine, the predictive modelling engine, and the events engine to do so.

This component could be implemented as a sandboxed application that makes use of pipes to transmit information to and receive information from the other components.
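As one sketch of such pipe-based communication, the fragment below uses Python's multiprocessing pipes purely for illustration, with both endpoints shown in a single process for brevity; a real deployment would hand one endpoint to the sandboxed environment process, and the message contents here are invented examples:

```python
from multiprocessing import Pipe

# One endpoint for the environment, one for another component
# (imagined here as the constraints engine).
env_conn, engine_conn = Pipe()

# The constraints engine sends constraint information to the environment.
engine_conn.send({"max_entities": 100})
constraints = env_conn.recv()

# The environment replies with its current state.
env_conn.send({"entities": 10, "limit": constraints["max_entities"]})
state = engine_conn.recv()
```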

The definition of the digital environment as it specifically applies to this thesis will be elaborated on in the chapter on the environment component.


7.2.2 The entity component system component

The entity component system (ECS) component relates to the entities within the environment component. This component defines how the entities within the given environment are structured and how they can interact with other entities.

Entity systems are an architectural pattern in which composition is favoured over inheritance. This provides many benefits, such as flexibility in defining entity relationships, easy composition of entities from multiple base components, and the promotion of modularity and reusability. ECSs are most widely used in game development, as the above benefits allow for faster and easier game code development with minimal reengineering (Bilas, 2002).

The ECS defines what properties the various parts of an entity will have and allows for a modular approach to defining entities that can be built to evolve and change based on their base component types.
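The composition-based structure described above can be sketched as follows; the component types (Position, Health) and the entities built from them are hypothetical examples, not entities from the DEEv-MoS environment:

```python
class Position:
    def __init__(self, x, y):
        self.x, self.y = x, y

class Health:
    def __init__(self, hp):
        self.hp = hp

class Entity:
    # An entity is just an id plus a bag of components; its properties are
    # defined entirely by which component types it carries, not by a
    # class hierarchy.
    def __init__(self, eid):
        self.eid = eid
        self.components = {}

    def add(self, component):
        self.components[type(component)] = component
        return self

    def get(self, ctype):
        return self.components.get(ctype)

# Two entities built from the same base parts, with no inheritance.
server = Entity(1).add(Position(0, 0)).add(Health(100))
packet = Entity(2).add(Position(5, 5))   # no Health component
```

Because components can be added or removed at runtime, entities built this way can evolve and change purely by altering their set of base component types.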

The ECS does not communicate with any of the other components, but it has an indirect relationship with the environment component, as it directly affects how entities within that component are defined and how they can be created.

There are many different implementations of ECS with unique differences based on the specific problem domain within which they are used. In the chapter on the entity component system component, this will be explored further, specifically in relation to the DEEv-MoS environment.

7.2.3 The constraints engine

The constraints engine component is responsible for defining and altering constraints on the environment component. These constraints will affect the operation of entities within the environment, as well as define bounds for their evolution and growth.

This component communicates constraint information directly to the environment component; this information is the result of constraint calculations performed using information from the predictive modelling engine.

This component would need to be implemented as a computational engine that dynamically alters constraints in real time based on the predictive information it receives.

7.2.4 The predictive modelling engine

The predictive modelling engine component is responsible for calculating predictions on the evolution and change of the given environment and its entities. The predictions it makes will need to be matched against the real evolution of the system in order to determine their accuracy, and fed back so that they can be improved.

This component will communicate with both the environment component and the constraints engine directly regarding the predicted outcomes of the environment.


This component will need to be implemented as a computational engine coupled with machine learning models that can be interchanged and are defined using a well-supported format such as R-script (The R Foundation, 2020).

A possible candidate for the machine learning model component of the predictive modelling engine is an extreme learning machine (ELM), an approach for single-hidden-layer feedforward neural networks (SLFNs). The ELM approach is designed to perform well when dealing with both natural and artificial phenomena that are difficult for more traditional techniques to handle (Huang, Zhu & Siew, 2006).
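The ELM training procedure of Huang, Zhu and Siew can be summarised as follows: the hidden-layer weights are drawn at random and left untrained, and only the output weights are solved for analytically via the Moore-Penrose pseudo-inverse. A minimal sketch (the toy regression target is illustrative only):

```python
import numpy as np

def train_elm(X, T, hidden=20, seed=0):
    """Train a single-hidden-layer ELM: random input weights, analytic output weights."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], hidden))  # random, never trained
    b = rng.standard_normal(hidden)
    H = np.tanh(X @ W + b)                         # hidden-layer activations
    beta = np.linalg.pinv(H) @ T                   # least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression target (illustrative): y = x1 + x2
X = np.random.default_rng(1).uniform(-1.0, 1.0, (200, 2))
T = X.sum(axis=1, keepdims=True)
W, b, beta = train_elm(X, T)
err = float(np.abs(predict_elm(X, W, b, beta) - T).mean())
```

Because training reduces to a single linear solve, ELMs are fast to retrain, which suits a predictive engine whose models must be refreshed as the environment evolves.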

7.2.5 The events engine The events engine component is the component of the model that is responsible for generating events that will affect the speed and direction of evolution and growth within the digital environment.

This component will communicate directly with the environment component to create events that stimulate change when stagnation occurs or promote faster or slower evolutionary changes based on the environment’s constraints and entities.

This component will need to be built as a monitoring application that makes use of the environment’s constraints and rate of change to determine when to create an event, and what the event is supposed to do.
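A minimal sketch of such a monitor, assuming stagnation is detected as a near-zero change in some aggregate environment metric (the metric, threshold and event name are illustrative assumptions):

```python
class EventsEngine:
    """Hypothetical monitor that emits a stimulus event when change stagnates."""

    def __init__(self, stagnation_threshold=0.01):
        self.threshold = stagnation_threshold
        self.last_metric = None

    def observe(self, metric):
        """Return an event name when the metric has stagnated, else None."""
        event = None
        if self.last_metric is not None:
            rate = abs(metric - self.last_metric)  # rate of change between samples
            if rate < self.threshold:
                event = "stimulate_growth"
        self.last_metric = metric
        return event

engine = EventsEngine()
engine.observe(10.0)             # first sample: no baseline yet
event = engine.observe(10.001)   # nearly unchanged: stagnation detected
```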

7.3 Conclusion Chapter 7 has considered the DEEv-MoS model at a high level, where the general structure and components were defined with brief definitions. The information presented here partially addresses the secondary research questions below:

SRQ1: What is the current understanding of what the Digital Environment is, and what does it encompass?

SRQ2: Can an MAS be used to simulate the various components of a digital environment?

SRQ4: Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?

SRQ1 was briefly touched on in the discussion on the environment component of the model, but will be fully answered by the specific chapter on this component.

The use of an entity component system for defining the structure and functioning of the various entities within a digital environment appears promising in answering SRQ2, as it will allow for component-wise construction of larger entities in a flexible and customisable manner.


The models and algorithms used in the predictive modelling engine component of the DEEv-MoS model will strongly influence the answer to SRQ4. Using agents in an ECS-architected system allows for the implementation of a machine learning approach such as ELM, which should be capable of making effective evolutionary predictions about the entities within the digital environment.

Now that an overview of the model and its components has been given, the following chapters will go on to discuss each component in more detail. This will include more specific aspects such as architectures, paradigms and models that can be used for each component, as well as specific concerns regarding data and communication. These chapters will become largely more technical than the previous ones, as they will delve into the components in depth, describing different techniques, comparing these techniques to one another and motivating the final choices made. Code samples will be used, where possible, to illustrate certain functions or how components will be implemented.


8 DEEv-MoS: digital environment The focus of this thesis is on the creation of a model for predicting the evolution of the digital environment. In the previous chapter, a high-level description of the proposed model, Digital Environment Evolution Modelling and Simulation (DEEv-MoS), was given, detailing the various components of the model, and indicating their relationships to one another. Figure 8.1 highlights the component of the DEEv-MoS model that will be covered in this chapter, the digital environment component.

Figure 8.1 The digital environment component of the DEEv-MoS model. The various components of the DEEv-MoS model each deal with an aspect of predicting the evolution of the digital environment; however, the key part of this model is the digital environment itself.

Without a clear definition of the digital environment that will be evolved, it is difficult to define the other components of the model, as they rely on having existing knowledge of the environment in order to make assumptions on how it functions and, in turn, how they will each function individually.


The problem background presented in Chapter 1 makes it clear that the nature of the DE is one of flux and constant change. This is apparent when considering what the DE is composed of and how rapidly the technology and social landscapes are changing and growing. The modern world is one in which technology and its use are ever-increasing, and in which almost all aspects of a person’s daily life are impacted by this technology.

People in today’s computer-driven age make use of technology hundreds if not thousands of times a day, starting in their homes and extending to their travels and places of work. A trip to the shops, the doctor, a holiday resort or even the bathroom can involve interaction with technology on some level.

With all this interaction with technology, an enormous amount of data is generated and stored, whether locally on a device such as a smartphone, watch or camera system, or remotely on a server that resides on the other side of the world. Along with the raw data people create, they also build an online cyber-persona or identity, which over time expands to contain more information about the person and their interests, history and beliefs.

Reiterating the definition given by Laifa, Akrouf and Mammri (2015), all of the above data and infrastructure form the entity known as the DE. To understand this complex mass of data, devices, people, concepts and ideas, let alone predict how it will evolve, is a tremendous undertaking due to its massive scale and the proliferation of technology worldwide.

8.1 The Digital Environment vs a digital environment As was shown at the beginning of this chapter, the Digital Environment (DE) is a huge entity that encompasses the entire world. It encompasses the Internet in its entirety, including the information that lives on it. Physical hardware and people from the tangible world also form part of the DE and must be considered when understanding the DE as a whole. Figure 8.2 depicts what the DE as a whole encompasses.

For the purpose of this thesis, it is important to understand the scope of the digital environment that will be used in the DEEv-MoS model. To attempt to predict the evolution of the DE as a whole would be a massive undertaking and would require mapping out of all existing networks, connected devices and stored data.

The DE is the amalgamation of the physical, logical and abstract digital entities that exist in the modern computer-driven world. It encompasses almost all aspects of daily modern life and extends the persona of people into the cyber-realm. As stated in Chapter 2, this thesis will consider the evolution of the DE in terms of data communication systems and their components rather than the broader, holistic CPS view. This includes personal or business computers and infrastructure as well as cloud-based hardware; even though cloud-based infrastructure and data are physically removed from the point of access or use, they still form part of the DE network and the interactions taking place across it.


A digital environment, on the other hand, is a small subset of the greater DE, encompassing many of the same entities, only on a smaller, more manageable scale. This thesis will deal with using a digital environment for the model and prototype. However, in practice, it could be scaled up to include much more of the DE.

The evolution of a given digital environment occurs and can be measured through the increase in its network size, data generation and volume of interactions. This is restricted to a view of the DE and its evolution from the point of view of a communication network rather than a complex system.

Complex system simulation requires the consideration of not only static behaviours but also dynamic and incidental ones. To model the dynamic and incidental behaviours and their impact on the DE would require further research and the modelling of further subsystems and socio-economic factors, which are out of the scope of this thesis.

The digital environment that will be used for the DEEv-MoS model needs to be defined before any of the model’s components can be detailed, as these components will need to make use of the entities and consider the properties of the digital environment in order to operate effectively. The next section will define the entities that the DEEv-MoS digital environment will consist of, as well as considerations that must be made for each.

8.2 DEEv-MoS digital environment entities In the DEEv-MoS digital environment, an array of entities needs to exist that reflects those that exist in the DE. These entities and their relationships will shape how the digital environment can evolve and how it is represented, and will show the results of the evolution of the system.

This array of entities needs to consist of entities that span from the physical world to the digital world, just as in the DE, so as to give a comprehensive view of what is to become of digital environments in the future.

The following entities will be used in the definition of the DEEv-MoS digital environment and will need to be implemented in the prototype system to test the model:

 network node/server,
 network paths,
 fixed endpoint devices,
 wireless endpoint devices,
 users (agents), and
 data/information.

The number and variety of entities in the DE is much greater than in the list above. However, these base entities should allow the DEEv-MoS model to effectively simulate how a digital environment evolves over time. The list may seem fairly self-explanatory, but there are some considerations that need to be made in the DEEv-MoS model based on each entity. Therefore, each entity will be defined and described in further detail, including whether there are differing types of each entity, what their function is, and how they affect the DEEv-MoS model. Figure 8.2 depicts the DEEv-MoS digital environment, the different entities it contains and the structure of the digital environment.

The first entity that will be discussed is that of users or agents.

8.2.1 Users (agents) In the real-world, people are a very important entity to consider in the DE, as their behaviour and daily interaction shapes the landscape, adding and removing data, creating new network connections and endpoints, and building social networks with other people they know.

People are also responsible for the advancement of technology, which is a big part of the evolution of the DE. However, to model that kind of behaviour would be to model human behaviour and intelligence as a whole and would be the topic of its own thesis.

For the purpose of the DEEv-MoS model, users will be modelled as entities that communicate using networks and endpoint devices. They will also create and delete data and make copies of it, mirroring the high-level behaviour of people on the Internet in general.

Users will be represented as intelligent agents in the model, where they receive percepts from the environment and make use of them along with their goals and performance measures to determine actions.

The agents will have the ability to perform the following actions:

 communicate with another agent,
 create data,
 delete data, and
 copy data.

In terms of connecting to a network, the people agents will have two different means via which they can achieve this: fixed endpoints and mobile endpoints. Each of these will be detailed further in its own section.

Agents will mould the data landscape in the digital environment, creating new sources of information and giving priority to better networks based on location and connection.

The agent type that is most likely the best fit for this scenario is the utility-based agent, as a utility measure can be defined to determine the happiness of agents, which will be used to determine the actions that the agents will take.
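A minimal sketch of such utility-based action selection over the four actions listed above, assuming hypothetical percepts and utility weights (none of these values are prescribed by the model):

```python
def choose_action(percepts):
    """percepts: hypothetical dict with a 'local_data' count and a 'peer_available' flag."""
    utilities = {
        "create_data": 1.0,                                        # baseline drive to create
        "copy_data": 0.5 if percepts["local_data"] > 0 else 0.0,   # nothing to copy otherwise
        "delete_data": 0.2 if percepts["local_data"] > 10 else 0.0,
        "communicate": 1.5 if percepts["peer_available"] else 0.0,
    }
    # The utility-based agent picks the action maximising its happiness measure.
    return max(utilities, key=utilities.get)

action = choose_action({"local_data": 0, "peer_available": False})
```

In a full implementation the utilities would be functions of the agent's goals and performance measures rather than fixed weights.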


Figure 8.2 The DEEv-MoS digital environment and entities.

8.2.2 Data Whether considering the DE or a specific digital environment, the data that is generated, stored and accessed is the backbone of the environment.

Computer and social networks are set up to enable sharing of information and communication, thereby generating large amounts of data. This is, therefore, a very important aspect to model in the DEEv-MoS digital environment, as it drives the behaviour of agents and shapes the networks.

For simplicity’s sake, data in the DEEv-MoS digital environment will be represented by two dimensions: size and type.

This allows the model to simplify specifics about real data while maintaining the value of different types of data, as well as the size, which is a factor in data storage and transmission. At this point, the size dimension of data is a generic placeholder which will be specified later in the DEEP prototype in Chapter 13.

Data will be created by agents (people) and stored on servers or endpoint devices. Data can be copied from one source to another and can also be deleted by agents.
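A minimal sketch of this two-dimensional data representation and the create/copy/delete operations, with hypothetical names (`Data`, `Store`):

```python
from dataclasses import dataclass

@dataclass
class Data:
    """Two-dimensional data representation: size and type."""
    size: int   # generic size units (specialised later in the prototype)
    kind: str   # data type, e.g. "text" or "image"

class Store:
    """Any place data can live: a server/network node or an endpoint device."""

    def __init__(self):
        self.items = []

    def create(self, size, kind):
        d = Data(size, kind)
        self.items.append(d)
        return d

    def copy_to(self, other, d):
        other.items.append(Data(d.size, d.kind))  # copy; the original remains

    def delete(self, d):
        self.items.remove(d)

server, endpoint = Store(), Store()
photo = endpoint.create(5, "image")
endpoint.copy_to(server, photo)  # both stores now hold a copy
```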

8.2.3 Network node/server Network nodes in the real world can range from being a large grouping of dedicated servers that store data and load balance between themselves in order to serve the data to endpoint users, to being single machines that are used to store data of a smaller size and run some business processes.

In the DE, network nodes enable the creation of a digital landscape where people can share and store information, as each node is a source of information and in many cases serves as a backup for other nodes, creating a level of robustness.

Network nodes create shared stores of data that can then be accessed by anyone connected to the network from any physical location (as long as their endpoint device can connect to the network). This gives people the option to access data across a variety of devices without needing to have the data on a device that they carry with them all day. This trend can be seen in the move towards cloud computing and storage, where data is stored and computation is performed on a set of remote servers designed for this purpose, rather than on a person’s local device.

In the DEEv-MoS digital environment, network nodes will serve a similar function as discussed above, allowing agents to store information on and access information from any network-connected endpoint device.

Larger amounts of data will be stored on network nodes, and can be added to by agents, retrieved to endpoint devices and copied to other network nodes in order to increase data availability. Network nodes in the DEEv-MoS digital environment will be characterised by:


 physical location (fixed),
 size (storage),
 power (computation), and
 connections (maximum concurrent).

Depending on the size of the network node, it will only be able to serve so many agents at a given time, making it important for data to be duplicated on and accessible from more than one node.
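A minimal sketch of a network node with these four properties, assuming that connection requests beyond the maximum are simply refused (forcing agents to fall back on replica nodes):

```python
from dataclasses import dataclass

@dataclass
class NetworkNode:
    location: tuple       # physical location (fixed)
    storage: int          # size (storage)
    power: int            # power (computation)
    max_connections: int  # connections (maximum concurrent)
    active: int = 0

    def connect(self):
        """Accept a connection if below the cap; otherwise refuse it."""
        if self.active >= self.max_connections:
            return False  # caller should fall back to a replica node
        self.active += 1
        return True

node = NetworkNode(location=(0, 0), storage=1000, power=8, max_connections=2)
results = [node.connect() for _ in range(3)]  # third request is refused
```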

8.2.4 Network path When creating a network of computers, the network paths between the various machines in the network are important. They create the connectivity to form the network and, depending on the medium used, determine limitations of the network.

When considering the hardware that is used to create a network in the DE, network paths form the link between computers and allow for communication and data transfer. On a very simplified level, most network paths make use of some physical cabling that runs from one computer to another, and the type of cable determines the speed of the connection and how far it can be sent.

Some network paths cannot be created using physical cabling due to geographic or environmental constraints, so wireless links are used. Wireless links include the use of radio waves and even satellites to project data across large distances.

Network paths are one of the components of the DE that experience the most change, especially with the advent of wireless technology, as connections are made and terminated from mobile devices from almost any physical location.

In the DEEv-MoS digital environment, network paths will be used to indicate connections between nodes and endpoint devices in the environment, allowing for communication and data activities to take place. To differentiate between differing mediums and capabilities, network paths will have the following properties:

 medium (wired/wireless),
 bandwidth (data throughput), and
 maximum distance.

This will allow the DEEv-MoS model to effectively model the connectivity between different physical devices in the digital environment, and to create realistic scenarios based on network layout and pathing options.
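A minimal sketch of a network path with these three properties, adding an illustrative transfer-time estimate (size divided by bandwidth) and a range check (units and values are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class NetworkPath:
    medium: str          # "wired" or "wireless"
    bandwidth: float     # data units per time unit
    max_distance: float  # maximum span of the medium

    def can_span(self, distance):
        """Can this medium physically cover the given distance?"""
        return distance <= self.max_distance

    def transfer_time(self, size):
        """Simple throughput model: time = size / bandwidth."""
        return size / self.bandwidth

fibre = NetworkPath("wired", bandwidth=100.0, max_distance=10.0)
in_range = fibre.can_span(25.0)        # beyond the medium's range
duration = fibre.transfer_time(500.0)  # time units to move 500 data units
```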


8.2.5 Fixed endpoint device In any type of network there are endpoint machines that are used by people to interface with the network; send communications; and create, delete or copy data. Traditionally, these endpoints would be desktop computers that have a fixed physical location.

These fixed location endpoints would exist in people’s homes or offices and would very rarely, if ever, move. If a person wanted to access data or communicate with others on the network, they would need to go to the physical location of the endpoint computer and do so from there.

Fixed location endpoints are, just like large network nodes, generally connected to the network via a fixed cable-based connection, which limits the flexibility of moving the endpoint’s location. In the DE, these fixed location endpoints tend to store personal or business data which is not replicated on the rest of the network. However, copies of other data from servers would also exist on the endpoints due to people accessing and downloading it.

In the DEEv-MoS digital environment, fixed endpoint devices will represent these types of computers connected to a network, where they do not move location and have a wired connection that links the device to the network, often in a continual fashion.

Fixed endpoint devices will be represented by the following properties:

 physical location (fixed),
 size (storage),
 power (computation), and
 connections (fixed, wired).

This will allow the model to account for these personal/business devices that tend to have long-term connections to a network and do not move physical location often, if ever. These endpoints will also house data specific to a certain agent, as well as copies of data from network nodes.

8.2.6 Mobile endpoint device As technology evolved and a need for flexibility in connecting to networks was created, mobile endpoint devices became more popular. Unlike their fixed location counterparts, mobile endpoints have their own power supply in the form of a battery and are compact.

Mobile endpoint devices were then designed to make use of wireless transceivers, allowing them to make use of different forms of radio waves to communicate and connect to existing physical networks.

With further needs to access data at any point in time and from any location, mobile devices were improved to have greater data storage capacity and computational power.


In the DE, these devices include smartphones and compact mobile computers such as laptops. These devices do not need fixed cable-based connections and can therefore connect to the network from any physical location (within range of wireless transmitters). These types of devices are extensively used worldwide, with mobile smart devices now outnumbering people.

Mobile devices generate large amounts of user data and are a primary means of communication in the modern world. Pictures, videos, text and telemetry data are some of the main types of data generated and are stored locally on the device. However, more and more remote storage services are used so that people have the ability to create and store more data without relying purely on the device’s specifications.

In the DEEv-MoS model, mobile endpoint devices will be represented in a similar manner to fixed devices:

 physical location (dynamic),
 size (storage),
 power (computation), and
 connections (mobile, wireless).

This will allow the model to effectively capture the interaction of agents on the move, taking into account current trends in mobile device usage. Agents will be able to access, generate and delete data using mobile endpoints, and will create a dynamic network, with mobile endpoints connecting on an as-needed basis and from changing locations.

The six entities defined above, along with their properties and relationships, define the DEEv-MoS digital environment. The relationships between the entities of the DEEv-MoS digital environment are depicted at a high level using UML in Figure 8.3.


Figure 8.3 UML component diagram describing the relationships between entities in the DEEv-MoS digital environment (created with draw.io). To effectively implement each entity in the DEEv-MoS digital environment, it is important to understand the properties of the environment. The following section will describe the DEEv-MoS digital environment properties.

8.3 DEEv-MoS digital environment task environment When defining an environment within which intelligent agents will operate, it is important to be able to describe the environment so that the design and implementation of the agents can be effective for the given environment.

In Chapter 3, intelligent agents were defined along with the environments that they operate in. For agents to act rationally, it is important for them to be prepared for the task environment in which they will exist. This shapes the design of the agents, their performance measures and the actions that they can take. Task environments can be described on a high level by using the PEAS description, which describes the environment in terms of four important aspects. The next subsection will look at the PEAS description of the DEEv-MoS digital environment.


8.3.1 PEAS description The PEAS description of any given task environment describes the environment in terms of the following four aspects (Weiss, 2013):

 Performance: what is the performance measure to which agents must aspire?
 Environment: what will the agent encounter in the environment, and what is the environment?
 Actuators: what actuators do agents have access to in the environment, and what actions can they perform with them?
 Sensors: what sensors do agents have access to in the environment, and what information can be obtained from the environment through the sensors?

In the DEEv-MoS digital environment, only one type of agent will exist, which fulfils the role of a person in the environment. These agents will reproduce and create new offspring, while older agents are terminated. The agents will operate in a digital environment in which entities representing networks, endpoints, network paths and data exist. Agents in the DEEv-MoS digital environment will be driven by a utility function that is tied to the access and creation of data in the environment, with agents capable of creating, deleting, copying and viewing data.

A PEAS description of the DEEv-MoS digital environment provides information regarding the main agents that are present in the environment. In this case the people agents are the main type of agents that exist and operate in the environment. The interactions of these agents with each other and with the infrastructure entities shape the growth of the environment through the initiation of communication and forming of online communities in the model. With greater interaction and a larger need to store and communicate data efficiently, infrastructure growth occurs and so does data volume.

The DEEv-MoS model makes use of an ECS to structure the relations between components of the DE and allow for a data-driven approach, which aligns with the data- and communications-focused evolution of the digital environment. The PEAS description denotes how an important component of the ECS digital environment is able to interact and ultimately shape the environment’s growth.

Based on the summary description above, the PEAS description for the DEEv-MoS digital environment regarding the people agents is as follows:

Agent Type: People.

Performance Measure: store data; access data when needed; copy data from server to endpoint; copy data from endpoint to server; communicate with another agent.

Environment: the digital environment, other agents, servers/network nodes, endpoint devices, network paths, data.

Actuators: create data; delete data; copy data; communicate with other agents.

Sensors: endpoint device, data storage, data access.

The PEAS description indicates which different aspects need to be taken into consideration when designing and implementing the people agents in the DEEv-MoS digital environment.

Agents in the DEEv-MoS model are intended to replicate the interactions between people and/or machines in the DE in the form of data communication.

Further specifics regarding the environment are, however, required in order to design effective agents and entities. There are a number of important properties used to describe a task environment as a whole and identify the unique traits of that specific environment. These properties were defined in Chapter 3 and consist of seven separate descriptors which define aspects ranging from how time is dealt with to how many agents exist.

For a full recap of the properties and what they represent, please refer back to the “Task Environment” section of Chapter 3. In the following section, the seven properties of a task environment are summarised and a definition for the DEEv-MoS digital environment properties is provided.

8.3.2 Task environment properties Task environments vary substantially from one to another, each having specific characteristics that need to be accounted for. To describe a task environment’s characteristics effectively, seven properties are used, each addressing an important aspect of the environment that needs to be accounted for when designing agents and systems.

The seven properties are listed below with a short description of each (Russell & Norvig, 2010; Weyns & Michel, 2014):

 Observability: how much of the environment’s state is accessible to the agent through its sensors.
 Agents: how many agents exist in the environment.
 Determinism: what is the certainty of outcomes resulting from agents’ actions in the environment?
 Episodic: do agents’ actions affect the future states of the environment or only the current state?
 Static: does the environment change during the time it takes for an agent to make a decision on an action to perform, or does it wait for the agent?
 Discrete: how is time handled in the environment with regard to state? Are there clearly defined states at points in time, or do they flow into each other?
 Known: what knowledge does the agent have of the effects of its actions in the environment?

Based on the above properties, the following description of the DEEv-MoS environment is provided, where each aspect is addressed along with an explanation:

Property Type DEEv-MoS Property Reason

Observability Partially observable Agents can perceive data that is connected to their endpoint device or on a server they communicate with. However, they cannot see all data, agents or nodes that exist at all times in the environment.

Agents Multi-agent There are multiple people agents present in the digital environment. The type of multi-agent relationship is that of partial cooperation as it is in the best interest of all agents to follow the rules of using a network. Agents do, however, compete for resources, making the environment partially competitive at the same time.

Determinism Stochastic The result of an agent’s actions can be determined on an atomic level local to an endpoint or network node. However, the outcomes in the bigger picture are not known due to the emergent complexity of such an environment. This leads to computational irreducibility regarding the outcomes in the environment as a whole.

Episodic Sequential Interactions that agents have with the environment and its entities directly impact the future states of those entities. The outcomes of an agent’s actions are not isolated to only that point in time.

Static Dynamic Multiple agents are acting in the environment at the same time, with their actions affecting the state. This means that an agent cannot be sure that the state is the same from when it began deliberating to when it performs an action.

Discrete Continuous Time in the DEEv-MoS digital environment runs in a continuous manner, with there not being finite, defined states that agents can arrive at. With current plans for a single-instance application, synchronisation should not be an issue.

Known Known The agents know the rules and the actions available to them in the environment; they are provided with prior knowledge of the environment.

8.4 Conclusion Chapter 8 has taken an in-depth look at the DEEv-MoS digital environment, defining the various entities it is composed of and their relation to one another. Six entities were defined that encompass the different roles present in the Digital Environment at a high level, as modelling all entities and relations would be unrealistic within the scope of this thesis.

To understand how best to design the entities of the DEEv-MoS digital environment, it is important to describe the environment itself in terms of the important expected aspects and properties. The task environment was described at a high level using the PEAS description, which defined the performance measures, environment, actuators and sensors that would exist in the DEEv-MoS digital environment. In order to design agents that operate effectively in their task environment, more detailed definitions of certain properties are required.


The seven main properties of the DEEv-MoS digital environment were defined to address this. These properties describe how the environment functions and what considerations need to be made when designing and implementing agents to operate in the environment. The task environment properties were listed in a tabular format, stating the property along with the reasoning for the property being selected.

The information presented in this chapter is of relevance to the following secondary research questions:

SRQ1: What is the current understanding of what the Digital Environment is, and what does it encompass?

SRQ2: Can an MAS be used to simulate the various components of a digital environment?

In Chapter 2 the DE was introduced. It was shown that in the present computer-driven world, the DE has grown over time to encompass physical, logical and abstract digital components. This includes the Internet, all of the data it contains, the physical infrastructure it is made up of, as well as the abstract concepts and cyber personas created on it.

For this thesis, considering the entire DE would be too large an undertaking. Therefore, a smaller scale controlled digital environment is needed to test the DEEv-MoS model with. It must be noted that the reductionist view on the DE vs a digital environment is a limitation of the current DEEv-MoS model and will apply only to cases where the components are considered to form linearly composable systems within the DE. Section 8.1 discussed the difference between the DE and a digital environment. In this chapter, the DEEv-MoS digital environment was defined, describing the six main entities from the DE that have been chosen to be modelled. These six entities, along with their associated data and relationships between each other, define the digital environment that will be used in the DEEv-MoS model. The six entities modelled in the DEEv-MoS digital environment are:

 users (agents),
 data,
 network nodes,
 network paths,
 fixed endpoint devices, and
 mobile endpoint devices.

SRQ1 is therefore addressed sufficiently with regard to the DEEv-MoS model, as a specific digital environment is scoped and defined.

One of the main six entities that is incorporated into the DEEv-MoS digital environment is agents, which are used to represent people in the DE. The agent entity is extremely important for modelling how a digital environment evolves over time; in the real world, people, their behaviour and their actions shape the DE.

In the DEEv-MoS model, multiple agents exist to represent how, in the real world, multiple people interact with the DE. The DEEv-MoS digital environment is therefore an MAS, with each agent operating independently of the other agents. Section 8.3 defined the properties of this MAS.

The DEEv-MoS digital environment can be considered to be a partially collaborative, partially competitive MAS, due to agents competing for resources (data and network nodes) but also needing to follow rules that benefit all agents.

SRQ2 is therefore answered, since the DEEv-MoS digital environment can be considered to be an MAS. This MAS contains different entities, each of which is designed to replicate an important component of the DE in the real world.

Chapter 9 discusses the entity component system component of the DEEv-MoS model in detail, defining the purpose of this component and the types of entities, components and systems that will be made use of.


9 DEEv-MoS: entity component system

Part One of this thesis consists of a literature review which covers important related topics for the purpose of providing background to the proposed model, DEEv-MoS. The literature review gave detail on areas of interest that are used to define the components of the DEEv-MoS model.

Chapter 5 of the literature review concentrated on programming paradigms; the concepts used; and relationships between programming languages, concepts and paradigms. Widely-used concepts were described along with the abilities they unlocked, as well as their consequences. Some popular paradigms in modern software engineering were discussed, highlighting their unique combinations of concepts and what they are suited for. Figure 9.1 highlights the component of the DEEv-MoS model that this chapter will cover.

Figure 9.1 The entity component system component of the DEEv-MoS model.

The entity component system paradigm, which is relatively new (Bilas, 2002) compared to others, was one of the paradigms discussed. It was shown that, in comparison to the popular OOP approach currently used in many applications, ECS designs are better suited for data-driven applications by decoupling the data from the methods and entities themselves.


Figure 9.2 UML describing the DEEv-MoS entity component system.


The ECS approach therefore seems appealing for simulation applications as well, where multiple entities exist that can have specific properties which change their behaviour in the simulated environment. For this reason, the DEEv-MoS model makes use of an ECS pattern to enhance the ability of the DEEv-MoS digital environment to model various types of entities from the DE.

Chapter 9 is focussed on the DEEv-MoS entity component system, defining the main three aspects of this pattern, along with their mappings to the digital environment that was defined in Chapter 8. The main three aspects of the ECS are:

 entities,
 components, and
 systems.

The relationships of these aspects are described, along with the properties that will need to be taken into consideration for entities in the DEEv-MoS digital environment.

The three important aspects of an ECS design are discussed in sections 9.1 – 9.3, starting with entities. Figure 9.2 depicts a UML component diagram of the DEEv-MoS entity component system.

9.1 Entities

Entities, as stated previously in this chapter and in Chapter 5, are the high-level containers that represent objects. In game development, these would be all types of game objects; in the case of the DEEv-MoS digital environment, they represent all entities within the environment.

Chapter 8 detailed the DEEv-MoS digital environment, describing the entities that would need to be modelled in the environment and what attributes need to be associated with each one in order for it to function correctly. Entities in the DEEv-MoS digital environment are representative of real-world entities that form the DE, with each having a specific type of relationship with the others.

The main entity archetypes that will be used in the DEEv-MoS model ECS are merely equivalents of those defined in Chapter 8 for the digital environment. There are six entity archetypes, namely:

 agents,
 data,
 network nodes,
 network paths,
 fixed endpoint devices, and
 mobile endpoint devices.


The above six entity archetypes will be used as templates for actual entities in the ECS; note that, since ECS designs do not use classes, these are not classes to be instantiated into new objects for each entity. Numerous actual entities of each archetype can be created in the ECS, each with a GUID and an associated list of components.

Besides the GUID, what differentiates entities of a specific archetype from one another is not the components they are made of, but rather the values of each of those components. Components and their values will be discussed further in section 9.2, but for now some high-level component descriptions are used to define each of the ECS entities listed above, along with their components.

Each entity in the DEEv-MoS ECS will require a render component which, based on its values, will determine how the entity is rendered graphically in the simulation. The other components that entities of the different archetypes will be composed of are as follows:

 Agent:
o Location component: this component defines the agent’s location in the environment space and on the simulation rendering.
o Fixed endpoint component: this component defines the fixed endpoint the agent has access to.
o Mobile endpoint component: this component defines the mobile endpoint the agent has access to.
o Data behaviour component: this component defines what type of data behaviour an agent makes use of (i.e. creator, consumer etc.).
 Data:
o Location component: this component defines the data’s location in the environment space and on the simulation rendering.
o Data type component: this component defines the type of data (i.e. video, image, text etc.).
o Data size component: this component defines the size of the data (in terms of file size).
o Author component: this component defines the author agent of the data.
 Network node:
o Location component: this component defines the node’s location in the environment space and on the simulation rendering.
o Size component: this component defines the storage size of the node (in terms of disk storage size).
o Power component: this component defines the computational power of the node.
o Connections component: this component defines the number of connections a node can have to other nodes or endpoint devices.
o Node type component: this component defines the type of node (i.e. computational, storage, replication etc.).


o Data type component: this component defines the type of data that the node handles (i.e. video, image, text etc.).
 Network path:
o Connection type component: this component defines the type of connection (wired or wireless).
o Connection distance component: this component defines the maximum distance the connection can span.
o Connection speed component: this component defines the bandwidth of a connection.
o Endpoints component: this component defines the endpoints that the connection is established between.
 Fixed endpoint device:
o Location component: this component defines the endpoint’s location in the environment space and on the simulation rendering.
o Size component: this component defines the storage size of the endpoint (in terms of disk storage size).
o Power component: this component defines the computational power of the endpoint.
o Connections component: this component defines the number of connections an endpoint can have to other nodes or endpoint devices.
 Mobile endpoint device:
o Location component: this component defines the endpoint’s location in the environment space and on the simulation rendering.
o Size component: this component defines the storage size of the endpoint (in terms of disk storage size).
o Power component: this component defines the computational power of the endpoint.
o Connections component: this component defines the number of connections an endpoint can have to other nodes or endpoint devices.

The above entity archetypes represent the structure that individual entities in the environment will have. Each entity will merely exist as a database entry that maps entity GUIDs and archetype names to components, making it simple to look up all entities that have a specific name or contain a specific component.
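The database-style mapping described above can be sketched as a simple in-memory registry. This is an illustrative sketch only, not the DEEv-MoS implementation; names such as `EntityRegistry` and the dictionary layout are assumptions made for the example.

```python
import uuid


class EntityRegistry:
    """Maps entity GUIDs and archetype names to their component data."""

    def __init__(self):
        # guid -> {"archetype": str, "components": {component name: data}}
        self.entities = {}

    def create(self, archetype, components):
        """Register a new entity of the given archetype and return its GUID."""
        guid = str(uuid.uuid4())
        self.entities[guid] = {"archetype": archetype, "components": components}
        return guid

    def with_component(self, name):
        """Look up all entities that contain a specific component."""
        return [g for g, e in self.entities.items() if name in e["components"]]

    def of_archetype(self, archetype):
        """Look up all entities of a specific archetype."""
        return [g for g, e in self.entities.items() if e["archetype"] == archetype]
```

An agent entity, for instance, would be created by passing the archetype name "agent" together with its location, endpoint and data behaviour components; no class is instantiated per entity.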

In ECS designs, as stated before, all of the data resides in the components, while none of the methods do. Components are therefore the main building blocks of ECS systems, owing to the data-driven nature of the paradigm.

Components define the properties that an entity will possess and ultimately how the entity will behave. Section 9.2 takes a detailed look at components, defining components in the DEEv-MoS ECS, what attributes they store and what the intention of each is.


9.2 Components

The important aspects that entities in the DEEv-MoS ECS may possess include aspects that describe the ‘physical’ attributes of the entity, as well as aspects that determine its behaviour.

The following are the main aspects that will be considered in the DEEv-MoS ECS to shape its entities and model the corresponding properties and behaviours from the related real-world DE entities:

 renderable,
 location,
 data behaviour,
 storage,
 computation,
 connectivity,
 connections,
 devices, and
 data.

Note that when referring to ‘data’ in the context of component types, this refers to the data entity that is used to model arbitrary data files in the DEEv-MoS digital environment. It does not refer to the data portion of an ECS which is separated from the logic and stored in components.

Each of the above-mentioned aspects will be created as a component in the DEEv-MoS ECS that can be composed into an entity. Each of the listed components can, in turn, consist of one or more attributes which describe the aspect that the component represents.

Component attributes can range in data type from integer or float values to strings or even images. It becomes important later on to understand the attribute data types when designing the systems that will act on or use the attributes, as the systems need to know how to handle the data type, and the data type also shapes what actions can be performed on the attributes.

A breakdown of the DEEv-MoS ECS components is given below, along with their relevant attributes, data types and descriptions. Attributes are represented in the following format:

attribute: [type] description.

The components and their attributes are as follows:

 Renderable
o Colour: [hexadecimal] colour of item in simulation rendering
o Shape: [enum] type of shape used to represent item in simulation
o Size: [float] size of object in terms of radius/length


 Location
o X: [float] X coordinate of item on simulation plane
o Y: [float] Y coordinate of item on simulation plane
 Data behaviour
o Behaviour: [enum] type of data behaviour item takes on
o Frequency: [int] frequency at which the behaviour is enacted
 Storage
o Size: [int] size of storage in terms of megabytes
o Access mode: [enum] what type of access to the storage is allowed
 Computation
o Computational power: [float] computational power represented as clock cycles in gigahertz
 Connectivity
o Connection type: [enum] type of connection to another item
o Connection speed: [int] speed of connection represented in megabits per second
 Connections
o Maximum connections: [int] maximum number of concurrent connections allowed to/from item
o Connected items: [array] list of other items this item is connected to
 Devices
o Mobile endpoints: [array] list of mobile endpoint devices belonging to item
o Fixed endpoints: [array] list of fixed endpoint devices belonging to item
 Data
o Data type: [enum] type of data represented
o Data size: [int] size of the data file represented in megabytes

The components listed above should create an ECS in which entities can be built by specifying the components they have, allowing them to take on the attributes of their counterparts in the DE.
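The attribute format above can be illustrated with plain data records that hold values but no methods. This is a hypothetical sketch of a few of the listed components; the class and field names mirror the attribute breakdown, while the enum values are assumptions for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Shape(Enum):
    """Type of shape used to represent an item in the simulation."""
    CIRCLE = 1
    SQUARE = 2


@dataclass
class Renderable:
    colour: str    # hexadecimal colour, e.g. "#FF0000"
    shape: Shape   # enum shape type
    size: float    # size in terms of radius/length


@dataclass
class Location:
    x: float       # X coordinate on the simulation plane
    y: float       # Y coordinate on the simulation plane


@dataclass
class Storage:
    size: int          # storage size in megabytes
    access_mode: str   # an enum in the full design


@dataclass
class Connections:
    maximum_connections: int                      # concurrent connection limit
    connected_items: list = field(default_factory=list)  # connected item GUIDs
```

Note that none of these records contain program logic; the systems described in section 9.3 hold all of the code that reads and writes this data.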

Being able to define attributes for an entity through the use of components gives great flexibility and increased performance to an application where there are continually changing data points and methods. Entities can be built entirely by specifying the constituent components and without needing to worry about the logic that would act on the components.

The actual code that manipulates or accesses entity data in the components can be kept completely separate and maintained in specific systems designed to perform particular procedures on specific component data. This is the last important part of ECS approaches, with each system being its own modular code section with procedures that only touch specific components instead of needing to handle them all.


Section 9.3 describes ECS systems and, in particular, details the systems in the DEEv-MoS ECS, describing the purpose of the system, what components it deals with and what actions it performs.

9.3 Systems

In the DEEv-MoS ECS a number of systems are required for both the simulation part of the model as well as the predictive part. These systems will run on a continual schedule, executing against all components of particular types in the environment, creating an ever-changing simulation of the DEEv-MoS digital environment.

The main systems that will be made use of in the DEEv-MoS ECS are as follows:

 rendering system,
 data system,
 location system,
 device system,
 computation system,
 storage system,
 connectivity system,
 connections system,
 data behaviour system, and
 events system.

These systems each interact with the correlating component defined in section 9.2, with the only addition being the events system, which interacts across all components.

Each system must serve a purpose in the ECS: there is no point in having a component type with no system to act on it. Conversely, a system whose specified component is used by no entities is undesirable but not harmful, since a new entity could be defined at any time that makes use of the given component, and the system would then run and perform its job. A component with no system, however, would serve no purpose at all.

The following is a more detailed look at the systems listed above, defining the components they act on, along with a description of their purpose in the DEEv-MoS ECS. The systems are as follows:

 Rendering system
o Components: renderable
o Purpose: this system is designed to render environment entities in the DEEv-MoS digital environment onto the simulation canvas, rendering each item type in a way that distinguishes it from others.


 Data system
o Components: data, connectivity
o Purpose: this system is responsible for the creation, deletion and duplication of data entities in the system. It takes into account data attributes and connectivity between entities.
 Location system
o Components: location
o Purpose: this system is intended to update entity position data on the environment coordinate plane.
 Device system
o Components: devices
o Purpose: this system is responsible for updating device data for a given entity. This data indicates ownership of a device.
 Computation system
o Components: computation
o Purpose: this system is responsible for calculations based on an entity’s computational power.
 Storage system
o Components: storage
o Purpose: this system is responsible for calculations regarding the storage of an entity.
 Connectivity system
o Components: connectivity
o Purpose: this system is intended to perform calculations on an entity’s connectivity, as well as perform updates.
 Connections system
o Components: connections
o Purpose: this system is responsible for maintaining connections between different entities in the environment.
 Data behaviour system
o Components: data behaviour
o Purpose: this system is responsible for executing and maintaining entity behaviour with regard to data, based on the data behaviour component.
 Events system
o Components: renderable, location, data behaviour, storage, computation, connectivity, connections, devices, data
o Purpose: this system is intended to introduce changes to the environment and its entities through events that occur in the environment.

Each of the systems above deals with an important aspect of the DEEv-MoS ECS and digital environment, executing to shape the interactions between entities and introducing changes to the environment.
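The per-component execution pattern used by these systems can be sketched as follows. This is an illustrative example, not the DEEv-MoS code: the function name `location_system` echoes the system above, while the `velocities` input is a hypothetical stand-in for whatever drives entity movement.

```python
def location_system(entities, dt, velocities):
    """Update position data for every entity that has a 'location' component.

    `entities` maps guid -> component dict; `velocities` maps guid -> (vx, vy).
    """
    for guid, components in entities.items():
        loc = components.get("location")
        if loc is None:
            continue  # systems skip entities lacking their component
        vx, vy = velocities.get(guid, (0.0, 0.0))
        loc["x"] += vx * dt
        loc["y"] += vy * dt


# A scheduler would run each system in turn on a continual schedule, e.g.:
# for system in (rendering_system, data_system, location_system, ...):
#     system(...)
```

Because each system only touches its own component type, systems stay modular: adding or changing one system does not require changes to the others.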


Entity component systems as a whole have been shown to be very well suited for applications that need to be flexible, are heavily data oriented and need constant changes made to the data.

Traditionally, ECS approaches have enjoyed popularity in the development of games, simulations, microservices, editors and other applications designed to be easy to unit test.

9.4 Conclusion

Chapter 9 has discussed the DEEv-MoS entity component system in detail, defining its three main parts: entities, components and systems, and detailing how these parts fit together in creating an effective application.

The difference between traditional OOP object-driven approaches and newer ECS data-driven approaches was explored. It was shown that OOP approaches can become very complex and difficult to maintain in the long run when many substantial changes are required over time. In contrast, the ECS approach allows for creating more modular and maintainable applications which do not deteriorate in structure over time as OOP applications do. This stems from the separation of data and logic in ECS approaches, with each being dealt with in completely different parts of the application.

Entities were discussed as being the high-level containers that define a given item in an application environment. Unlike classes in OOP, entities do not contain data or methods. Instead, they are simply unique identifiers with which attributes are associated, creating items in the environment. Entities merely exist to group aspects of an item together to define what it is and how it interacts in the environment.

The key part of data-driven approaches is the data itself, with all operations occurring on the data, driving actions and changes in the environment. Components are the parts of the ECS that contain all of the data relating to an aspect of the item, with each aspect being known as a component. Components can contain multiple data attributes for a particular aspect. However, no program logic exists in the components, making them classification labels paired with data structures used to indicate that a given entity has some aspect.

In OOP approaches, the program logic is contained in the class along with the data, giving classes the ability to execute small sections of code against very specific object instances. Classes struggle, however, to execute global logic against everything in the system, which is where ECS systems shine. Systems contain all of the code that performs actions in the environment, and they apply the actions globally to entities with specific component types. Each system normally only deals with one component, executing logic on all instances of that component’s data, running on a continual basis one after the other.

The information presented in this chapter is of relevance to the following secondary research question:


SRQ3: Can an entity component system design be used effectively to represent a digital environment and its constituent entities?

Chapter 5 discussed programming paradigms, focussing on defining the difference between programming concepts, paradigms and languages. It was shown that paradigms are shaped by the concepts they use, and that programming languages implement these concepts to solve a problem.

The main concepts that shape a paradigm were listed as:

 independence,
 records,
 lexically scoped closures, and
 named state.

Each of these has very important implications for what is possible using the paradigm it is part of. Records (or data) are a key concept that needs to be considered more and more carefully in modern programming due to the greater availability of, and variance in, data types.

A popular paradigm in modern programming is that of object-oriented programming, where the emphasis is placed on using abstract definitions for objects called classes. Classes define both the data and the logic of the application, making use of interfaces to hide data from direct access.

OOP approaches make use of inheritance to extend classes for objects with common properties, allowing the base class’s attributes and methods to be used in the child. This was intended to make it easier to create modular code that could easily be maintained and changed as needed. In practice, the preference for inheritance often had the opposite result: large programs with many object types developed extremely complex inheritance hierarchies, in which a change in one place would need to cascade to many others. This created bloated and fragile code bases that took a lot of time and effort to alter and, because code executes individually on each object, performance would suffer, especially when large numbers of objects were created.

Game developers needed a better way of implementing and maintaining their applications, especially if they wanted to support the game for long periods of time with new features, bug fixes and balance changes. As a result of this need, a shift to a data-driven development approach occurred, a style used by the entity component system paradigm. In the ECS approach, the data and logic are separated from ‘objects’ and kept in their own independent parts of the system. In this way, components could be used to strictly define the attributes of entities in the game, while systems focussed on implementing the execution code that acts on the data.

Because systems act on all entities with a given component, global processing of the data could be performed and separated into different modules. This made the code base easier to maintain and make additions to, without affecting many other parts of the system.

The DE is an immensely complex mass of entities and relationships, which already makes using an OOP approach infeasible. By modelling it through the use of an ECS, individual entities can be represented and acted on by the systems, making the sheer number of entities less of a concern. New entity types can also easily be added by composing them from existing components to give them desired characteristics and behaviours.

SRQ3 is therefore answered adequately; due to the complexity and scale of digital environments, OOP solutions would struggle with performance and become inflexible to future change. An ECS approach overcomes these shortcomings, making it an ideal paradigm for use with digital environments and the DEEv-MoS digital environment in particular.

Chapter 10 will discuss the machine learning engine component of the DEEv-MoS model, detailing the ML models that are implemented, their purpose, and what considerations need to be made when implementing them within the model.


10 DEEv-MoS: predictive modelling engine

In the literature review chapters of this thesis, relevant background information was presented on important fields related to the main problem background and areas that are of value to answering the research questions and defining a model. Figure 10.1 highlights the component of the DEEv-MoS model that this chapter will describe.

Figure 10.1 The predictive modelling engine component of the DEEv-MoS model.

Chapter 10 is concerned with the DEEv-MoS machine learning engine, defining what it is designed to do and how it achieves its objectives, and also explores the field of ML more thoroughly in the context of predicting changes.

The predictive modelling engine in the DEEv-MoS model will make use of data regarding the entities in the DEEv-MoS digital environment and interact with it through the ECS defined in Chapter 9. The entities that the ML engine will need to take into consideration are the following six, as defined in Chapter 9:

 agents,
 data,
 network nodes,
 network paths,
 fixed endpoint devices, and
 mobile endpoint devices.

The ML engine will need to make use of available environment data to predict changes with regard to each of the above six entities. It will affect the environment through one of the DEEv-MoS ECS systems, keeping in line with how ECS approaches separate data from logic, where execution logic lives in the systems which then act on the data in the components.

10.1 DEEv-MoS machine learning

In the DEEv-MoS model, the various components and entities are used to model the real-world DE as closely as possible. It is important for the DEEv-MoS digital environment to contain the same types of entities as the DE, at least as closely as possible; in reality there are so many different entities to consider that modelling all of them would become very complex.

Besides the actual entities in the DEEv-MoS digital environment, it is equally important to model their interactions and behaviour, as these drive change in the digital environment. In the DEEv-MoS digital environment, these interactions are driven by the agent entities.

As entities act to fulfil their utility, they create, modify, delete and copy data entities, causing data on network nodes and endpoint devices to change. If enough changes occur on a network node, for instance, and it comes to hold a large amount of valuable information, then more and more agents will want to interact with it, and connections to it from other nodes will become more valuable in allowing various endpoints to connect to it. Network nodes, connections and endpoints cannot, however, be interacted with in an unlimited fashion, as that would be unrealistic.

There are factors that affect the volume and frequency of interaction within the DE. These are usually tied to physical attributes of the DE and the hardware that it is made up of. These same factors need to play a role in the DEEv-MoS digital environment if it is to effectively simulate how the DE changes and evolves over time.

In the real world, the factors that affect the DE and that will be considered in this thesis are:

 hard drive storage size,
 computational power,
 network data transfer speed,
 network size, and
 file size.


The majority of the above factors, namely hard drive storage size, computational power and network data transfer speed, relate to the physical hardware that the DE runs on. Two of the factors, however, are indirectly related to the physical capabilities of network hardware. These are file size and network size.

Hard drive storage space relates directly to current technology’s ability to pack information more densely onto a single track of a magnetic disk drive. This is also known as the disk areal storage density, with greater density allowing for a much larger amount of data to be stored on the same physical size of disk drive (usually based on a two platter 2.5-inch disk drive).

Computational power relates directly to current technology’s ability to fit more transistors on dense integrated circuits. As advancements in technology are made, more transistors can fit on a chip and more computational power can be made available on the same sized chip. This allows for more powerful computers to be made without taking up large amounts of space.

Network data transfer speed relates to current technology’s ability to transmit information over network cables or via radio waves. The greater the number of bits that can be transmitted in the medium per second, the greater the network speed and ability to transfer larger data sets in a shorter amount of time.

Network size is not directly related to current technology, in the sense that technology does not really place a limit on the number of nodes in a network. This is affected by growth of computer use and online data storage, and the demand for data availability.

File size is also indirectly related to physical hardware, as a file can be created to be as big as possible. However, technology governing the storage, use and transmission of files does impact it. Large files are more a result of the other advancements than a technological advance. However, compression technology has made it possible to store more data than was previously possible in the same sized file.

Each of these factors needs to be modelled in the DEEv-MoS digital environment, as each has greatly contributed to the evolution of the DE in the real world today.


Figure 10.2 Planned neural network architecture of extreme learning machine in the predictive modelling engine.


To model the phenomena around the development of these factors, mathematical algorithms can be used. However, these tend to predict future values on an unrealistic trajectory that cannot be maintained due to physical technology constraints. Related to the factors above are a number of ‘laws’ that scientists and researchers have created to describe the growth, and predicted growth, in each field.

The ‘laws’ can be more accurately described as observations, however, as they do not actually govern the growth of the given technology. Examples of these ‘laws’ are:

 Moore’s Law: predicts the growth in the number of transistors that can be placed on an integrated circuit. This allows for more powerful computers to be created (Moore, 1965).
 Kryder’s Law: predicts the growth in hard disk drive areal density, allowing for greater storage of data (Walter, 2005).
 Keck’s Law: predicts the growth in the number of bits per second that can be transmitted down optical fibre. This allows for faster network speeds (Hecht, 2016).
 Edholm’s Law: predicts the growth in the bandwidth of telecommunications networks. This allows for faster network speeds (Cherry, 2004).
 Metcalfe’s Law: quantifies the value of a telecommunications network based on the number of connected users. This allows for the quantification of network value (Allardice, 2019).

Each of these looked at predicting growth in the underlying technologies every 12-24 months, with growth factors ranging from two-fold to exponential growth. Many of these observations held true for a period of time, but technology fell behind in most instances due to constraints from materials and processes.
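As a worked example of the fixed projections such observations imply, consider a quantity that doubles every 24 months (the classic form of Moore's Law). The figures below are illustrative only:

```python
def projected_value(initial, years, doubling_period_years=2.0):
    """Fixed exponential projection: the value doubles every
    `doubling_period_years` years."""
    return initial * 2 ** (years / doubling_period_years)


# With a 2-year doubling period, 10 years gives 2**5 = 32x growth:
# projected_value(1000, 10) -> 32000.0
```

This kind of closed-form curve is exactly what makes such 'laws' too rigid: the projection continues indefinitely regardless of material or process constraints.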

The only exception to the above growth predictions is Metcalfe’s Law, which, instead of predicting growth, quantifies the value of a network based on the number of interconnected computers and users.

Using these laws can give a rough indication of the growth of each factor in the DE, but their projections are too rigid. A more realistic approach is to make use of ML to predict future growth based on historical data rather than a fixed mathematical projection.

The ML approaches that will be used in the DEEv-MoS predictive modelling engine will be focussed on the particular tasks of predicting the change in the technology factors mentioned above. As each of the factors grows in potential, the DEEv-MoS digital environment is affected by creating different possibilities for interactions and causing agents to behave differently.

NNs have been selected as the ML approach used in the predictive modelling engine due to their inherent ability to learn patterns from data without the need for explicit rules or manual weighting manipulation. NNs are able to optimise weightings to determine features’ importance in predicting outcomes and are well suited for time-dependent series data. This makes them ideal candidates for sequential numeric predictions over a time period, as is needed for predicting the future values of the above-mentioned factors in the DE.

In Chapter 4, a type of feed-forward neural network, the extreme learning machine, was discussed. Research has shown that ELMs outperform traditional SLFNs in terms of training speed and match, if not exceed, their accuracy. This makes the ELM an enticing option for use as the ML approach implemented in the predictive modelling engine.

The process of assigning random values to weights and biases in the NN means less time is needed to tune these parameters by hand. Another advantage of using the ELM learning process instead of back-propagation is that the resultant model is less prone to converging on local minima. ELMs do, however, struggle to cope with large datasets and can suffer in terms of accuracy if the activation functions are not infinitely differentiable.
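The ELM training process described above can be sketched in a few lines: hidden weights and biases are assigned randomly and never tuned, and the output weights are solved in a single step with the Moore-Penrose pseudo-inverse rather than by back-propagation. The function names are illustrative:

```python
import numpy as np

def train_elm(X, y, n_hidden=20, seed=0):
    """Train a single-hidden-layer ELM: the input weights and biases are
    random and fixed; only the output weights are learned, solved in one
    step by least squares (no back-propagation)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y                 # Moore-Penrose solution
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta
```

Because the only ‘training’ is a single pseudo-inverse, fitting is fast, but the hidden-layer matrix H grows with the dataset, which is one source of the ELM’s difficulty with very large datasets.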

Another approach that could be followed is that of a recurrent neural network. This form of NN makes use of inputs from the previous time step along with those of the current time step when calculating results at a node. RNNs are well suited for both univariate and multivariate predictions, allowing predictions to consider all outputs relative to one another as well as reducing the number of NNs required. This use of memory makes the RNN a good option for time series data, as the RNN is better able to learn data trends over time. RNNs can, however, become complex and difficult to understand, and therefore to debug or adjust.
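A single recurrent step can be sketched as follows. The weights here are random placeholders; the point is only that the previous time step’s hidden state feeds into the current computation, giving the memory described above:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One recurrent step: the new hidden state combines the current
    input with the previous time step's hidden state, giving the
    network a simple form of memory over the sequence."""
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

# Unrolling over a short sequence: each step reuses the previous state.
rng = np.random.default_rng(1)
W_x, W_h, b = rng.normal(size=(1, 4)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)
for x_t in ([0.1], [0.2], [0.3]):
    h = rnn_step(np.array(x_t), h, W_x, W_h, b)
```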

Even though RNNs are well suited for time series data, the performance, simplicity and accuracy of ELMs make them an intuitive choice as the ML approach for use in the DEEv-MoS model. Figure 10.2 depicts the planned architecture of the ELM approach.

NN approaches in general are not well suited to dealing with dynamically changing environments, due to the lack of pre-existing knowledge, training complexity and time, and memory constraints in traditional FFNNs. The introduction of memory in RNNs and the use of training data distribution strategies do alleviate the problem, but, as with most supervised learning approaches, they remain largely ineffective in this regard (Pan & Duraisamy, 2018).

This limitation relates to the notion of complex systems, under which the DE falls. Complex systems were previously, and superficially, thought of as large systems with many components and relationships; this definition is now considered to describe complicated systems rather than complex systems. A more accurate view is that of complicated systems that also exhibit second-order complexity, where the relationships and interactions between subcomponents of the system change over time (Santos & Zhao, 2017).

The change of relationships and interactions between subcomponents is known as emergence, where emergent properties are unexpected behaviours or outcomes resulting from the second-order complexity of the system. These emergent properties can be harmful or beneficial in the system depending on the system structure and operating environment (Johnson, 2006).

Without being able to anticipate emergent properties in a complex system, it becomes extremely difficult to predict future outcomes for the system itself or its components with certainty. Many researchers believe that the emergence of a complex system results in behaviours that cannot be identified through functional decomposition, meaning that the system becomes more than simply the sum of its parts (Pascual-Garcia, 2018).

Modelling emergence in the DE would require substantial further in-depth modelling of all of its components, their relationships, behaviours and intricacies. This falls outside the scope of the DEEv-MoS model and is a definite limitation thereof.

The following subsections, 10.1.1 – 10.1.5, describe the ML approaches that the DEEv-MoS predictive modelling engine will make use of to predict growth for each of the five digital environment factors that this section has outlined.

For each factor, the impact of the growth of the factor will be outlined, along with the ML approach, inputs and expected outputs.

10.1.1 Storage size

Storage size in the DEEv-MoS digital environment relates to the hard drive storage space factor above. This determines the maximum amount of data that can be stored on a single drive, which affects different entities in the environment.

Entities affected:

 fixed endpoint devices
 mobile endpoint devices
 network nodes

Implications of factor growth:

 More data storage available on endpoint devices: agents are able to create and store more data entities on endpoint devices, changing their behaviour in terms of data creation and duplication.
 More data storage on network nodes: network nodes are able to store a larger number of data entities, making them ideal online data repositories. As they store more data, their utility and value increases. Increased value of a network node leads to greater connectivity from other nodes and endpoint devices.


Machine learning algorithm:

 algorithm: ELM SLFN
 inputs: historical maximum disk areal density data
 output: predicted maximum disk areal density
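The ‘historical data in, predicted value out’ arrangement above implies a supervised time-series setup: past values form the inputs and the next value is the target. A minimal sliding-window sketch, with illustrative numbers rather than real areal-density figures:

```python
def sliding_windows(series, window=3):
    """Turn a univariate series into (inputs, target) pairs: each window
    of past values is used to predict the value that follows it."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return X, y

X, y = sliding_windows([1, 2, 4, 8, 16, 32], window=3)
print(X)  # [[1, 2, 4], [2, 4, 8], [4, 8, 16]]
print(y)  # [8, 16, 32]
```

Each of the five factors’ ELM SLFNs could be trained on pairs of this shape, one model per factor.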

Next, the computational power factor is detailed.

10.1.2 Computational power

Computational power in the DEEv-MoS digital environment relates to the computational power factor above. This determines the maximum computational power available to computers in the environment, which affects a number of DEEv-MoS entities.

Entities affected:

 fixed endpoint devices
 mobile endpoint devices
 network nodes

Implications of factor growth:

 More computational power available on endpoint devices: agents are able to perform more computationally intense operations on their devices. This allows for creation and use of larger data entities and changes behaviour for interacting with network nodes.
 More computational power on network nodes: network nodes are able to perform more computationally intense operations. This creates the possibility of remote computing for agents, taking the computational needs away from their endpoint devices. Network nodes can also process more data requests and operations concurrently, allowing them to serve a larger number of agents and other network nodes, which increases their value.

Machine learning algorithm:

 algorithm: ELM SLFN
 inputs: historical maximum computational power data
 output: predicted maximum computational power

The network speed factor is covered in the next section.

10.1.3 Network speed

Network speed in the DEEv-MoS digital environment relates to the network data transfer speed factor above. This determines the maximum amount of data that can be transmitted in a second between network nodes and endpoints. This is subdivided into two categories, wired and wireless, and affects a number of entities in the digital environment.

Entities affected:

 fixed endpoint devices
 network paths
 mobile endpoint devices
 network nodes

Implications of factor growth:

 Faster network speeds for network paths: network paths are able to transmit more data over a shorter period of time. This directly affects the time taken for data to be transmitted from one point in a network to another, which has an impact on agent behaviour when attempting to interact with connected entities in the network.
 Faster network speeds available to endpoint devices: agents are able to receive data from and send data to other agents and to network nodes more effectively. This changes agents’ behaviour to favour communication paths that are more efficient.
 Faster network speeds available to network nodes: network nodes are able to send data to and receive data from other network nodes and endpoints more effectively. This creates added value for the network node, as requests are carried out faster. Greater network speeds create well used backbones for the network and clearly favourable transmission paths.

Machine learning algorithm:

 algorithm: ELM SLFN
 inputs: historical maximum network speed data
 output: predicted maximum network speed

The factor of file size is discussed in the next section, along with the implications of factor growth and the entities affected.

10.1.4 File size

File size in the DEEv-MoS digital environment relates to the file size factor above. This determines the maximum size of a single data entity in the environment. This affects a number of entities in the DEEv-MoS digital environment.

Entities affected:

 data
 fixed endpoint devices
 mobile endpoint devices
 network nodes

Implications of factor growth:

 Larger file size for data entities: data entities are able to store more data in a single entity. This increases the value of singular data entities as it reduces the need for multiple files.
 Larger file size in endpoint devices: endpoint devices are able to store more data in singular data entities, reducing the number of data entities that need to be kept track of. Larger data entities also mean endpoint devices may use up storage space more easily and not have enough available to accommodate a new data entity.
 Larger file size in network nodes: network nodes are able to store more data in singular data entities, reducing the number of data entities that need to be kept track of. Larger data entities result in longer data transmission times to other nodes and endpoint devices. Larger files may also affect agents’ behaviour with regard to accessing data or duplicating data to a device.

Machine learning algorithm:

 algorithm: ELM SLFN
 inputs: historical maximum file size data
 output: predicted maximum file size

The next factor that is considered is the size of networks in the environment.

10.1.5 Network size

Network size in the DEEv-MoS digital environment relates to the network size factor above. This determines the maximum size of a network in the environment, i.e. how many interconnected devices exist. This affects a number of entities in the DEEv-MoS digital environment.

Entities affected:

 agents
 fixed endpoint devices
 mobile endpoint devices
 network nodes


Implications of factor growth:

 Larger networks for agents: larger networks allow agents access to a larger number of data sources and communication channels. This affects agent behaviour regarding data storage, access and communication.
 Larger networks for endpoint devices: endpoint devices are able to interact with more devices, increasing the value of the device to the agent. This makes for behavioural changes in the agent in terms of communication and the use of specific devices.
 Larger networks for network nodes: network nodes are connected to a larger number of other network nodes and endpoint devices. This increases the value of the network by increasing the data sources and communication channels. The value of individual network nodes decreases with network size as more options for data sources become available to agents.

Machine learning algorithm:

 algorithm: ELM SLFN
 inputs: historical maximum network size data
 output: predicted maximum network size

The combination of the growth of the above factors over time in the DEEv-MoS digital environment and the interactions from agents in the environment contributes greatly to the evolution of the environment as a whole, driving different behaviours and facilitating more interactions.

10.2 Conclusion

Chapter 10 has discussed the DEEv-MoS predictive modelling engine in detail, discussing neural networks and how they operate, and defining the main factors that need to be considered in a changing environment, each of which was paired with an ML algorithm approach.

In Chapter 4, neural networks were shown to be an attempt by scientists to replicate the function and structure of the human brain in computer systems. This was achieved by creating networks of artificial neurons that mimic the structure of the human brain and can be used to perform calculations and logical checks.

Two types of artificial neurons were discussed: perceptrons and sigmoid neurons, with the difference between them being detailed along with their use. It was shown that a network of sigmoid neurons could be used to simulate learning in a computer based on inputs, weights attached to the inputs, a bias for the neuron firing, and a sigmoid function that gave a resultant output. Multiple layers of neurons could be chained together to solve more complex problems.
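The two neuron types recapped above differ only in their output function: the perceptron applies a hard threshold, while the sigmoid neuron applies a smooth squashing function. A minimal sketch:

```python
import math

def perceptron(inputs, weights, bias):
    """A perceptron fires (outputs 1) only when the weighted sum of its
    inputs plus the bias exceeds zero; otherwise it outputs 0."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if z > 0 else 0

def sigmoid_neuron(inputs, weights, bias):
    """A sigmoid neuron outputs a smooth value in (0, 1), so small
    changes in weights cause small changes in output, which is what
    makes gradient-based learning possible."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# An AND gate from a single perceptron (weights chosen by hand):
print(perceptron([1, 1], [1, 1], -1.5))  # 1
print(perceptron([1, 0], [1, 1], -1.5))  # 0
```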

The main five factors that are deemed influential in the evolution of the DE were listed, describing each factor along with previous ‘laws’ that had been used to predict future growth. The existing ‘laws’ were found to be inaccurate, relying only on mathematical line functions to determine future values, and many have already been disproved. The five factors are:

 hard drive storage size,
 computational power,
 network data transfer speed,
 network size, and
 file size.

The DEEv-MoS predictive modelling engine was defined with regard to each of the five factors, stating the DEEv-MoS digital environment entities that they affect, what implications they have for the entities and environment, along with which machine learning approach will be used. The inputs, model and output for each factor were listed.

Extreme learning machines, which were introduced in Chapter 4, were compared with recurrent neural networks, highlighting the strengths of each approach. Even though both approaches have their merits, the ELM was chosen for use in the predictive modelling engine due to its simplicity and performance.

The information presented in this chapter is of relevance to the following secondary research questions:

SRQ4: Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?

SRQ4 is addressed by the nature of the DEEv-MoS digital environment and the relationships between the components. Agents, making use of their utility and available actions, behave in a manner that affects the entire digital environment. The machine learning engine indirectly facilitates changes by affecting agent behaviour and available actions and interactions.

SRQ5: Can ML, in particular extreme learning machines, be used to predict and drive changes in a digital environment to accurately reflect its evolution?

With an ML model applied to each of the five main factors that influence the growth and change of digital environments, it is possible to predict the change of the digital environment as a whole. Each factor has its evolution predicted using machine learning algorithms that should outperform the commonly referenced ‘laws’. The use of ELMs would benefit the predictive modelling engine in terms of performance and simplicity. This addresses SRQ5 sufficiently, as the evolution of each individual factor, along with the corresponding entities, results in the evolution of the environment as a whole.

Chapter 11 will take a closer look at the constraints engine component of the DEEv-MoS model, defining its purpose in the model and its interaction with other components, as well as the different types of constraints it defines.


11 DEEv-MoS: constraints engine

In Part Three of this thesis, the DEEv-MoS model is covered, beginning with a high-level overview that introduces the model and its intended application and function. Chapter 7 describes the various components of the model and how they fit together in creating the DEEv-MoS model.

In Chapters 8 through 10, each of the main components is described in more detail, with Chapter 8 focusing on the DEEv-MoS digital environment, Chapter 9 detailing the DEEv-MoS entity component system, and Chapter 10 describing the DEEv-MoS predictive modelling engine. Figure 11.1 highlights the component of the DEEv-MoS model that will be covered.

Figure 11.1 The constraints engine component of the DEEv-MoS model.

These three components make up the bulk of the important functions of the model, describing what the digital environment is, what entities exist, what metrics describe them, and how they will be used to predict the evolution of the digital environment. It is, however, important to have smaller components that are focused on tasks related to enabling the function of the other components.


The DEEv-MoS constraints engine is one of the two smaller components in the DEEv-MoS model, whose main goal is to facilitate the effective functioning of the three main components. Although it does perform important functions which help shape the model, these are in support of actions carried out by one of the three main components.

It may seem unintuitive for this component to exist if it serves only to support the main components; why not integrate its function into the other components instead? The reason is to create modularity within the DEEv-MoS model, allowing each component to be created independently of the others and to be maintained in a simpler and more effective manner.

If the function of the constraints engine were to be split among the other three components, each of those components would become more complicated and would need to take into account inputs or outputs that are only partially relevant to it.

By splitting functions that touch on each of the other components into their own small components, it becomes easy to enact functional changes in how components interact from a single module.

A valid question at this point is what role the constraints engine plays in the DEEv-MoS model, and why it touches on multiple other components. The following section, Section 11.1, details the purpose of the constraints engine in relation to the DEEv-MoS model as a whole, as well as in relation to individual components.

11.1 Function of the constraints engine

In the creation of a model that predicts and simulates changes in an environment, such as with DEEv-MoS, there are many moving parts and values that need to be kept track of. For a simulation to run effectively, it is important for there to be a range of values used for particular metrics or entities, so that they function in accordance with what is set out in the environment.

The DEEv-MoS constraints engine is the component of the model that is responsible for keeping track of these value ranges, placing constraints on what values can and cannot be used. The constraints engine updates these values throughout the running of a simulation, making use of information from the predictive modelling engine to recalculate the constraints at that point in time. It then updates the digital environment component and events engine component (which is covered in Chapter 12) with the constraints, forcing them to operate within a reasonable range of what is occurring in the simulation.
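The recalculation loop described above can be sketched as follows. The function names, the symmetric tolerance band and the clamping rule are all assumptions for illustration, not the model’s defined calculation:

```python
def recalculate_constraint(predicted_value, tolerance=0.1):
    """Derive a (min, max) constraint range as a band around a value
    supplied by the predictive modelling engine."""
    return ((1 - tolerance) * predicted_value,
            (1 + tolerance) * predicted_value)

def clamp(value, bounds):
    """Force a value proposed elsewhere in the simulation to stay
    within the constraint range."""
    lo, hi = bounds
    return max(lo, min(hi, value))

# If the engine predicts a maximum of 100 units for some metric, the
# constraint band is roughly 90..110, and out-of-range proposals from
# other components are clamped back into it:
print(clamp(250, (90, 110)))  # 110
print(clamp(95, (90, 110)))   # 95
```

Adjusting only `tolerance` here would change how tightly the digital environment and events engine are held to the predictions, which reflects the single-module control the text describes.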

The constraints engine interfaces in some manner with all of the other components in the model, each of which will be described in the following sub-sections, 11.1.1 to 11.1.4.

Figure 11.2 depicts the internal structure of the constraints engine along with the data flow between it and the other components of the DEEv-MoS model, represented in UML with a component diagram. The first component to be discussed will be the predictive modelling engine.

Figure 11.2 Component diagram of the constraints engine and data flow with the other DEEv-MoS components.

11.1.1 Predictive modelling engine interaction

The predictive modelling engine was detailed in Chapter 10, describing the ML algorithms that will be used and to what parts of the digital environment they would be applied. This is the component that drives the evolution of the digital environment, predicting how entities in it will change throughout the simulation run.

It is important that, when predictions around entities are made, the predicted changes are effected in the simulation, and thus in the digital environment. This is where the constraints engine comes into play. It makes use of updated prediction data regarding entities in the digital environment from the predictive modelling engine, and then calculates new constraints for the value ranges of the given metrics and entities.

The predictive modelling engine’s results are used as inputs for the constraints computations in the constraints engine.

However, after constraints are calculated they need to be applied and this is where the constraints engine’s interaction with the digital environment comes into play.

11.1.2 Digital environment interaction

The digital environment, as defined in Chapter 8, describes all of the entities in the simulation environment, how they are related to one another, and what actions they are capable of. It is ultimately the part of the model that mimics the DE, and through simulation describes how it evolves with regard to the many entities.

An important part of modelling a digital environment is being able to define constraints on the number of different types of entities and their relationships, forcing the change in the environment to be incremental and not sudden.


The constraints engine places constraints on the digital environment based on its calculations, which use the predictive modelling engine’s results as inputs. This creates control over the speed and manner in which changes occur in the digital environment, which can be adjusted by altering only the constraints engine’s calculations, rather than needing to make adjustments to all components.

The digital environment component houses the various entities and describes the simulated digital environment in terms of size and interactions. However, changes also need to occur on a more granular scale. The entities in the digital environment also need to change along with the digital environment itself.

The constraints engine must therefore be able to interact with the entity component system as well.

11.1.3 Entity component system interaction

The entity component system component of the DEEv-MoS model is responsible for representing the digital environment in terms of its entities, what they are capable of and how they interact. Chapter 9 describes how the ECS is used to model various entities, as well as key metrics regarding them.

As each entity evolves, it is important, once again, for it to do so within the bounds of a set of constraints so as to accurately model evolution. Without the constraints, entities could wildly differ from one moment to another.

The constraints engine calculates constraints for the entities and their internal metrics, creating a more controlled and realistic simulation. This will also ultimately affect the entities’ behaviour and interactions, leading to a more desirable state.

The last component that is interacted with by the constraints engine is the events engine, which, in itself, has not yet been described in more detail than the high-level description in Chapter 7.

11.1.4 Events engine interaction

The events engine, as stated in Chapter 7, is the component that is responsible for generating events in the digital environment that affect the speed and direction of change. It is responsible for introducing catalysts for change within the environment and also for preventing stagnation.

As is the common thread with all the components mentioned in this section, the events generated by the events engine cannot fall outside of a set of constraints. The constraints for these events may be calculated in a different manner to the constraints on other components, enforcing less strict constraints for the events; the events must, however, still abide by the constraints.


This allows for events to be generated that can still have a significant impact, but without them driving the direction of the digital environment’s evolution too drastically.

Without constraints applied to the different components of the DEEv-MoS model, simulations and predictions could quite possibly become wildly inaccurate and therefore of no use.

The constraints applied by this component allow for changes to occur within the bounds of a defined rule set, while still allowing enough variance to prevent fully deterministic outcomes.

Constraints are applied to each of the components in the model; however, only certain entities and values are actually affected by the constraints. Section 11.2 will list the entities and metrics that have constraints applied to them, grouping them by the component they are related to.

11.2 DEEv-MoS constraints

Each component of the DEEv-MoS model that interacts with the constraints engine will have parts of it that have constraints placed on them. These can range from high-level numeric values to specific metrics regarding a particular entity.

The following sub-sections will list each of the entities or metrics that will have constraints applied to them, briefly describing the purpose of the constraints and their effect. For the sake of organisation, the constraints will be grouped by the relevant components.

11.2.1 Digital environment constraints

The constraints that affect the digital environment component are high-level values that describe the environment in terms of quantity of entities and interactions.

The constraints are:

 Agent population: this constraint sets a range in which the population of agents (representing users) can exist. Constraining this metric keeps control over the population modelling aspect of the model, allowing for more realistic simulation in population growth and, indirectly, interactions (as more agents ultimately means more interactions).
 Network size: this constraint controls how large a network can be in terms of numbers of network nodes. Constraining this metric allows for realistic simulation of network growth and drives usage behaviours from agents.
 Data size: this constraint determines how large a single piece of data can be in the digital environment. By constraining it, more realistic usage behaviour and interactions between network nodes can be simulated.

The digital environment constraints keep the environment size in check and dictate the global rate of growth and change.


11.2.2 Entity component system constraints

The ECS component deals with the specifics regarding the entities of the digital environment, as well as what properties they have and how they can interact. Constraints on the entities are generally values representing particular traits that they have from their constituent components.

The ECS constraints are as follows:

 Agent age: this constraint determines the age range of agents, which helps control the population evolution and change. Constraining this controls how long agents can exist for, creating dynamism in the population.
 Device computational power: this constraint determines how powerful devices can be computationally. This constraint affects the interactions between different devices and agents.
 Device storage size: this constraint determines the maximum amount of data that can be stored on the device. This affects the data behaviour of agents and network nodes.
 Device connectivity: this constraint determines the number of concurrent connections a device can have to other devices. This also affects agent and device data behaviour, as not all connections to a device can be served at the same time.
 Network path speed: this constraint determines the bandwidth of network paths in the environment, which ultimately determines the speed of data actions. This constraint affects how quickly actions and interactions can take place.
 Network path distance: this constraint determines how far apart devices can be in terms of physical distance. This constraint shapes the layout and structure of networks by only allowing connections to be made within the constrained distance.

Constraints on the ECS apply a finer level of control over the structure, interactions and behaviours of the digital environment.

11.2.3 Events engine constraints

Events generated by the events engine can, if unchecked, cause large fluctuations in environment size and function. The constraints associated with the events engine control how radical the changes can be, and define a range of acceptable values for deltas. A delta is the difference between two values; in the constraints engine, deltas determine how large (or small) a change to a constrained value can be (Deziel, 2018).

The purpose of events that are generated is to alter the values associated with the constraints for the components mentioned above. However, instead of constraining the end value, the magnitude of the change is constrained.

Therefore, the events engine constraints are deltas in regard to the digital environment constraints and ECS constraints. These are:


 Agent population: this constraint determines how radical a change in agent population can be caused by an event.
 Network size: this constraint controls the magnitude of changes to network size from an event.
 Data size: this constraint determines how much the size of data entities can increase.
 Agent age: this constraint determines the change in age range of agents, which can have major implications on the population.
 Device computational power: this constraint determines how much more powerful devices can be made.
 Device storage size: this constraint determines the increase in the maximum amount of data that can be stored on devices.
 Device connectivity: this constraint determines the change in the number of concurrent connections a device can have to other devices.
 Network path speed: this constraint determines the bandwidth increases of network paths in the environment.
 Network path distance: this constraint determines the change in how far apart devices can be in terms of physical distance.

The above event constraints were chosen due to the impact that each would have on the change in the DE. By altering these constraints, change in the environment can be accelerated or slowed down, as they affect the possible actions and behaviours that agents and other entities have available to them.

Events constraints need to be carefully controlled, otherwise singular events or a series of events can drastically change the digital environment rapidly.
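The delta mechanism above, which caps the magnitude of a change rather than the end value, can be sketched as follows; the function name and the symmetric cap are illustrative assumptions:

```python
def apply_event(current_value, requested_delta, max_delta):
    """Apply an event's change to a constrained value, capping the
    magnitude of the change (the delta) so that no single event can
    shift the environment too drastically."""
    delta = max(-max_delta, min(max_delta, requested_delta))
    return current_value + delta

# An event requesting a jump of +50 on a value whose delta constraint
# is 20 may only move the value by 20, in either direction:
print(apply_event(100, 50, 20))   # 120
print(apply_event(100, -50, 20))  # 80
```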

11.3 Conclusion

Chapter 11 has detailed the DEEv-MoS constraints engine component, defining the function of the component, which components it interacts with and impacts, and what constraints need to be applied to the entities and metrics of these components.

Firstly, the need for constraints in a simulation environment was promoted, explaining how, left without checks, an environment could very quickly grow and change in an unrealistic manner.

Thereafter, the function of the constraints engine in the DEEv-MoS model was described. It was shown that the constraints engine interacts with all components of the DEEv-MoS model, placing constraints on them relative to their function, while taking in input from the predictive modelling engine to calculate the constraints.

Lastly, the different constraints that are enacted on each component were listed. Each constraint was defined, along with the entity or metric it affected and what implications it has on the digital environment.


The information presented in this chapter is of relevance to the following secondary research question:

SRQ6: How can existing AI algorithms be used to evolve entities in a digital environment in a manner analogous with how non-natural entities evolve?

The constraints engine makes use of results from the predictive modelling engine to calculate the constraints it places on the various components of the model. Chapter 10 highlighted the ML capabilities of the predictive modelling engine, which is used to understand what the different parts of the digital environment could look like in the future.

It is, however, the constraints engine that actualises the changes to the entities and adjusts what is and is not possible. It works hand in hand with the predictive modelling engine to evolve the various entities of the digital environment, each in its own way.

This demonstrates how AI algorithms can be used to evolve entities without making use of a purely immunological computation approach, making the model much better suited to non-natural entities such as those in the DEEv-MoS model. SRQ6 is therefore sufficiently addressed by this chapter.

Chapter 12 is the final chapter of Part Three of this thesis, and is directed at the last component of the DEEv-MoS model: the events engine.

12 DEEv-MoS: events engine

Thus far in Part Three of this thesis, the DEEv-MoS model has been outlined in a high-level overview in Chapter 7, with Chapters 8 to 10 describing the three main components of the model: the DEEv-MoS digital environment, the DEEv-MoS entity component system, and the DEEv-MoS predictive modelling engine. Figure 12.1 highlights the component of the DEEv-MoS model that is covered in this chapter.

Figure 12.1 The events engine component of the DEEv-MoS model.

Chapter 11 detailed the constraints engine, one of two components that serve to support the other three main components and facilitate change across the digital environment. The constraints engine was defined as being responsible for the calculation and enactment of constraints upon the other components of the DEEv-MoS model.

The constraints engine prevents changes in the environment and individual entities from being made in too extreme a fashion. Gradual incremental changes over a period of time are a more realistic representation of the growth of the DE. If changes occur too rapidly or are of too great a magnitude, the predicted state of the digital environment will depart from expectation.

However, as was shown in Chapter 6, where the problem background was discussed in greater depth, some areas that constitute the digital environment have historically undergone periods of rapid or large-scale change during the formation of the digital environment as it is today.

Looking at population growth as an example, there were long periods, spanning centuries, during which the growth of the human population was slow but steady. Certain historical events, however, led to sharp declines or increases in population over relatively short periods compared to this steady growth.

Short-duration events that had a large impact on the human population include the Black Death pandemic, which is estimated to have killed as many as 200 million people in the span of around a decade (Black, 2019). Other periods saw accelerated population growth, usually characterised by peace, while the two World Wars resulted in large declines in population size and growth rates (Roser, Ritchie & Ortiz-Ospina, 2019).

It is apparent that to successfully model the DE, events need to be taken into consideration and need to be introduced over varying periods of time and with impacts ranging from small to large.

Section 12.1 details the role of the events engine in the DEEv-MoS model, highlighting its importance and the effect it has.

12.1 Role of the events engine

As discussed in the first part of this chapter, events have occurred throughout history that have shaped the current DE. These events can affect various parts of the DE, facilitating change in different ways.

It is immediately obvious that even though the general impact of changes in an environment should be small and incremental, there are events that can occur which have much larger footprints.

The event engine component of the DEEv-MoS model is responsible for the creation of events that impact the evolution of the digital environment and its entities, affecting the speed of change, the magnitude of change and the direction in which change is applied.

The events engine is responsible for introducing various types of events that drive specific changes within a particular area of the digital environment. The events that the events engine introduces to the digital environment can be of varying size and duration, and are used to facilitate actual change in the environment and its entities.

The events engine directly introduces events to the DEEv-MoS digital environment, thus driving change in the environment’s entities, interactions and structure. It is the vehicle which drives changes in the environment based on the input from the predictive modelling engine.

The predictive modelling engine makes use of ML to predict what the digital environment and its entities will evolve to be. However, it merely outputs a prediction based on values and cannot effect the changes itself. The events engine carries out the changes to the environment by introducing events of different types and sizes, realising the predictions of the predictive modelling engine.

How, then, does the events engine determine what size of event to introduce to the digital environment? Chapter 11 detailed the constraints engine, its role in the DEEv-MoS model and which other components it interacts with.

The constraints engine makes use of the results of the predictive modelling engine to calculate constraints to be placed on the rest of the model’s components, including the events engine. The constraints engine dictates to the events engine the acceptable ranges for values regarding the magnitude, duration and direction of an event that can be introduced to the digital environment.
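The interaction described above can be sketched as follows, with the constraints engine supplying acceptable ranges and the events engine sampling an event that falls within them. All names here are illustrative assumptions, not the actual DEEv-MoS interfaces:

```python
import random
from dataclasses import dataclass

@dataclass
class EventConstraints:
    """Ranges dictated by the constraints engine (illustrative names)."""
    max_magnitude: float        # largest permitted relative change
    max_duration: int           # longest permitted duration, in time steps
    allowed_directions: tuple   # e.g. (+1, -1)

@dataclass
class Event:
    magnitude: float
    duration: int
    direction: int

def generate_event(c: EventConstraints, rng: random.Random) -> Event:
    """Sample an event whose parameters fall within the given constraints."""
    return Event(
        magnitude=rng.uniform(0.0, c.max_magnitude),
        duration=rng.randint(1, c.max_duration),
        direction=rng.choice(c.allowed_directions),
    )

rng = random.Random(42)
c = EventConstraints(max_magnitude=0.1, max_duration=10, allowed_directions=(1, -1))
e = generate_event(c, rng)
assert 0.0 <= e.magnitude <= c.max_magnitude
assert 1 <= e.duration <= c.max_duration
```

In practice the events engine would bias this sampling towards the values predicted by the predictive modelling engine, rather than sampling uniformly as in this sketch.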

It is then the function of the events engine to introduce events that fall within the constraints set by the constraints engine, in order to effect the changes predicted by the predictive modelling engine. Figure 12.2 depicts the structure of the events engine along with its interaction with the digital environment and constraints engine components of the DEEv-MoS model, represented in a UML component diagram.

Figure 12.2 Component diagram of the events engine component of the DEEv-MoS model.

It is important to understand the types of events that the events engine may introduce to the environment and what impact they may have. Events in the DEEv-MoS model are grouped into three categories that affect different entities in the digital environment. The three categories are:

 population events,
 technological events, and
 social events.

Each of the event categories deals with a subset of entities in the digital environment and is intended to mimic the impact of its real-world counterpart. Sections 12.2 to 12.4 describe each category of event more specifically, defining the entities that can be affected and what resultant impact they have on the digital environment.

12.2 Population events

The first type of event is the population event, which, as can be inferred from the name, affects the population of agents in the DEEv-MoS digital environment. Population events are extremely important in controlling the population size and growth in the digital environment to accurately reflect what would happen in the real world.

Population events impact the following entities in the DEEv-MoS digital environment:

 Agents: population events directly affect the population size and growth rate of the digital environment. The characteristics of population events that affect agents are as follows:
o Duration: over what period of time is the agent population affected?
o Magnitude: how large is the change in the agent population?
o Direction: in which direction is the agent population affected: positively (agent population growth) or negatively (agent population decline)?

Agents drive interaction and use of the digital environment, contributing data and consuming data from various sources. A change in population changes the number of interactions and the volume of data generated and duplicated.
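A population event's three characteristics can be combined in many ways; one plausible interpretation, sketched below with hypothetical names, spreads the total relative change evenly over the event's duration so that the population shifts gradually rather than in a single jump:

```python
def apply_population_event(population: int, magnitude: float,
                           duration: int, direction: int) -> list:
    """Spread a relative population change evenly over an event's duration.

    magnitude is the total relative change (e.g. 0.30 for 30%), and
    direction is +1 for growth or -1 for decline. Illustrative only;
    not the DEEv-MoS implementation.
    """
    # Per-step factor whose compounded effect equals the total change.
    per_step = (1 + direction * magnitude) ** (1 / duration)
    trajectory = [population]
    for _ in range(duration):
        trajectory.append(round(trajectory[-1] * per_step))
    return trajectory

# A decade-long event that reduces the population by roughly 30%,
# loosely analogous to the pandemic example discussed earlier:
print(apply_population_event(1000, 0.30, 10, -1))
```

The returned trajectory lets the simulation update the agent population at each time step of the event, rather than all at once.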

12.3 Technological events

As the name suggests, technological events are events that affect technology-oriented entities within the DEEv-MoS digital environment, ultimately shaping what is possible with the technology in the environment and affecting how agents interact.

Technological events impact the following entities in the DEEv-MoS digital environment:

 Data entities: technological events directly affect the size of data entities in the digital environment. The characteristics of technological events that affect data entities are as follows:
o Duration: over what period of time is the data entity affected?
o Magnitude: how large is the change in the data entity’s maximum size?
o Direction: in which direction is the data size affected? Is it affected positively (larger data entities) or negatively (smaller data entities)?
 Network nodes: technological events directly affect the size, computational power and connectivity of network nodes in the digital environment. The characteristics of technological events that affect network nodes are as follows:
o Duration: over what period of time is the network node affected?
o Feature: which feature of the network node is affected: storage size, computational power or connectivity?
o Magnitude: how large is the change in the network node’s feature?
o Direction: in which direction is the feature affected? Is it affected positively or negatively?
 Network paths: technological events directly affect the bandwidth and maximum distance of network paths in the digital environment. The characteristics of technological events that affect network paths are as follows:
o Duration: over what period of time is the network path affected?
o Feature: which feature of the network path is affected: bandwidth or distance?
o Magnitude: how large is the change in the network path’s feature?
o Direction: in which direction is the feature affected? Is it affected positively or negatively?
 Endpoint devices: technological events directly affect the size, type and computational power of endpoint devices in the digital environment. The characteristics of technological events that affect endpoint devices are as follows:
o Duration: over what period of time is the endpoint device affected?
o Feature: which feature of the endpoint device is affected: type (fixed or mobile), storage size or computational power?
o Magnitude: how large is the change in the endpoint device’s feature?
o Direction: in which direction is the feature affected? Is it affected positively or negatively?
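Unlike population events, technological events carry a Feature characteristic that selects which attribute of the entity is changed. A minimal sketch of this dispatch, using hypothetical entity and feature names rather than the actual DEEv-MoS data model, might look as follows:

```python
# Hypothetical network-node entity with three technological features.
node = {"storage_size": 512, "computational_power": 2.4, "connectivity": 8}

def apply_technological_event(entity: dict, feature: str,
                              magnitude: float, direction: int) -> None:
    """Scale a single feature of a technological entity up or down.

    The Feature characteristic selects the attribute; Magnitude and
    Direction determine the relative change, as described above.
    """
    if feature not in entity:
        raise KeyError(f"entity has no feature {feature!r}")
    entity[feature] *= (1 + direction * magnitude)

# A positive event that doubles the node's storage capacity:
apply_technological_event(node, "storage_size", 1.0, +1)
print(node["storage_size"])  # 1024
```

Because only the selected feature is modified, a negative event on one feature (such as the storage-size regression in the SSD example below) leaves the entity's other features untouched.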

For technological entities, the general trend of advancement is positive, with continual incremental growth over time. Very rarely does technology as a whole take a step backwards. This may, however, happen to a single feature of an affected entity, for example the decrease in storage size when a device uses a solid state drive instead of a hard disk drive.

12.4 Social events

Social events affect the manner in which agents in the digital environment interact, dictating behaviour and utility for the agents. In the DEEv-MoS model only agent entities have social behaviour, as the other entities are technological devices that facilitate interaction but do not directly carry it out.

Social events impact the following entities in the DEEv-MoS digital environment:

 Agents: social events directly affect the social behaviour of agents in the digital environment; this takes the form of interaction frequency, locality and social circle size. The characteristics of social events that affect agents are as follows:
o Duration: over what period of time is the agent’s social behaviour affected?

o Social aspect: which aspect of the agent’s social behaviour is affected: interaction frequency, interaction locality or social circle size?
o Magnitude: how large is the change in the agent’s social aspect?
o Direction: in which direction is the agent’s social aspect affected? Is it affected positively or negatively?

Social events affect the behaviour of agents in the digital environment, driving their need to interact, how many agents they interact with, and what restrictions are placed on the locality of other agents that they interact with.

12.5 Conclusion

Chapter 12 has taken a closer look at the events engine component of the DEEv-MoS model, defining its function in the context of the model as a whole, which entities it affects, and how it interacts with other components of the model.

The events engine generates events in the digital environment to enact changes to the various entities based on predicted changes and constraints passed to it by the predictive modelling engine and constraints engine, respectively.

Firstly, events and their impact on the DE were discussed, showing that even though change is often gradual and incremental, events with rapid effect and large impact can occur.

The events engine was then discussed in regard to its function and how it interacts with the other components of the DEEv-MoS model. It was shown that it directly impacts the digital environment component by carrying out events that realise predictions.

The three different types of events that the events engine can introduce to the digital environment were defined, describing which entities each event type affects, how the entities can be affected and what impact the changes have on the environment.

The information presented in this chapter is of relevance to the following secondary research question:

SRQ6: How can existing AI algorithms be used to evolve entities in a digital environment in a manner analogous with how non-natural entities evolve?

The predictive modelling engine is the component responsible for making use of ML algorithms to predict the changes that will occur in the digital environment. However, as pointed out in Chapter 10, it is only responsible for determining the future values based on its predictions; it does not carry out the tasks of enacting the changes on the environment.

The events engine fulfils this function, making the predictions a reality in the digital environment by introducing events that force changes to occur. It is therefore clear that the events engine plays an important part in the evolution of entities and the digital environment itself, thus addressing SRQ6.

This chapter concludes Part Three of the thesis, which is focused on the DEEv-MoS model. Chapter 13 begins Part Four of the thesis by detailing the implementation of a prototype of the DEEv-MoS model that has been defined in Chapters 7 to 12. It will discuss implementation specifics around the entities, intelligent agent behaviour and actions, machine learning algorithms, and the underlying entity component system.

13 DEEv-MoS implementation: DEEP

Part Three of the thesis detailed the DEEv-MoS model, which was designed to address the primary research question proposed in Chapter 1:

Primary Research Question

RQ 1: Can an MAS, making use of machine learning and an entity component system design, effectively and accurately predict the evolution of a digital environment so as to provide assistance in understanding what the future of a digital landscape will look like?

The primary research question was formulated with the intent of providing a solution to the problem described in sections 1.1 to 1.3 of Chapter 1, which at a high level can be expressed as: “How can the future evolution of the Digital Environment be predicted?”

The purpose of this thesis is to document and detail the process of researching the problem domain and the relevant fields that are applicable to creating a solution to the problem. In doing so, a novel model is proposed that addresses the various components of the primary research question, known as the secondary research questions, each of which is concerned with a particular aspect of the problem and solution:

Secondary Research Questions

SRQ 1 What is the current understanding of what the Digital Environment is, and what does it encompass?

SRQ 2 Can an MAS be used to simulate the various components of a digital environment?

SRQ 3 Can an entity component system design be used effectively to represent a digital environment and its constituent entities?

SRQ 4 Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?

SRQ 5 Can ML, in particular extreme learning machines, be used to predict and drive changes in a digital environment to accurately reflect its evolution?

SRQ 6 How can existing AI algorithms be used to evolve entities in a digital environment in a manner analogous with how non-natural entities evolve?

Chapters 7 to 12 are dedicated to the DEEv-MoS model: Chapter 7 describes the model at a high level and defines its various components, while Chapters 8 to 12 focus on each individual component, detailing its purpose, design and functioning.

The DEEv-MoS model was designed to allow researchers to understand how the DE might evolve over time and gain insight into what impact this could have on the modern world and human society in both the short term and long term. In doing so, potential future problems could be identified at an early enough stage to give people the opportunity to prevent them from occurring altogether, or at least mitigate their impact.

Defining a model in a theoretical manner is of value in exploring the problem domain and creating solutions to various parts of the problem concerned. However, it is of even greater value if a functional realisation of the model can be created so as to demonstrate the possible real-world applicability of the model and determine its effectiveness.

In Chapter 1, the research methodology applied in this thesis was defined at a high level as being a mixed methods approach, where multiple research methodologies are applied to the research. Each of the different approaches deals with a separate area of the problem domain, defining a clear methodology for addressing the given area.

A qualitative research methodology is used in exploring the problem domain along with the relevant fields to solving the problem. This takes the form of a literature study which details background information and describes approaches that have bearing on the solution.

A design science methodology is used for identifying the research questions that need to be addressed in order to solve the problem, and for defining a model which serves as the proposed solution. A proof of concept prototype of the defined model is required as part of the design science methodology, in which the practical application of the model is demonstrated and the results from testing it are recorded.

The results obtained from testing the proof of concept prototype are then used to determine the effectiveness of the model in solving the stated problem. These results can be used to gauge the prototype’s effectiveness and compare different configurations of the prototype in order to determine which are best to use.

The implementation of a prototype system is therefore an important part of the research presented in this thesis, as it provides a concrete real-world example of the proposed DEEv-MoS model, which can then be used to determine the model’s effectiveness.

Chapter 13 details the implementation of the Digital Environment Evolution Predictor, or DEEP, prototype system of the DEEv-MoS model that is presented in Part Three of this thesis. The objective of this chapter is to provide details on how the prototype system is implemented and what considerations had to be made, and to describe how the theoretical DEEv-MoS model can be realised in a working system.

The remainder of this chapter proceeds as follows:

Section 13.1 considers the development platform used for the creation of the prototype system, motivating the choice and describing the platform’s capabilities.

Section 13.2 details the implementation of the DEEv-MoS model in a proof of concept prototype system, DEEP, with each subsection focussing on a different aspect of the model and its implementation.

Section 13.3 concludes the chapter, summarising the information presented in the chapter and explaining the relevance thereof to each of the research questions.

13.1 Development platform

When developing any system that is aimed at solving a particular problem, it is important to choose the right tools that complement the planned solution and therefore lead to the creation of an effective system.

In Chapter 5, programming paradigms in software engineering were covered, defining what programming paradigms are, how they relate to programming concepts and programming languages, and how important it is to select the correct paradigm when solving a particular type of problem.

Programming languages are the realisation of one or more programming paradigms, which, in turn, are formed through the combination of multiple programming concepts. Concepts are functional capabilities that are required to solve a problem programmatically, and include areas such as memory, iteration, structure and many more.

The difference between various programming paradigms can be as simple as the addition or subtraction of one concept, which impacts the capabilities of the programming paradigm as a whole. An effective paradigm combines concepts that complement each other and that can be realised through a programming language to implement a system. The impact of selecting between two different paradigms in solving a problem can range from writing simple-to-manage code to not being able to solve the problem at all.

Based on the concepts that a paradigm encapsulates, the given paradigm may be well suited to solving problems of a particular type or may do so in a manner that fits with the real- world constraints of the problem environment and the system’s intended use.

The entity component system paradigm was described in section 5.5, where it was defined as being data-driven. In contrast, the object-oriented programming (OOP) paradigm, presented in section 5.4, was defined as being class-driven. OOP approaches are widely used in the modern world for creating systems to solve business and research problems, and have popularised the object-oriented manner of breaking down a problem.

ECS approaches were designed with data as the focal point, letting it dictate how the system functions. The data-driven approach was found to be ideal for game development, as it allowed large, complex games to be built in which a game’s entities are defined primarily through data (Caini, 2019). Game entities could easily be built from existing components and are ultimately defined as a set of data that particular systems make use of. This increased the efficiency and effectiveness of game systems and allowed much larger and more complex games to be built without a large time overhead (Buttfield-Addison et al., 2019).

In modern game development, many companies make use of existing game engines which allow for the easy creation of game entities and behaviours through pre-existing tools. This has dramatically reduced the time to market for game development and democratised the ability to create games. Making use of existing game engines, individuals or small teams can now develop high quality games, which was once only possible with large teams and large budgets (Nystrom, 2014).

Many of these existing game engines implement the ECS paradigm, making it easy for developers to create game objects using only data. One such well-known game engine is Unity, and it is the development platform that is used for creating the DEEv-MoS prototype system.

The next section will describe the Unity engine and its functionality, and will motivate why it was chosen as the development platform for the DEEP prototype system.

13.2 The Unity game engine

Unity is a game engine that was created by Unity Technologies in 2005 and is aimed at cross-platform game development. The game engine is written in C++ and includes a C# scripting API, allowing users to build games in C#.

It was initially released as a Mac OS X exclusive engine that would allow Mac users to quickly develop games without needing to build everything from scratch.

For a period of several years after the launch of Apple Inc.’s iPhone, Unity was among the only game engines in use for iOS, and contributed thousands of games to the App Store. In 2010, Unity 3.0 was released, adding support for multiple new platforms and an enhanced feature set that made console- and desktop-quality graphics possible.

Over the years, Unity has incrementally been improved, optimising its performance, improving its ease of use, and introducing more features to simplify game development. The new features that were included in each release gave game developers industry standard tools that allowed them to create games with state-of-the-art graphics and sound.

Later releases of the game engine have included improvements to its ECS to optimise game performance, and the addition of components focussed on popular real-world applications, such as virtual reality (VR) and machine learning.

Unity Machine Learning Agents is a tool integrated into the engine which allows users to connect to ML platforms to develop intelligent AI behaviour through training. It can connect to platforms such as TensorFlow, and has been used to develop self-driving cars, intelligent competitive game AI and robots, to name a few (Unity Technologies, 2019).

Figure 13.1 depicts an example learning environment in the Unity ML-Agents toolkit, where four main components can be seen:

 Agent: agents that will make use of ML to interact in their environments.
 Brain: defines specific state and action spaces for its linked agents.
 Academy: defines the scope of the environment in terms of training configuration, decision-making intervals and global run time.
 Python API: allows for the use of external Python ML libraries and scripts.

Figure 13.1 Example configuration of a learning environment within Unity ML-Agents (Juliani, 2017).

Unity ML-Agents makes it easy to train agents using an array of existing ML techniques, such as reinforcement learning and imitation learning, and provides the ability to import ML models through the Python API.

Beyond gaming, Unity has been used across industries for creating 3D models that can be used in animation, product manufacturing, ML training of robots, engineering of buildings and other structures, car manufacturing, and simulations of events or conditions on objects.

Figure 13.2 The Unity game development environment.

Unity is a versatile tool, which was created as a game development engine but enjoys use in multiple industries in the modern world (Nystrom, 2014). The development interface can be seen in Figure 13.2. The following subsection will detail the motivation for using Unity for the development of the DEEv-MoS prototype system.

13.2.1 Motivation for using Unity

Unity, as shown above, is an advanced tool ideal for game development and many other applications. It provides flexibility in design, speed of development, ease of development, access to existing assets, and integration with industry-accepted tools for ML and VR, all of which make it a very appealing option for the development of a simulation-based system.

It is, however, important to motivate the exact benefits that the platform provides that will complement the development of the DEEP prototype system, as it is easy to get caught up in the platform’s appeal and power.

In this subsection, the four main motivating factors for using the Unity platform are presented, detailing each factor and why it is relevant. The four factors are:

 graphics rendering,
 graphical user interface,
 ECS architecture, and
 ML integration.

Each of the above four factors is described below, starting with graphics rendering.

13.2.1.1 Graphics rendering

When implementing a prototype of a simulation-based model, it is often beneficial, if not necessary, to be able to visualise the simulation. In the case of the DEEv-MoS model, the DEEv-MoS digital environment needs to be simulated along with the different entities that it contains.

The value in visualising the simulation comes from the ability to watch how the environment changes and how entities interact in a manner that doesn’t require the meticulous study of raw data. Visualising the simulation allows users to gain intuitive understanding of how the environment is evolving and how entities are interacting with one another.

The user can then fairly easily judge whether the simulation is running as intended and whether it is realistic to any degree. Tracking and comparing data around the simulation is obviously important and must be used to quantify the results. However, visual aid makes it easier to get the prototype to an acceptable state of function.

Unity provides an easy-to-use editor for creating an environment and entities, allowing users to do so solely through the use of its graphical user interface (GUI). This will speed up the development of the prototype system, as the creation of the visualisation does not need to be done by hand.

Creating graphical objects, rendering them correctly and maintaining the related code is also a significant task. When the focus of the model is on predicting the evolution of the digital environment, it does not make sense to spend time creating a graphical rendering system from scratch. Doing so would add time constraints and complexity to the prototype system, and increase the possibility of bugs and other issues.

The benefits of using Unity for graphical rendering are as follows:

1. less time is required to implement a visualisation, and
2. there is a reduced risk of bugs and other issues that may arise from custom-built graphical rendering, such as performance and memory use.

Unity is therefore an excellent option for developing a graphical visualisation of the prototype system without needing to worry about the overheads of implementing it by hand.

13.2.1.2 Graphical user interface

Unity provides an easy-to-use GUI builder framework, drawing on the experience and expertise of professionals in the industry. It makes it easy to build a GUI for a Unity project and to customise it as needed. In doing so, it saves the program designer from having to spend time building GUI code and components, and reduces the introduction of bugs to the project as a whole.

The benefits of using Unity for GUIs are as follows:

1. less time is required to implement GUIs for user interaction; and
2. there is a reduced risk of bugs and other issues that may arise from a custom-built GUI, such as performance and memory use.

Unity is therefore an excellent option for building systems that require customisable GUIs.

13.2.1.3 ECS architecture

ECSs, as mentioned previously, have enjoyed great success and widespread use in the gaming industry. This is mostly due to their data-driven architecture, which simplifies the creation of game objects and the management of code.

In an ECS implementation, the data part of an application is completely separated from the logic part. This allows new entities to be defined without needing to worry about the code that they use, as they are defined purely through data in the form of components.

The logic code that performs actions on the entities, and ultimately their data, is kept in systems, which has the advantage that logic code only needs to be changed in a single location. This makes development of games more efficient from a time perspective and makes the code easier to maintain.

Systems apply their logic code to all entities that are made up of particular components, executing the code against all entities with the given component at the same time. This simplifies calculations and memory use.
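To make this separation concrete, the following minimal sketch illustrates the ECS idea. It is illustrative Python only, not the prototype's Unity C# implementation, and all names (Position, Velocity, World, movement_system) are hypothetical: entities are bare identifiers, components are plain data records, and a system executes only against entities carrying all of the required component types.

```python
from dataclasses import dataclass

@dataclass
class Position:      # a component: data only, no behaviour
    x: float
    y: float

@dataclass
class Velocity:      # another component
    dx: float
    dy: float

class World:
    """Holds entities (bare IDs) and their attached components."""
    def __init__(self):
        self.next_id = 0
        self.components = {}                      # entity id -> {type: instance}

    def create_entity(self, *comps):
        eid = self.next_id
        self.next_id += 1
        self.components[eid] = {type(c): c for c in comps}
        return eid

    def query(self, *types):
        # yield, for every entity that has ALL requested component types,
        # the tuple of its matching component instances
        for comps in self.components.values():
            if all(t in comps for t in types):
                yield tuple(comps[t] for t in types)

def movement_system(world, dt):
    # logic lives in one place and touches only matching entities
    for pos, vel in world.query(Position, Velocity):
        pos.x += vel.dx * dt
        pos.y += vel.dy * dt

world = World()
e = world.create_entity(Position(0.0, 0.0), Velocity(1.0, 2.0))
world.create_entity(Position(5.0, 5.0))           # no Velocity: system skips it
movement_system(world, dt=1.0)
```

Note that the second entity lacks a Velocity component, so the movement system never considers it; this is the filtering behaviour described above.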

Figure 13.4 illustrates an example of the ECS architecture in practice in the context of a simple shooting game. This illustrates how systems only execute against entities with particular components, thereby reducing load, as not all entities need to be considered.

In Chapter 9, the DEEv-MoS ECS was defined; it describes the use of an ECS to create an environment where large numbers of entities can be created without impacting the performance of the system. Figure 13.3 depicts the DEEP ECS as previously presented in Chapter 9.

The DEEv-MoS ECS makes use of the data-driven approach that characterises ECS implementation in general, and implements the three main parts of an ECS:

• entities,
• components, and
• systems.


Figure 13.3 Component diagram of the DEEP entity component system.


Figure 13.4 Example of the Unity ECS in practice (Unity Technologies, 2019).

The Unity ECS has existed from the early days of the game engine, but has been improved over time for better performance and easier use. In March 2019, Unity announced an overhaul of their ECS, called the Data-Oriented Technology Stack, or DOTS.

DOTS improved the structure of the Unity ECS by moving it away from a class-based OOP implementation and closer to the core concepts of an ECS. It also addressed some of the memory and performance issues created by using objects to represent game objects, mainly load times and serialisation performance.

Developing a pure ECS is not a simple task, and would require the use of a programming language designed for it. It is easier to create an ECS implementation through the use of lightweight OOP classes that mimic the functionality. That said, there are many complications and considerations involved in developing an ECS in this manner.


Memory management and a number of other sub-tasks would also need to be addressed, which can introduce complications and heavily impact the system if done incorrectly. It therefore makes sense to leverage the expertise of the Unity DOTS ECS developers when implementing an ECS-based application rather than building it from scratch, especially when the ECS itself requires nothing new or different.

The benefits of using Unity for the ECS architecture are as follows:

1. it is a tested and performant ECS that is designed to function at industry-accepted levels;
2. less time is required to implement environment entities and the systems that act on them; and
3. there is a reduced risk of bugs and performance issues that could be introduced in a custom implementation.

Unity is therefore an excellent option for implementing an ECS-based application that is performant and easy to use.

13.2.1.4 ML integration

In the modern world, the use of ML techniques is growing in all sectors of business and in people’s daily lives to simplify tasks, predict future events and increase efficiency. Due to this popularity, a large number of ML tools have been created over the past few years to make it easier for researchers and developers to make use of ML. These tools include the likes of TensorFlow, scikit-learn, Keras, Theano, PyTorch and many more.

Many of these libraries are exposed via Python and work best in Python applications, which somewhat limits their use elsewhere.

Machine learning is an important function in the DEEv-MoS model; it is used to predict the evolution of the DEEv-MoS digital environment as a whole, as well as of some individual components.

In Chapter 10, the DEEv-MoS predictive modelling engine was defined, discussing the types of predictions it would need to make in the DEEv-MoS digital environment, what entities they would apply to and what ML models would be best to use.

Unity ML Agents was created to simplify the integration of ML techniques into game development for the creation of more competitive and realistic game AI and recommendations.

It allows developers to easily integrate existing ML libraries and code into Unity projects to make use of their capabilities, even though most of the programming in Unity from the developer point of view is done in C#.


This advancement has opened avenues for researchers to study behaviours and interactions from a visual standpoint, and for manufacturers to test and develop ML-based AI for automated tasks such as assembly, driving and more.

The benefits of using Unity for ML integration are:

1. ease of use with existing industry-standard ML libraries;
2. seamless integration of Python ML code with C# Unity code; and
3. shorter time to implement ML in a single simulation system.

Unity is therefore an excellent option for ML integration into a simulation system, without needing to set up complex API calls.

The implementation of various components of the DEEv-MoS model is important, especially in the context of the development platform that has been chosen. Section 13.3 covers the implementation of the DEEv-MoS model using the Unity game engine.

13.3 Implementation of DEEv-MoS prototype

The DEEv-MoS model comprises a number of important components, each of which was detailed in Chapters 8 to 12. These are:

• the DEEv-MoS digital environment,
• the DEEv-MoS entity component system,
• the DEEv-MoS predictive modelling engine,
• the DEEv-MoS constraints engine, and
• the DEEv-MoS events engine.

All of the above components need to be implemented in the Unity game engine in order to create a working DEEv-MoS prototype system. This section will consider each component, describing how it is implemented and any considerations required for the prototype system. Figure 13.5 shows a simulation run using the DEEv-MoS prototype system with annotations.

Figure 13.6 depicts the high-level structure of the DEEP prototype system, indicating entities, components and systems in terms of Unity.

The DEEv-MoS digital environment will be covered first.


Figure 13.5 Sample run of a simulation with the DEEP prototype system.

Figure 13.6 High-level structure of DEEP prototype in terms of Unity.

13.3.1 The DEEv-MoS digital environment

The basis of the DEEv-MoS model is the DEEv-MoS digital environment. It is the container in which the simulation will run and in which all the various entities of the model will exist.

The digital environment will exist as a scene created in the Unity platform. This will allow for it to be run as its own individual sandbox environment.

Scenes in Unity are files which are used to represent a unique environment, in which game objects and backgrounds are placed. Scenes can be of two different types:

• 2-dimensional (2D), and
• 3-dimensional (3D).

The main differentiator here is the number of axes the environment makes use of in space. 2D environments can be likened to Cartesian planes, where there are only the X and Y axes, whereas 3D environments make use of a third axis, the Z axis.

For the DEEv-MoS prototype system, there is no need to create a 3D environment, so the digital environment will be represented using a 2D scene.

By creating a 2D scene, the prototype system will be able to visualise entities in the system on a Cartesian coordinate system, allowing different entities to exist at different points on the plane.


The use of a Cartesian coordinate system will allow the DEEP prototype to visualise a number of different entities in a defined space, where they can move based on manipulations of their location coordinates. In this way, intelligent agents will be able to roam the space, which will drive part of their behaviour, while other entities may be static with regard to location.

The next section will describe the implementation of the DEEv-MoS ECS, considering its entities, components and systems.

13.3.2 The DEEv-MoS entity component system

The DEEv-MoS entity component system is the component of the model that is responsible for the data-oriented approach to implementing the DEEv-MoS model. It focusses on defining digital entities through components, which give the entities their properties, and on systems that execute only against entities possessing particular components in order to perform their designated tasks.

It is therefore important to define the implementation of each of the three parts of the ECS: entities, components and systems. Unity already provides an ECS, so the details of implementing the ECS itself are unnecessary. However, the implementation of each part in Unity is still important.

Each part of the ECS is described below with regard to how it is implemented in Unity.

13.3.2.1 Components

Components are the building blocks of entities in an ECS and give entities their properties. The entities in turn create an environment where interactions and behaviour can be modelled. The components supported in the DEEv-MoS ECS are tailored to give entities properties that are fitting for modelling the digital entities in the DE, while other components are ones that are required by Unity in order for the entities to be visualised.

Components will be considered in two categories for implementing the DEEv-MoS ECS, namely Unity components and custom components. The Unity components that will be used for each entity in the digital environment will be discussed first.

13.3.2.1.1 Unity components

In Unity there are a number of out-of-the-box components that are widely used in all projects, as they provide basic functionality to objects and the environment itself, allowing the systems to execute against them and create a visual representation of the designed environment.

These out-of-the-box components are generally concerned with basic ‘physical’ properties of game objects and describe the object’s position, size and appearance, and how it can be interacted with in the game environment.


The main out-of-the-box components are:

• the transform component,
• the collider component, and
• the sprite renderer component.

The most basic and mandatory component in Unity is the Transform component. A game object cannot be created without having this component, as it defines the object’s position, scale and rotation in the game environment.

The properties of the Transform component are defined using float-type numbers, and can also be updated programmatically by various systems.

The Collider component is used to give a game object a collidable surface so that it can interact with other objects. The collider component defines an invisible shape around the object that specifies its collidable edges, and is often paired with a Rigidbody component which allows the object to interact with physics (2D or 3D).

The properties for the Collider component define the invisible shape of the collider and how it reacts to collisions, and can have a Rigidbody component attached, as well as a Physics material to define collision interaction based on the object’s base material.

The Sprite renderer is used to give 2D game objects a physical appearance through the use of sprites. Sprites are textures or images that are used to define the physical appearance of an object. In a 2D project this component allows for game objects to be seen in the scene.

The Sprite renderer can be altered with different images or textures to change the appearance of an object. A Sprite editor exists that allows for the editing and creation of sprites.

13.3.2.1.2 Custom components

To enable the realistic modelling of the various digital entities in the Digital Environment, custom properties need to be defined that will allow for the entities to be replicated effectively. In Unity this is achieved through the creation of custom components where the user defines the components’ properties and values.

Each of the custom components needed for the DEEv-MoS entity component system enables the system to describe different entities correctly, and allows systems to interact with them so as to create the desired behaviours in the system as a whole.

The custom components that are needed for the DEEv-MoS entity component system are listed below, along with a description of the component’s purpose:

• Data component: this component serves to describe the data that an entity can possess, including the type of data and size limits.


• Computational component: this component defines the computational power of an entity.
• Storage component: this component describes the storage capacity of an entity.
• Data behaviour component: this component describes the data behaviour of an entity: how it interacts with data and the frequency thereof.
• Connectivity component: this component defines the connectivity capability of an entity, including the type of connectivity.
• Connections component: this component defines the connections that an entity has with other entities.
• Devices component: this component defines the devices and types of devices belonging to entities.
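As an illustration, the custom components above can be expressed as plain data records, in keeping with the data-only nature of ECS components. The following is a hedged Python sketch: the field names and units are assumptions chosen for illustration, not the prototype's actual C# component definitions.

```python
from dataclasses import dataclass, field

@dataclass
class DataComponent:              # data an entity can possess
    data_type: str
    size_gb: float

@dataclass
class ComputationalComponent:     # computational power of an entity
    power_gflops: float

@dataclass
class StorageComponent:           # storage capacity of an entity
    capacity_gb: float
    used_gb: float = 0.0

@dataclass
class DataBehaviourComponent:     # how, and how often, an entity touches data
    creation_rate: float          # share of interactions that create data
    consumption_rate: float       # share of interactions that consume data

@dataclass
class ConnectivityComponent:      # connectivity capability, including type
    link_type: str                # e.g. "wired" or "wireless"
    speed_mbps: float

@dataclass
class ConnectionsComponent:       # connections this entity has to others
    connected_ids: list = field(default_factory=list)

@dataclass
class DevicesComponent:           # devices belonging to an entity (agent)
    device_ids: list = field(default_factory=list)
```

Because components hold no logic, each record is just a typed bag of values that the systems described later can read and update.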

Components are used to give properties to entities. The entities themselves are only high-level containers that define an archetype of a game object. They are important, however, in creating game objects with very specific properties or behaviours through the combination of various components.

The entities of the DEEv-MoS prototype system will be discussed in the following sub-section.

13.3.2.2 Entities

The entities of the DEEv-MoS digital environment are created in the prototype system by making use of Unity entities. These entities are merely containers with a unique identifier to which components can be associated, and each entity is treated as its own game object.

The entities that are used in the DEEP prototype system mirror those that were defined in the DEEv-MoS digital environment component in Chapter 8. The six entities are:

• network node/server,
• network path,
• fixed endpoint device,
• wireless endpoint device,
• data, and
• agent (user).

Each of these entities represents an entity archetype which is composed of a set combination of components. The components define the properties of the entities and what behaviours they have. Figure 13.7 depicts the visual representations of the entities in the DEEP prototype system, including agents, wireless devices, wireless network paths, fixed devices, wired network paths and servers. Network paths are represented by straight lines connecting entities, where black lines represent a fixed or cable connection, and white lines represent a wireless connection.

Figure 13.7 DEEP prototype system ECS entity representations: (from left to right) agent, wireless device, wireless network path, fixed device, wired network path and server.

Below, each entity archetype is listed along with the components of which it is composed:

• Network node/server
  o Transform component
  o Collider component
  o Sprite renderer
  o Storage component
  o Computational component
  o Connections component
  o Data component
  o Data behaviour component
• Network path
  o Transform component
  o Collider component
  o Sprite renderer
  o Connectivity component
  o Connections component
• Fixed endpoint device
  o Transform component
  o Collider component
  o Sprite renderer
  o Storage component
  o Computational component
  o Connections component
  o Data component
  o Data behaviour component
• Wireless endpoint device
  o Transform component
  o Collider component
  o Sprite renderer
  o Storage component
  o Computational component
  o Connections component
  o Data component
  o Data behaviour component
• Data
  o Transform component
  o Collider component
  o Sprite renderer
  o Data component
• Agent (user)
  o Transform component
  o Collider component
  o Sprite renderer
  o Devices component
  o Data behaviour component

Through the use of the above entity archetypes, the six main entities in the DE can be modelled and instantiated into numerous game objects, taking into account specific properties and behaviours which will allow for more accurate predictions and simulations.
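The archetype-to-component mapping above can be transcribed directly into data, which is one reason the data-driven approach makes instantiating many game objects cheap. The Python sketch below is purely illustrative (component names are abbreviated strings; the real prototype uses Unity component types):

```python
# Components shared by every archetype in the list above.
COMMON = ("transform", "collider", "sprite_renderer")

# The six DEEP entity archetypes, transcribed from the list above.
ARCHETYPES = {
    "network_node_server": COMMON + ("storage", "computational",
                                     "connections", "data", "data_behaviour"),
    "network_path":        COMMON + ("connectivity", "connections"),
    "fixed_endpoint":      COMMON + ("storage", "computational",
                                     "connections", "data", "data_behaviour"),
    "wireless_endpoint":   COMMON + ("storage", "computational",
                                     "connections", "data", "data_behaviour"),
    "data":                COMMON + ("data",),
    "agent":               COMMON + ("devices", "data_behaviour"),
}

def instantiate(archetype):
    """Create a bare entity (a dict) carrying the archetype's components,
    each starting with empty property data."""
    return {name: {} for name in ARCHETYPES[archetype]}
```

Instantiating an archetype then amounts to stamping out the listed components, after which systems can fill in and act on their data.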

Having entities and components defined helps to shape the environment. However, it is the application of systems to the entities that drives interaction and change. The following subsection will discuss the various systems of the DEEv-MoS ECS and what they are responsible for.

13.3.2.3 Systems

Systems contain the logic code that is executed against the data contained in entities and components. They are an integral part of any ECS, as they drive action and change, without which the ECS would merely be a collection of data grouped by function.


Each system in the DEEv-MoS entity component system is responsible for a different aspect of running the prediction and simulation of the DE. They therefore interact with different entity archetypes based on the components they are composed of.

The systems in the DEEv-MoS entity component system separate execution code into modular sections that affect only specific components, while allowing the system to execute its code against all entities globally that contain those components. This makes the model extremely efficient for environments with large numbers of game objects.

The systems in the DEEv-MoS entity component system, just as with the components, can be considered in two categories: Unity systems and custom systems. Unity systems will be listed and described first.

13.3.2.3.1 Unity systems

The systems in the DEEv-MoS ECS include a number of out-of-the-box systems from Unity that affect the out-of-the-box components. These systems provide commonly needed general functionality, and are:

• the transformation system,
• the collider system, and
• the sprite renderer system.

Each of these systems affects specific components in the ECS and is tasked with carrying out particular functions. These are as follows:

• Transformation system
  o Function: This system is responsible for the management of an entity’s positional properties. The transformation system applies transformations to entities, affecting their location and/or scale. This system allows for entities to be placed in particular locations and to be moved or scaled.
  o Components affected:
    . transform component
• Collider system
  o Function: This system is responsible for the management of collisions between entities in the environment. The collider system allows entities with defined collision shapes to interact with one another in the environment. This system enables physics to affect entities and allows for the visual interaction of entities.
  o Components affected:
    . collider component


• Sprite renderer system
  o Function: This system is responsible for the rendering of 2D sprites in the environment. The sprite renderer system applies a visual 2D appearance to an entity, allowing it to be seen in the scene and differentiated from other entity types. This system enables the visualisation of the simulation.
  o Components affected:
    . sprite renderer component

These Unity systems provide important functionality in the creation of a simulation environment that can be visualised and presented to users. The custom systems of the DEEv-MoS entity component system are defined next.

13.3.2.3.2 Custom systems

Besides the out-of-the-box systems that come with Unity, there are also a number of custom systems designed to execute particular logic for the environment. These custom systems are the following:

• data system,
• device system,
• computation system,
• storage system,
• connectivity system,
• connections system,
• data behaviour system, and
• events system.

In ECS implementations, as stated previously, systems are designed to execute against only a subset of the entities in the environment, dependent on whether an entity has a component or components of a particular type. It is important to understand which component types are affected by which systems, to ensure that the data of entities and entity archetypes is correctly executed against while the application runs.

Below, each of the listed systems in the DEEv-MoS entity component system will be detailed further, describing the purpose of the system and listing the components that are affected.

• Data system
  o Function: This system is responsible for the creation, deletion and alteration of data items in the environment. It also facilitates the transmission of data from one entity to another (e.g. from a network node to a fixed endpoint device). This system allows for a dynamic data landscape to exist, where data grows and is moved around and/or duplicated.


  o Components affected:
    . data component
    . data behaviour component
• Device system
  o Function: This system is responsible for the management of devices for agents in the environment. It allows agents to acquire and make use of new devices, as well as get rid of old devices.
  o Components affected:
    . devices component
• Computation system
  o Function: This system is responsible for performing computations on data items in the environment. This system makes use of a device’s computational power to process data in performing tasks and can include the creation or duplication of data.
  o Components affected:
    . computation component
    . data component
• Storage system
  o Function: This system is responsible for the management of the data storage of devices in the environment. The storage system allows for devices to store new data items and delete unneeded existing data items. This system makes use of data types and size to determine storage use.
  o Components affected:
    . storage component
    . data component
• Connectivity system
  o Function: This system is responsible for the management of connectivity between devices in the environment. The connectivity system dictates the data speeds of connections between various devices based on the network path and its type.
  o Components affected:
    . connectivity component
• Connections system
  o Function: This system is responsible for the management of connections to devices in the environment. The connections system keeps track of connections to devices and allows for the creation of new connections. This system allows for the shaping of the environment’s network structure.
  o Components affected:
    . connections component


• Data behaviour system
  o Function: This system is responsible for the data behaviour of agents and devices in the environment. The data behaviour system allows for agents and devices to carry out different data behaviours in their interaction in the environment. This system drives data interactions in the environment, shaping the data landscape.
  o Components affected:
    . data behaviour component
• Events system
  o Function: This system is responsible for the creation and management of events in the environment. The events system executes events that can affect all entities in the environment or all entities of a particular archetype. This system can alter the numeric properties of components and/or create or delete entities.
  o Components affected:
    . transform component
    . collider component
    . sprite renderer component
    . storage component
    . computational component
    . connections component
    . data component
    . data behaviour component
    . connectivity component
    . devices component
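To illustrate how one such custom system might operate, the following hedged Python sketch mimics the storage system: it skips entities without a storage component and evicts data items once a device's capacity is exceeded. The entity layout, field names and eviction policy are illustrative assumptions, not the prototype's actual logic.

```python
def storage_system(entities):
    """Sketch of a custom ECS system: manage each device's stored data,
    touching only entities that carry a storage component."""
    for entity in entities:
        storage = entity.get("storage")
        if storage is None:
            continue                              # lacks the component: skipped
        items = entity.setdefault("data", [])
        used = sum(item["size_gb"] for item in items)
        # evict the oldest data items until the device fits its capacity again
        while used > storage["capacity_gb"] and items:
            used -= items.pop(0)["size_gb"]
        storage["used_gb"] = used

device = {"storage": {"capacity_gb": 10.0},
          "data": [{"size_gb": 6.0}, {"size_gb": 3.0}, {"size_gb": 4.0}]}
agent = {"devices": []}                           # no storage component: untouched
storage_system([device, agent])
```

The device above starts 3 GB over capacity, so the oldest 6 GB item is evicted and the recorded usage drops to 7 GB, while the agent entity is never considered by the system.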

The custom systems in the DEEv-MoS entity component system provide functionality for the execution of the prediction and simulation of the DE. These systems encapsulate the capabilities of the defined DEEv-MoS model and create the dynamic interactions between entities in the environment.

Agents are important entities in the DEEP prototype system that drive the dynamic nature of the environment. The following subsection focusses on the agent entity and how agents are generated.

13.3.3 DEEv-MoS agents

Agents are key components of the DEEv-MoS model, as they are designed to model humans in the real world in terms of their interaction with data and the DE as a whole.

In the real world, the DE, its evolution and its growth have primarily been caused by human needs and interactions. Due to a greater need for communication, larger networks with more nodes were created, growing the DE’s infrastructure. The data landscape was also greatly influenced by humans, as the need to share and assimilate information called for more data to be stored on servers and local machines and for it to be duplicated across nodes in the networks.

It is obvious then that modelling humans is a very important part of the puzzle when attempting to predict and simulate the evolution of the DE. Not only is the modelling of individual people important, but so is the growth of the human population and the changes in behaviour in regard to data. This is due to the following:

• With a larger population, there are more actors capable of interacting and creating or consuming data.
• As technology and social interactions evolve, both the need for and the types of data behaviours change, affecting how people interact with the DE.

From the above considerations, there are two very important functions that need to be accounted for in the modelling of people through agents:

• reproduction, and
• behaviour.

The behaviour of agents drives the evolution of the DE through interactions. It is therefore important to model human behaviour in relation to data. For the purposes of the DEEv-MoS digital environment and agent entities, this will be considered using two types of behaviours:

• data creation behaviour, and
• data consumption behaviour.

Each of these affects how the agent will function in the digital environment, and will be described further in a sub-section to follow.

In order to effectively model population growth, it is important to model the reproduction of humans in the agent entities. This involves sexual reproduction between two people of opposite genders to create offspring that possess characteristics of their parents, but can also deviate from these through mutation. In the DEEv-MoS model, however, agents are not reproduced in this manner, as there are no genetic traits that need to be passed down through sexual reproduction.

Agents are rather generated based on a population growth rate and placed in the DEEv-MoS digital environment near existing agents or large servers, so as to effectively form new communities or grow existing ones. In this process, new agents are created that are capable of data creation and data consumption behaviours, without these being inherited genetically from parent agents.

There are two main reasons for this choice of method for introducing new agents into the environment:

• simplicity, and
• computational overhead.

Simplifying the generation of new agents in the prototype system allows the focus to be placed on the agents’ behaviour and their distribution across the digital environment. This ensures that realistic scenarios are created for digital environment evolution and that potential sources of inaccuracy are minimised.

Generating new agents through sexual reproduction in the DEEv-MoS prototype system would introduce further computational overheads that cannot be justified by their outputs. Selecting two agents from the population and having them reproduce in a manner that creates offspring with shared genes adds more computation and time than is necessary. Even though the overhead is small in each individual case, when performed thousands of times in a short space of time it adds up to a noticeable impact on the prototype system. It is therefore favourable to make use of a simple agent generation scheme, especially when a more complex and intensive generation scheme adds little to no benefit.

This leads to the DEEv-MoS agent generation process, defined at a high level as follows. Based on the current global agent growth rate (determined by the predictive modelling engine), N new agents are generated at a particular time step (i.e. per day, month or year). Each agent is assigned a value for its data creation and data consumption behavioural activities (doubles that sum to 1, signifying the proportion of time the agent performs each behaviour). Each agent has a small chance of being located in a less populated area, with a higher chance of at least being near a server; otherwise it is randomly placed near an existing larger community of agents. Each agent is then given a number of fixed and/or wireless devices, based on the current environment constraints on device possession for agents, which it can make use of to carry out its data behaviours.
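The generation process described above can be sketched as a short loop. This is illustrative Python only: the behaviour split, the 10% remote-placement chance and the device counts are hypothetical stand-ins for the values produced by the predictive modelling engine and the constraints engine.

```python
import random

def generate_agents(n, max_devices, rng):
    """Sketch of one agent-generation time step: n comes from the current
    agent growth rate; max_devices from the environment constraints."""
    agents = []
    for _ in range(n):
        creation = rng.random()
        agents.append({
            # the two behaviour shares are doubles that sum to 1
            "data_creation": creation,
            "data_consumption": 1.0 - creation,
            # small (here 10%) chance of placement away from a community
            "near_community": rng.random() >= 0.1,
            # devices assigned subject to the device-possession constraint
            "devices": rng.randint(1, max_devices),
        })
    return agents

agents = generate_agents(n=100, max_devices=3, rng=random.Random(42))
```

Every generated agent thus carries a complementary pair of behaviour shares, a placement flag, and a device count within the constrained range.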

By implementing the process described above, new agents are efficiently added to the digital environment along with devices that can store data and connect to the network. This drives the data creation, deletion and copying throughout the environment, which in turn affects the evolution and growth thereof.

In order to drive the evolution of the digital environment effectively, it is important that the various growth rates, behaviour frequencies and technological advancements can be determined at any point in time and accurately predicted, so that an accurate simulation can take place. This function is carried out by the predictive modelling engine, described in the following sub-section.

13.3.4 The DEEv-MoS predictive modelling engine

In Chapter 10, the DEEv-MoS predictive modelling engine was described, detailing the function of this component of the DEEv-MoS model and how it would affect the digital environment.

In the DEEP prototype system, the predictive modelling engine has been implemented in Unity to perform its function in the digital environment and drive the evolution thereof. The engine is responsible for predicting a number of important factors within the digital environment, which ultimately shape its growth and structure.

The factors that the DEEv-MoS predictive modelling engine is responsible for predicting are:

• agent growth rate,
• network growth rate,
• computational power,
• storage size, and
• network speed.

These five factors are important in determining the size of the digital environment at any given time, along with which data behaviours are performed and are effective. Agent growth rate is determined as the number of new agents added per simulation time step, which can occur on a daily, monthly or yearly basis. Network growth rate is determined as the number of new network devices added per time step (including servers and both fixed and wireless devices) in the simulation. Computational power, storage size and network speed are each determined as the maximum value for that factor at a point in time in the simulation, and affect the abilities of newly generated devices and network paths in the digital environment.

Each factor is predicted by making use of a machine learning model, implemented in TensorFlow, which is then made use of in the Unity game development environment. Initially, each model is trained using historical data up to the simulation initialisation point (a specified date). Thereafter, as the simulation runs, new data points are gathered at particular time intervals and used to retrain the models with the historical data along with the newly generated data from the simulation.
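The train-then-retrain cycle described above can be sketched as a simple loop. The sketch below is illustrative Python, not the prototype's TensorFlow code; `MeanModel` is a hypothetical stand-in for a trained predictor, and in the prototype the appended data points would be measurements gathered from the running simulation rather than the predictions themselves.

```python
class MeanModel:
    """Hypothetical stand-in predictor: 'training' just records the mean."""
    def fit(self, series):
        self.mean = sum(series) / len(series)

    def predict(self):
        return self.mean

def run_simulation(model, historical, steps, retrain_every):
    # train on historical data up to the simulation initialisation point
    model.fit(historical)
    observed = list(historical)
    predictions = []
    for t in range(steps):
        value = model.predict()        # predicted factor for this time step
        predictions.append(value)
        observed.append(value)         # stand-in for a newly gathered data point
        if (t + 1) % retrain_every == 0:
            # retrain on historical data plus newly generated simulation data
            model.fit(observed)
    return predictions

predictions = run_simulation(MeanModel(), [1.0, 2.0, 3.0],
                             steps=4, retrain_every=2)
```

The loop shows the key design choice: the model is never static, but is periodically refitted on the growing series of historical plus simulated observations.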

In Chapter 10, a comparison between two ML NN approaches was given, presenting the benefits and drawbacks of both extreme learning machines and recurrent neural networks. Based on the comparison, the ELM was selected as the approach of choice for the DEEv-MoS predictive modelling engine. However, while the implementation of the DEEP prototype was underway, it became apparent that the simplicity of the ELM’s structure stands in contrast to the complexity of its implementation.

Figure 13.8 High-level graph structure of DEEP RNN.


Figure 13.9 DEEP RNN node structure.


Figure 13.10 DEEP RNN hidden layer 1 node expanded.


Figure 13.11 DEEP RNN hidden layer 2 node expanded.

For this reason, the DEEP prototype instead makes use of an RNN for predicting the future values of factors in the digital environment. Figures 13.8 to 13.11 illustrate the structure of the RNN used in the DEEP prototype, visualised using TensorBoard.


The specific RNN structure used is as follows:

 Hidden layer 1: 64 nodes.
 Hidden layer 2: 32 nodes.
 Hidden layer node type: long short-term memory.
 Activation function: rectified linear unit (ReLU).
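The thesis does not reproduce the TensorFlow code itself; purely as an illustration, a NumPy sketch of the forward pass of a stacked LSTM of this shape (64 then 32 units, with a ReLU on the output) could look as follows. All weights here are random placeholders rather than trained values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_layer(xs, n_hidden, rng):
    """Forward pass of one LSTM layer over a sequence xs of shape (T, n_in)."""
    n_in = xs.shape[1]
    # Combined weights for the input (i), forget (f), cell (g) and output (o) gates.
    W = rng.standard_normal((4 * n_hidden, n_in + n_hidden)) * 0.1
    b = np.zeros(4 * n_hidden)
    h = np.zeros(n_hidden)
    c = np.zeros(n_hidden)
    out = []
    for x in xs:
        i, f, g, o = np.split(W @ np.concatenate([x, h]) + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # update cell state
        h = sigmoid(o) * np.tanh(c)                    # new hidden state
        out.append(h)
    return np.array(out)

rng = np.random.default_rng(0)
seq = rng.standard_normal((10, 1))         # ten time steps of one factor
h1 = lstm_layer(seq, 64, rng)              # hidden layer 1: 64 LSTM nodes
h2 = lstm_layer(h1, 32, rng)               # hidden layer 2: 32 LSTM nodes
prediction = max(0.0, float(h2[-1] @ rng.standard_normal(32)))  # ReLU output
```

In the actual prototype an equivalent model is built and retrained in TensorFlow as new simulation data arrives, rather than evaluated with random weights.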

The RNN model used in the DEEP prototype is able to predict future environment factors fairly accurately. This can, at least partially, be attributed to the suitability of RNNs for time series prediction. Further details regarding the performance of this RNN are presented in Chapter 14.

Making predictions of the important factors in the DEEv-MoS predictive modelling engine is important. However, the resultant predictions need to be applied in the prototype system. This task is carried out by the DEEv-MoS constraints engine, which is discussed in the next sub-section.

13.3.5 The DEEv-MoS constraints engine

The DEEv-MoS constraints engine is the part of the DEEv-MoS model responsible for maintaining and enforcing constraints on the digital environment. Chapter 11 discussed the DEEv-MoS constraints engine with regard to the model as a whole, breaking down the various constraints into groups and defining how this component functioned with the other components.

In the DEEP prototype system, the constraints engine is implemented in the Unity game development environment and consists of two distinct parts:

 constraints variables, and
 constraints update code.

The constraints variables are global variables that exist in the DEEP prototype system, which are used when generating new entities and performing various actions. These variables define the limits for technological components of entities in the DEEv-MoS entity component system, as well as the growth rates of the different entities.

The constraints update code is the part of the constraints engine that maintains the constraints variables and performs changes to their values. This occurs as a result of new information coming from the predictive modelling engine so as to shape the digital environment.

In Unity, the constraints update code is implemented as a MonoBehaviour which is attached to an empty game object. A MonoBehaviour is the base class used in Unity from which all scripts derive. Scripts in Unity handle the main execution of a program and come with the ability to start, stop, update, enable and disable entities in the given program. Scripts do not need to be invoked explicitly, but rather run continually from application start (Unity Technologies, 2019). This allows for the creation of executable code that is run on every frame update or at a particular time step. In the case of the DEEP prototype system, update code is run on time steps, since running it on every frame would be a wasteful use of resources.
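The MonoBehaviour itself is written in C#; the time-step gating it performs can be sketched language-neutrally as an accumulator that fires the update logic only once a full simulated time step has elapsed, rather than on every frame. All names here are illustrative:

```python
class ConstraintsUpdater:
    """Illustrative time-step gate: runs update logic once per step, not per frame."""

    def __init__(self, step_seconds):
        self.step_seconds = step_seconds
        self.elapsed = 0.0
        self.updates_run = 0

    def on_frame(self, delta_time):
        """Called every rendered frame with that frame's delta time."""
        self.elapsed += delta_time
        while self.elapsed >= self.step_seconds:
            self.elapsed -= self.step_seconds
            self.run_step()

    def run_step(self):
        # Here the real system would refresh the constraints variables
        # from the latest predictive modelling engine output.
        self.updates_run += 1

gate = ConstraintsUpdater(step_seconds=1.0)
for _ in range(8):          # eight frames of 0.25 s cover two full time steps
    gate.on_frame(0.25)
```

In Unity the equivalent accumulation would typically use the frame delta time supplied by the engine inside the script's per-frame update callback.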

The constraints engine ensures that other components of the DEEv-MoS model have the correct constraints in which to operate. These components, such as the DEEv-MoS events engine, the DEEv-MoS digital environment and the DEEv-MoS entity component system, are therefore dependent on the constraints engine.

By making use of Unity’s built-in ECS system and having the Constraints Engine run as a system, it is ensured that it can impact global variables and be executed on a regular basis without requiring it to be called from any specific entities. In doing so, the Constraints Engine can guarantee that global constraints used by other components of the prototype system are always up to date and reflective of the predicted change in the digital environment.

Constraints are useful for giving the prototype system bounds within which to operate. However, a separate component of the DEEv-MoS model actually carries out events that impact the evolution of the digital environment. The Events Engine is tasked with this job and is covered in the next sub-section.

13.3.6 The DEEv-MoS events engine

The DEEv-MoS events engine, as detailed in Chapter 12, is the component of the DEEv-MoS model that is responsible for actually carrying out the generation, termination and modification of entities in the DEEv-MoS digital environment.

Events that occur in the DEEv-MoS model can be categorised into three main categories:

 population events,
 social events, and
 technological events.

Each of these corresponds to the creation, deletion or modification of particular entities in the DEEv-MoS model.

From an implementation point of view, the events engine, as with the constraints engine, is implemented as a MonoBehaviour attached to an empty game object in the prototype system. This guarantees that the events engine code can be executed at any time interval update and affect the prototype system as a whole.

The execution of the events engine brings the prototype system in line with the predictions of the predictive modelling engine and the subsequent constraints from the constraints engine. The events engine generates events which drive changes specified by the predictive modelling engine, while making sure to keep aspects of the system within the bounds specified by the constraints engine.

The events that the events engine creates will affect a single aspect or type of entity in the prototype system based on the event category, as mentioned previously. The events for each category are listed below along with their affected entity type and purpose, starting with population events.

Population events are designed to adjust the growth rate of agent numbers in the environment and affect overall behaviour change.

 Population events:
o Entity type affected: agent
o Agent generation: This event generates N new agents in the environment, in line with the global agent growth rate.
o Agent termination: This event terminates N agents over a specific age to maintain a younger agent pool and adjust overall behaviour dynamics.
o Agent modification: This event modifies all agents in the environment to mimic changes of a new behaviour that is becoming common. This event adjusts agent behavioural frequency values.

Social events affect individual agents from a behavioural standpoint, changing which other agents they interact with, along with behavioural frequencies.

 Social events:
o Entity type affected: agent
o Adjust behavioural frequency: This event changes the behavioural frequency values for an agent to change how it interacts with the digital environment.
o Adjust interaction locality: This event alters the distance over which an agent will interact with others, with greater distances allowing for cross-community interaction and shorter distances involving interaction with local agents only.

Technological events affect the underlying technology in the DEEv-MoS digital environment, changing the environment structure and what is possible with its devices and data.

 Technological events:
o Entity type affected: servers, devices and network paths
o Device generation: This event adds new devices to the digital environment, keeping them in line with current technological constraints.
o Storage modification: This event modifies the storage capacity of a given set of devices to mimic technology upgrades.
o Network speed modification: This event modifies the network speed of specified network paths, mimicking the upgrade of communication infrastructure.
o Data size modification: This event modifies the general size of data files that are generated and sent between devices. Doing so mimics the growth of data size as storage and network speeds increase.

Each of these events occurs in response to the global growth rates for agents and networks, as well as when environment constraints change. As events are created and executed, the structure of the environment changes along with the capabilities and behaviours of its components, ultimately realising the evolution of the digital environment.
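As an illustration of how the events engine might translate the global growth rates into discrete events of the categories listed above, consider the following sketch. The event names mirror the lists above, but the structure and rate values are hypothetical rather than taken from the DEEP source:

```python
def population_events(agent_count, agent_growth_rate):
    """Emit one generation event per new agent implied by the growth rate."""
    new_agents = int(agent_count * agent_growth_rate)
    return ["agent_generation"] * new_agents

def technological_events(device_count, network_growth_rate):
    """Emit one generation event per new device implied by the growth rate."""
    new_devices = int(device_count * network_growth_rate)
    return ["device_generation"] * new_devices

# One time step with hypothetical 10% agent and 50% network growth rates.
events = population_events(20, 0.10) + technological_events(3, 0.50)
```

Each emitted event would then be executed against the environment, subject to the bounds held by the constraints engine.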

Figures 13.12 and 13.13 illustrate the results of the events engine generating events to introduce changes to the prototype system, where the former is pre-event and the latter is post-event.

13.4 Conclusion

Chapter 13 has described the implementation of the DEEP prototype system, discussing the platform used for the prototype development and why it was chosen, the implementation of the DEEv-MoS digital environment, as well as the implementation of the different parts of the model, namely the DEEv-MoS entity component system, DEEv-MoS predictive modelling engine, DEEv-MoS constraints engine, and DEEv-MoS events engine.

The Unity Game Development Engine was discussed, highlighting the important benefits that could be gained by making use of it, and how well it synergises with the DEEv-MoS model defined in Chapter 7. The use of the Unity game development environment was motivated based on its advantages for implementing a multi-agent ECS system.

The implementation of each component of the DEEv-MoS model was described next, starting with the DEEv-MoS digital environment. The base properties of the environment with regard to a Unity prototype system were defined, choosing to represent the system in a 2D format.

Next, the DEEv-MoS entity component system was detailed, defining the entities that would be modelled as game objects in Unity, along with the components that they would have associated with them to provide properties. The systems that would affect the various components were discussed next, differentiating between pre-existing Unity systems and custom DEEv-MoS systems.


Figure 13.12 Small DEEv-MoS digital environment on initialisation.


Figure 13.13 DEEv-MoS digital environment after events that introduced further agents, devices and network paths.

The DEEv-MoS predictive modelling engine was covered next, detailing the implementation of the ML approach chosen for the DEEP prototype. The main factors in the DEEP prototype that need to be predicted on an incremental basis were provided along with their impact on the digital environment.

Thereafter, the implementation of the DEEv-MoS constraints engine was described, highlighting the different types of constraints that need to be kept track of in the DEEP prototype. The application of the different constraints to the digital environment and other components of the DEEv-MoS model was provided.

Lastly, the DEEv-MoS events engine was covered, describing the types of events it effects on the digital environment as well as the impact and intended outcome of the events. For each event type, the targeted digital environment entities were listed.

The information presented in this chapter is of relevance to the following secondary research questions:

SRQ2: Can an MAS be used to simulate the various components of a digital environment?

SRQ3: Can an entity component system design be used effectively to represent a digital environment and its constituent entities?

SRQ4: Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?

SRQ5: Can ML, in particular extreme learning machines, be used to predict and drive changes in a digital environment to accurately reflect its evolution?

SRQ6: How can existing AI algorithms be used to evolve entities in a digital environment in a manner analogous with how non-natural entities evolve?

The DEEP prototype system was implemented making use of agents as an integral component of the system as a whole. Numerous agents exist in the digital environment interacting with other entities, and in doing so effecting change. This addresses SRQ2 adequately.

The DEEv-MoS model defined the use of an entity component system as the core architecture for defining a digital environment and constituent entities. The DEEP prototype realises the ECS through the use of Unity DOTS, forming a solid basis for the rest of the DEEv-MoS components to be built on. Through the use of ECS entities, components and systems, an accurate representation of a digital environment has been created in DEEP, answering SRQ3.

The agents in the DEEP prototype system are designed to represent users in the real-world DE, whereby they interact with the various entities in the environment. Through the interaction of agents in the digital environment, change is effected, both from a behavioural and a structural point of view. This change is analogous to evolution in the DE, answering SRQ4.

The predictive modelling engine, which is used in the DEEP prototype to predict changes in important factors, is a crucial component of the DEEv-MoS model. Through the use of ML in the predictive modelling engine, constraints are updated that determine how the digital environment is able to change at any point in time. SRQ5 is addressed by the fact that change in the DEEP prototype is initiated by the predictions of the predictive modelling engine.

The technological entities in the DEEP prototype do not interact with other entities nor change of their own volition. Rather, this process is initiated by agents and the predictive modelling engine, constraints engine and events engine. Through the interaction of these three DEEv-MoS components, changes to the digital environment are made, which include changes to technological entities. The changes enacted by the DEEv-MoS components can be equated to the incremental evolution of technological entities, addressing SRQ6.

Part Four continues with Chapter 14, where the testing of the DEEP prototype is detailed, as well as the results thereof. The metrics used for testing are described along with their relevance and how they should be interpreted.



14 Results

In the previous chapters of this thesis, a problem was presented along with a proposed solution. The hypothesis for the solution was supported by a literature review of a number of important and relevant related fields.

Thereafter, a model was proposed as a solution, defining each component of the model and the entities, properties and events involved. The model was then realised through the creation of a prototype system that implemented the model in a concrete application so that it could be used to run tests and gain insight into the model’s effectiveness.

Chapter 14 is focussed on the results of DEEP testing, what metrics were used, and how to evaluate the model. Later on, a critical evaluation of the research performed in this thesis is presented, summarising the research done, what contribution the thesis has made to the field and what improvements and future work could be carried out.

Lastly, a conclusion for the thesis is presented, summarising the findings and the knowledge discovered and ultimately providing an answer to the research questions posed in Chapter 1.

First, the results of the DEEv-MoS prototype system will be presented, along with how the testing was set up and executed, and what metrics were tracked.

14.1 Results

Chapter 13 presented the implementation of the DEEP prototype system for the DEEv-MoS model, which was created using the Unity game development environment. Once all of the DEEv-MoS model components were successfully implemented in the prototype system, it was important to determine the effectiveness of the prototype system as a whole.

To do this, the prototype needed to be adjusted for testing purposes so that multiple test runs could be performed and the results evaluated against given metrics. The adjustments included: refactoring the main code loop to not terminate after one full execution but rather reset to begin the next run, disabling rendering of the visual environment (to improve speed), and writing results to a single output file rather than individual files per run. The metrics chosen needed to cover multiple aspects of the prototype's effectiveness in running the simulations accurately.

This section will cover the testing of the DEEv-MoS prototype system and the results thereof in detail, covering the setup, metrics and actual results. Firstly, the testing setup will be described in the following section.


14.1.1 Testing setup

When performing tests to determine performance and accuracy for a given system, it is important to have a clear definition of what the testing includes, how the environment is initialised, as well as what hardware was used.

This section describes how the testing of the DEEv-MoS prototype system was set up and carried out.

14.1.1.1 Machine configuration

The computer used to run the simulations can impact the performance and results of the prototype system. Therefore, in order to allow for repeatability, the specifications of the testing computer are recorded below.

Hardware specifications:

 Processor: Intel Core i7-3930K 3.2 GHz CPU
 Memory: 32 GB (2 x (2 x 8 GB Corsair Vengeance LP, Quad Channel DDR3 1600, DRAM Frequency 666.5 MHz, CAS Latency 9))
 Graphics card: NVIDIA GeForce GTX 1050 Ti
 Motherboard: Intel DX79TO Quad Channel
 Power supply: Corsair VS650, 650 W
 Hard drive: SanDisk 256 GB SSD + Seagate 1 TB (7200 rpm)

Software specifications:

 Windows 10 Pro 64-bit
 Unity Personal version 2019.2.0f1
 .NET Framework v4.0.30319 (Build Engine v4.8.3752.0)

Note: due to the computer system used being of a higher specification than average, the prototype system may run with deteriorated performance on lower spec machines.

14.1.1.2 Prototype setup

In order for tests to be run in a consistent and repeatable manner, the prototype was adjusted to allow for the configuration of initialisation information. The most important parts of this are the start date of the simulation, the starting agent population size and the starting server count.

By running tests using the same initialisation configuration each time, an average can be determined over a number of runs. The duration parameter determines how long the simulation runs, where, depending on the set granularity, a given number of days, months or years pass in each second of the simulation. Comparisons are made between the metrics at the start and end of the simulation to determine the performance of the DEEP prototype.


The following initialisation parameters were used for the prototype test runs:

 Duration: 5 years
 Agent population: 20
 Number of servers: 3

The above parameter values were selected in accordance with the performance of the DEEP model for simulations, as well as to provide a consistent starting point. To gauge the growth of a digital environment effectively, the simulated period needs to be sufficiently long for interactions and events to have a noticeable effect. Significant technology growth in various areas has been shown to occur roughly every 18 months; by using a five-year period, a number of these events can be factored into a single simulation run. This is balanced against the ability of the DEEP system to continue growing and adding further entities and interactions without drastically slowing down. Starting with a smaller initial population of agents and servers allows for smoother and more performant simulations.

Note: even though in the real world the actual human population and server count at any point in time would be multitudes greater, for the sake of performance and simplicity a representative number is used for agents and servers and at the end of the simulation these representative numbers can be converted to the real-world numbers that they represent. The representative numbers for each are as follows:

 1 DEEP agent = 4000 real world people
 1 DEEP server = 30 real world servers
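This conversion is a simple scaling; the sketch below applies the stated representative numbers to the initial test configuration (the function name is illustrative):

```python
PEOPLE_PER_AGENT = 4000    # 1 DEEP agent = 4000 real-world people
SERVERS_PER_SERVER = 30    # 1 DEEP server = 30 real-world servers

def to_real_world(agents, servers):
    """Convert simulated counts back to the real-world numbers they represent."""
    return agents * PEOPLE_PER_AGENT, servers * SERVERS_PER_SERVER

# The initial test configuration: 20 agents and 3 servers.
people, real_servers = to_real_world(20, 3)
```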

The above setup allows for a simulation of a five-year growth period in the digital environment that is repeatable. At the end of each test run, a number of metrics are captured and written to a file for the specific run. These metrics are separated into two groups as follows:

 Input parameters:
o starting agents
o starting devices
o duration
 Output results:
o ending agents
o ending devices
o number of network nodes
o average degree of nodes
o average shortest path
o distribution of nodes’ degrees


These outputs allow for the derivation of important metrics that can be used to determine the results of a simulation run. The result metrics along with some of the recorded metrics will be explained further in the next section.

14.1.2 Testing metrics

To effectively test the performance of the DEEv-MoS prototype system, it is important to evaluate the output of each run against a set of criteria and definitions. First the criteria and their utility are given, along with definitions for any metrics that are recorded per run.

14.1.2.1 Average degree

When referring to networks, the degree of any node in the network is the number of nodes connected to it by an edge. The average degree of the network is the average of the degrees of all nodes in the network, indicating how well connected nodes in the network are.

Higher degree nodes are favourable, as they provide many routes to other nodes in the network. Very highly connected nodes, which therefore have a high degree, are often referred to as ‘hubs’.
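Since each edge in an undirected network contributes to the degree of both of its endpoints, the average degree is simply 2E/N for E edges and N nodes. A minimal sketch (the example network is hypothetical):

```python
def average_degree(edges, n_nodes):
    """Average degree of an undirected network.

    Each edge adds one to the degree of both endpoints,
    so the average degree is 2 * E / N.
    """
    return 2 * len(edges) / n_nodes

# A four-node network in which node 0 is a small 'hub' connected to all others.
edges = [(0, 1), (0, 2), (0, 3)]
avg = average_degree(edges, 4)   # 2 * 3 / 4 = 1.5
```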

14.1.2.2 Growth rate

A very important part of the simulation run is to determine the growth rate of the agent population and compare it to the growth rate for internet users over the same period in history. If the results from the test run come close to the historical growth rate, then the run was a success with regard to this criterion.

The benchmark internet user growth rate that will be used to compare test runs against is for the period 2000 – 2005, during which, according to Roser, Ritchie and Ortiz-Ospina (2019), the internet user population grew from 412.8 million people in 2000 to 1.026 billion people in 2005.

This represents a target growth rate of 149.02%.
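Percentage growth over a period is 100 × (end − start) / start. A one-line sketch, checked here with round, hypothetical figures rather than the cited internet user counts:

```python
def growth_rate(start, end):
    """Percentage growth over a period: 100 * (end - start) / start."""
    return 100.0 * (end - start) / start

# Hypothetical round numbers: growing from 400 million to 1 billion users
# would be 150% growth.
rate = growth_rate(400e6, 1000e6)
```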

14.1.2.3 Small world networks

Small world networks are a class of real-world networks in which the average shortest path is far lower than in a regular latticed network, while retaining the high clustering of nodes characteristic of latticed networks (Telesford et al., 2011).

To consider the “small-worldness” of a network it is important to measure some key metrics:

 shortest average path length, and
 clustering.


The shortest average path is the average of the shortest paths between all nodes in the network. The smaller this value, the more connected a network is, while larger values represent a network that is widely dispersed and has few highly connected hub nodes.

Clustering refers to the density of nodes in an area, along with how well they are connected with their neighbours. Smaller numbers here indicate less closely clustered and connected neighbouring nodes, whereas larger numbers represent highly clustered nodes that share many common neighbours.

Together these values can be used to determine how real-world-like a network is, as the internet is considered a small world network. In some cases, researchers have obtained a “small-worldness” value, which is calculated using the average shortest path and clustering and comparing them to the equivalent metrics for a random network and latticed network of the same size (Humphries & Gurney, 2008).
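The Humphries and Gurney (2008) index can be written as σ = (C / C_rand) / (L / L_rand), where C and L are the network's clustering and average shortest path, and C_rand and L_rand are those of a size-matched random network. A small sketch, with hypothetical metric values:

```python
def small_worldness(clustering, path_len, clustering_rand, path_len_rand):
    """Humphries & Gurney (2008) small-world index:

    sigma = (C / C_rand) / (L / L_rand)
    Values greater than 1 indicate a small-world network.
    """
    return (clustering / clustering_rand) / (path_len / path_len_rand)

# Hypothetical metrics: clustering far above, path length close to,
# a size-matched random network.
sigma = small_worldness(clustering=0.4, path_len=3.0,
                        clustering_rand=0.1, path_len_rand=2.5)
```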

14.1.3 Testing results

The results of the test runs for the DEEv-MoS prototype system were recorded and distilled to include the following metrics per run:

 growth rate,
 average degree,
 average shortest path,
 clustering,
 “small-worldness”, and
 digital environment accuracy rating (DEAR): this is a combination of the above five metrics to simplify comparison. DEAR is calculated as follows: (Average Degree + Clustering) * “Small-worldness” / Average Shortest Path * Growth Rate.
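Applying the stated formula left to right, DEAR can be computed as in the following sketch; the run 1 figures from Table 14.1 are used as a check:

```python
def dear(growth_rate, avg_degree, avg_shortest_path, clustering, small_worldness):
    """Digital environment accuracy rating, as defined above:

    (avg degree + clustering) * small-worldness / avg shortest path * growth rate
    """
    return (avg_degree + clustering) * small_worldness / avg_shortest_path * growth_rate

# Run 1 of Table 14.1: growth 201.2%, degree 4.57, shortest path 4.98,
# clustering 5.64, small-worldness 3.205.
score = dear(201.2, 4.57, 4.98, 5.64, 3.205)
```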

Fifteen simulations were carried out per test and their results recorded, also allowing averages to be determined across all runs. Table 14.1 lists the results of each test run along with the overall averages. Further details of the effect of parameters on the test output are shown afterwards.

The findings based on the results are as follows:

The results show that as networks grow to a larger size, the average degree of the network also increases.

As networks grow larger, so does the average shortest path for the network. This is seen in the results.

The clustering density of nodes in a network increases as the network size increases, which holds true for the test runs.


The “small-worldness” of the networks is greater than 1 (and in most cases greater than 3). This was calculated using the smallworldness function from the qgraph R package (Epskamp et al., 2012), available at http://sachaepskamp.com/qgraph/index.html.

The benchmark growth rate for a five-year period was 149.02%, while the average obtained across the test runs was 170.94%. This is a comparable figure, showing that the internet user growth rate in the prototype system increased in a similar fashion to what occurred historically.

Table 14.1 DEEv-MoS prototype system test run results.

Run | Growth Rate (%) | Avg Degree | Avg Shortest Path | Clustering | Small-worldness | DEAR
1 | 201.2 | 4.57 | 4.98 | 5.64 | 3.205 | 1322.064
2 | 179.8 | 4.33 | 3.87 | 6.23 | 3.125 | 1533.178
3 | 162.3 | 3.82 | 3.69 | 4.89 | 3.862 | 1479.526
4 | 205.6 | 5.45 | 4.70 | 4.31 | 2.639 | 1126.716
5 | 158.1 | 3.97 | 3.24 | 4.36 | 2.606 | 1059.269
6 | 162.1 | 3.12 | 3.65 | 4.18 | 3.414 | 1106.819
7 | 171.9 | 3.81 | 3.42 | 3.98 | 2.794 | 1093.991
8 | 143.7 | 3.42 | 3.26 | 2.89 | 2.985 | 830.258
9 | 194.7 | 4.84 | 4.56 | 5.27 | 3.798 | 1639.484
10 | 163.6 | 3.85 | 3.65 | 4.26 | 3.450 | 1254.095
11 | 182.0 | 4.11 | 3.86 | 3.74 | 2.736 | 1012.674
12 | 153.6 | 3.71 | 3.92 | 4.02 | 3.655 | 1107.062
13 | 148.3 | 3.46 | 3.61 | 3.46 | 3.409 | 969.097
14 | 175.3 | 3.68 | 3.47 | 4.09 | 3.806 | 1493.971
15 | 161.9 | 3.24 | 3.49 | 3.82 | 2.906 | 951.747
Average | 170.94 | 3.96 | 3.82 | 4.34 | 3.226 | 1198.182


Overall, four important considerations need to be made with regard to the test run networks:

 The lower the average shortest path, the better: the network is better connected and easier to communicate across without too many hops.
 The higher the average degree, the better: higher-degree nodes make it easier for short paths to other nodes to be created, with hub nodes being important to bridge gaps.
 The higher the clustering density for nodes, the better: more closely clustered nodes lead to short physical and path distances where nodes are well connected to their neighbours.
 The higher the “small-worldness” for the network, the better: networks with “small-worldness” values greater than 1 are considered small-world, while networks with “small-worldness” values greater than 3 follow a stricter definition for small-world (Humphries & Gurney, 2008).

[Chart: Effect of Duration on DEAR — DEAR plotted against test run number (1–15) for simulation durations of 1, 2, 5, 10 and 15 years.]

Figure 14.1 Effect of simulation duration on digital environment accuracy rating.


[Chart: Effect of Agent Starting Population on DEAR — DEAR plotted against test run number (1–15) for starting populations of 5, 20, 30, 40 and 50 agents.]

Figure 14.2 Effect of initial population size on digital environment accuracy rating.

[Chart: Effect of Initial Server Number on DEAR — DEAR plotted against test run number (1–15) for 1, 3, 5, 10 and 15 initial servers.]

Figure 14.3 Effect of initial server number on digital environment accuracy rating.

The results of testing the effect of the three initialisation parameters (duration, initial agent population and initial server count) are illustrated in Figure 14.1, Figure 14.2 and Figure 14.3 respectively. From these results it is apparent that the duration of the simulation does not majorly affect the results, while the initial agent population and server count do. Initial agent populations of between 20 and 50 agents are comparable to each other, while lower populations negatively affect the DEAR. Similarly, initial server counts of between 3 and 15 servers are comparable to each other, while counts lower than 3 impact the DEAR negatively.

According to the above, the DEEv-MoS prototype test run networks meet the description of small world networks, where there are short average paths and high clustering of nearby nodes.

Figure 14.4 represents the distribution of the training data used for the predictive modelling engine. The predictive modelling engine was then able to achieve the prediction results in Figures 14.5, 14.6 and 14.7 over separate test runs, where it can be seen that its prediction of future values is fairly accurate.

Figure 14.4 The distribution of training data for the predictive modelling engine.


Figure 14.5 Predictive modelling engine future prediction comparison test 1.

Figure 14.6 Predictive modelling engine future prediction comparison test 2.


Figure 14.7 Predictive modelling engine future prediction comparison test 3.

The results obtained from the testing of the DEEv-MoS prototype support its intended purpose and indicate that it is a viable solution for predicting and simulating digital environments.

14.2 Conclusion

Chapter 14 detailed the testing results of the DEEP prototype system, which implements the DEEv-MoS model as described in Chapters 7 to 12.

The DEEP system was tested using a number of graph- and growth-related metrics to demonstrate the effectiveness of the system. Testing included the calculation of the resulting simulation networks’ “small-worldness” and introduced the digital environment accuracy rating, which combines the growth rate, average shortest path, average degree, clustering and “small-worldness” into a single metric to simplify understanding and comparing simulations.

The input parameters of DEEP were explored to understand their impact on the system as well as on the results from test simulations. The findings were used to determine the parameter settings for test simulations of DEEP.

The results of training and testing the predictive modelling engine were provided, demonstrating its ability to predict future values for the important metrics and constraints utilised in the DEEv-MoS model as described in Chapter 10.

The information presented in this chapter is of relevance to the following secondary research questions:


SRQ5: Can ML be used to predict and drive changes in a digital environment to accurately reflect its evolution?

SRQ6: How can existing AI algorithms be used to evolve entities in a digital environment in a manner analogous with how non-natural entities evolve?

The DEEP prototype makes use of ML through its predictive modelling engine, which predicts the future values of important metrics and constraints used in the system. The predictions impact constraints and drive events which shape the change that occurs in the simulated digital environment. Through the testing presented in this chapter, it was shown that the resultant digital environments from performing DEEP simulations conform closely to their equivalent real-world counterparts, thus addressing SRQ5.

Digital environments in the real world have been studied thoroughly, leading researchers to develop metrics to measure their attributes. The metrics used hold true throughout the evolution of the digital environments, making them useful tools for understanding the evolution of these non-natural entities. The average shortest path, average degree, clustering and “small-worldness” of networks are ideal indicators for measuring their conformity to the real world. In this chapter it was shown that the resultant digital environments from DEEP simulations adhere to the manner in which non-natural entities in the DE evolve, addressing SRQ6.

The following chapter will consider the thesis as a whole, summarising the thesis, evaluating the research conducted, detailing the contribution made to the research field, performing a critical evaluation of the DEEv-MoS model, discussing future work that can be done, and, lastly, concluding the thesis.


15 Critical evaluation and conclusion

This thesis has covered a number of related research fields within Computer Science, each addressing a particular aspect of the problem domain and primary research question. Through the synergy of these research fields, a novel model for predicting change in digital environments was created: the Digital Environment Evolution Modelling and Simulation model, or DEEv-MoS. The purpose of the model is to provide an accurate method of modelling and simulating the change of digital environments over time, making it possible to predict the evolution of the DE in the real world. This will further the understanding of the current DE, as well as what it could morph into in the future, allowing people to plan ahead and possibly avoid unwanted outcomes.

The research presented in this thesis was performed with the goal of addressing a primary research question, which was the driver behind the conceptualisation and development of the DEEv-MoS model. The primary research question is:

 Can an MAS, making use of Machine Learning and an Entity Component System design, effectively and accurately predict the evolution of a digital environment, so as to provide assistance in understanding what the future of a digital landscape will look like?

Before detailing the contribution of the thesis and performing a critical evaluation of it, a summary of the thesis will be provided.

15.1 Thesis summary

Part One of the thesis covered the introduction to the problem domain along with how research would be conducted in the thesis. Part One consists of Chapter 1, which is discussed below:

Chapter 1 introduced the thesis and provided a short assessment of the problem and the methodologies that would be used in conducting the research. The chapter provided a brief introduction to the concept of digital environments and presented the problem background in detail, highlighting the need to be able to predict future growth and change in the DE. Thereafter, the thesis objectives, methodologies used and research questions which are used throughout the thesis were defined.

Part Two was concerned with the literature review of relevant fields related to the problem and possible solution. This provided important information which was used to define the model later. Part Two consists of Chapters 2 to 6, which are discussed below:


Chapter 2 provided detailed coverage of the DE, defining what it encompasses and how its various parts drive its evolution. The changes in the DE are categorised into social changes and hardware and software changes, which drive its evolution in different ways. It is ultimately shown that the DE is an amalgamation of the physical, digital and cyber worlds, making it a complex entity to understand and predict.

Chapter 3 covered the topic of intelligent agents. The different agent types and their applications were highlighted, discussing the impact of the environment they are deployed in and the need for multi-agent systems. MASs were discussed with regard to the impact on agent decision making and goal achievement. Finally, agent evolution was discussed in the context of genetic algorithms and how they can be used to grow diverse agent populations.

Chapter 4 was focussed on the field of machine learning. The history and background of ML was discussed, detailing how it progressed to become the popular field of research it is today and how it is being used in business more frequently. Different types of ML approaches were discussed, including supervised learning, unsupervised learning and reinforcement learning. Deep learning was detailed with regard to its ability to improve learning of unknown domains and solve problems that previously required a large amount of human input.

The remainder of Chapter 4 discussed extreme learning machines, a specific neural network implementation that is popular for its fast learning rate. The difference between ELMs and more traditional FFNNs was highlighted, focussing on the layer/node architecture and the learning techniques employed. ELMs were shown to be effective NNs with great relevance in solving modern world problems.
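The core of the ELM idea can be shown in a few lines. The sketch below (a minimal illustration in NumPy, not the thesis implementation; the toy regression task is an assumption) fixes random hidden-layer weights and solves for the output weights in a single closed-form least-squares step, which is the source of the fast learning rate:

```python
# Minimal extreme learning machine sketch: random, untrained hidden layer;
# output weights obtained by one least-squares solve.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: learn y = sin(x) from noisy samples.
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.05, size=200)

n_hidden = 50
W = rng.normal(size=(1, n_hidden))   # random input-to-hidden weights (never trained)
b = rng.normal(size=n_hidden)        # random hidden biases

def hidden(X):
    return np.tanh(X @ W + b)        # hidden-layer activations

# "Training" is a single linear solve for the output weights beta.
H = hidden(X)
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

X_test = np.linspace(-3, 3, 100).reshape(-1, 1)
pred = hidden(X_test) @ beta
rmse = np.sqrt(np.mean((pred - np.sin(X_test[:, 0])) ** 2))
print(f"test RMSE: {rmse:.3f}")
```

By contrast, a traditional FFNN would iterate over the same data many times with backpropagation to adjust every layer's weights.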

Chapter 5 addressed programming paradigms. The difference between programming concepts, programming paradigms and programming languages was highlighted along with the relationships between these three core parts. Important programming concepts were discussed along with the impact of including them in a paradigm. Object-Oriented Programming and entity component systems were briefly defined and compared as important programming paradigms, and the value of ECS approaches over the use of OOP in particular scenarios was highlighted.

Chapter 6 took a more detailed look at the problem background presented in Chapter 1. The problem was broken up into three main areas of influence: the growing world, advancing technology and evolving interactions. These three areas more clearly defined the various moving parts involved in the DE and how they affect it as a whole, highlighting the importance of being able to understand its future evolution.

Part Three detailed the DE Evolution Modelling and Simulation model, or DEEv-MoS, considering it in the context of the DE, its constituent components and their function. Part Three consists of Chapters 7 to 12, which are discussed below:


Chapter 7 presented an overview of the DEEv-MoS model; this was the result of combining the findings and information presented in the preceding chapters. The functioning of the model along with its components was defined, briefly discussing the relationships and responsibilities of each component. The DEEv-MoS model was positioned with regard to the context of the DE and its many constituent parts.

Chapter 8 took an in-depth look at the DEEv-MoS digital environment component of the DEEv-MoS model. The difference between the DE and a digital environment was highlighted for the purpose of the DEEv-MoS model. The components of the DE that are modelled in the DEEv-MoS digital environment were defined, along with their properties and abilities, which were defined in terms of an ECS. Lastly, the DEEv-MoS digital environment was described in terms of task environments and a PEAS description.

Chapter 9 took an in-depth look at the DEEv-MoS entity component system component of the DEEv-MoS model. The DEEv-MoS model was broken down in regard to the three main parts of an ECS: entities, components and systems. Each entity in the DEEv-MoS digital environment was defined as one of the three parts, specifying its function and interactions with other entities.

Chapter 10 took an in-depth look at the DEEv-MoS machine learning engine component of the DEEv-MoS model. The main factors of the DEEv-MoS model that need to be predicted in order for accurate evolution of the digital environment to occur were defined. The importance and role of each factor was described, along with the entities in the DEEv-MoS digital environment that they affect. For each factor, the planned ML model, inputs and outputs were described in order to achieve future predictions.

Chapter 11 took an in-depth look at the DEEv-MoS constraints engine component of the DEEv-MoS model. The function of this component was described, detailing its interaction with the other components of the DEEv-MoS model and how constraints are used. The three main types of constraints (digital environment constraints, entity component system constraints and events engine constraints) were discussed, and the individual constraints of each type that are made use of in the DEEv-MoS model were defined.

Chapter 12 took an in-depth look at the DEEv-MoS events engine component of the DEEv-MoS model. Firstly, the interaction between the events engine and the constraints engine components of the DEEv-MoS model was detailed further. Thereafter, the three main event types in the DEEv-MoS model were described: population events, technological events and social events. The different events created by the events engine were described, including their intended purpose and how they affect the digital environment and other components of the model.


Part Four described the implementation of the DEEv-MoS model as a prototype in order to obtain results. A critical evaluation of the research and model was conducted. Part Four consists of Chapters 13, 14 and 15, which are discussed below:

Chapter 13 detailed the implementation of the DEEP prototype system. Considerations regarding the development platform were presented and the Unity game development platform was chosen for the implementation of the DEEv-MoS prototype. The implementation of each component of the DEEv-MoS model was described in detail, highlighting features from the chosen development platform which complemented each one. The various entities of the DEEv-MoS digital environment were presented in the context of the implementation using Unity and the chosen ECS programming approach.

Chapter 14 presented testing results of the DEEP prototype system. Input parameters were discussed along with the output metrics of the simulations, describing their relevance to measuring the evolution of a digital environment. Exploratory results of the parameter space were presented to determine ideal parameter choices for use in simulations. The results of the predictive modelling engine were presented, demonstrating its effectiveness in predicting future values in the given simulation digital environment.

The conclusion of this thesis is presented in Chapter 15, where results obtained from testing on the DEEv-MoS prototype system are provided along with a summary of the thesis, a critical evaluation of the research conducted and possible future works. Finally, a conclusion is presented, summarising the problem statement, research questions and the results obtained.

15.2 DEEv-MoS contribution to the research domain

The DE has been shown to be an extremely complex entity that exists at the intersection of the physical, digital and cyber worlds. An ever-increasing human population has created more actors capable of interacting with and growing the DE, each creating and consuming information and communicating with others on a daily basis. With technological advancements, ever-greater numbers of electronic devices are interconnected, possessing greater computational power, storage and connectivity capabilities, and thus growing the DE’s network continually. Mobile technology proliferation has allowed for people in undeveloped areas to be connected to the world in the same way as people in more developed regions, creating a truly global space for everyone to interact in.

The impact of these factors has led to the DE becoming an ever-growing, ever-changing behemoth of information, hardware and people that is beyond the comprehension of most people. Understanding how it currently functions, never mind what it will become in the future, is already a difficult task. A solution is required that allows people to simulate and predict what the future evolution of the DE will result in. The solution needs to be able to make use of existing technologies and concepts to accurately and efficiently predict future changes.


In order to effectively predict its evolution, numerous different entities involved in the DE need to be modelled. Some of the entities are non-natural, such as hardware and information, while others, such as people, are natural. This creates a need to be able to model both natural and non-natural phenomena. People think and act independently, interacting with numerous devices, data entities, infrastructure and other people on a daily basis. Devices and information itself experience multiple interactions a day with other devices. The DE is a dynamic environment made up of many different moving parts which are constantly interacting with one another, resulting in change through information dissemination and the need for network growth (from both a social and technological standpoint).

The DEEv-MoS model proposes a method by which this complex environment can be modelled through the modelling of individual entities, their properties and their interactions using an approach that was created for solving data-driven problems.

In order to effectively model non-natural entities, their properties and their evolution, an entity component system architecture can be used which allows for the definition of entities that are defined by the components they are made up of. This also allows for greater efficiency and maintainability of the solution, as it separates data (components and properties) and code (methods and behaviours), creating a more modular and robust application.

Machine learning algorithms can be applied to various problem spaces to help computers learn information they have not been intrinsically programmed with. By making use of ML algorithms, the rate of technological advancement and human population growth can be predicted, filling in the final piece of the puzzle.


Table 15.1 Summary of secondary research questions and their related chapters.

SRQ1: Chapter 2 (understanding of the DE); Chapter 6 (understanding of the DE); Chapter 8 (definition of the digital environment).

SRQ2: Chapter 3 (definition of multi-agent systems); Chapter 7 (DEEv-MoS model designed as an MAS); Chapter 8 (components of the DE in an MAS); Chapter 13 (DEEP prototype as an MAS).

SRQ3: Chapter 5 (definition and benefits of entity component systems); Chapter 8 (entity component systems applied to the DE); Chapter 9 (use of an entity component system in the DEEv-MoS model); Chapter 13 (DEEP prototype using an entity component system architecture).

SRQ4: Chapter 3 (intelligent agent definition and use; evolution of artificial entities); Chapter 4 (definition of machine learning); Chapter 10 (agents and the predictive modelling engine change the digital environment); Chapter 13 (DEEP prototype with an evolvable digital environment).

SRQ5: Chapter 4 (machine learning uses in prediction; extreme learning machines and their application); Chapter 10 (predictions in the DE); Chapter 11 (updating DE constraints); Chapter 12 (driving DE change through events); Chapter 13 (DEEP prototype with a predictive modelling engine); Chapter 14 (results of predictive evolution simulations).

SRQ6: Chapter 10 (predicting digital environment changes); Chapter 11 (updating digital environment constraints); Chapter 12 (affecting digital environment changes); Chapter 13 (DEEP prototype using the predictive modelling, constraints and events engines to drive change); Chapter 14 (comparison of test results to the real world).


In combining the interactions of intelligent agents with the efficiency and modular construction of ECS, and finally with the predictive capabilities of machine learning algorithms, a single system is formed that is capable of modelling an extremely complex environment in an effective manner.

Each field of research covered in the chapters of this thesis has provided insight into forming the greater picture of how the DE can be simulated and its evolution predicted. Together the information presented answers the primary research question of this thesis, with each chapter providing answers to the secondary research questions that were derived from the primary research question.

Next, the primary research question is reviewed along with the secondary research questions in the context of the information presented in this thesis as a whole.

15.2.1 Research question review

15.2.1.1 Primary research question

 Can an MAS, making use of Machine Learning and an Entity Component System design, effectively and accurately predict the evolution of a digital environment, so as to provide assistance in understanding what the future of a digital landscape will look like?

Chapters 7 to 12 defined the DEEv-MoS model, making use of various algorithms, designs, concepts and constructs from the research fields of Intelligent Agents, Multi-Agent Systems, Programming Paradigms and Machine Learning. Each component of the DEEv-MoS model was designed to address a particular aspect in providing a solution to the problem stated in Chapter 1. The components each drew inspiration from the research fields covered in the literature review, making use of key properties, behaviours and designs presented, with the intention of having them all function together to create a viable solution. From the conceptual model, a prototype system was implemented to test the hypothesis that the model could provide a solution to the problem. The prototype was used to obtain test results, which proved to be promising.

15.2.1.2 Secondary research questions

Table 15.1 summarises the secondary research questions along with the chapters relevant to each research question.

 What is the current understanding of what the Digital Environment is, and what does it encompass?

The DE is a large entity that is made up of numerous moving parts from differing ‘worlds’. It can be seen as being an amalgamation of hardware, software, data, infrastructure, people and social networks, as it encompasses the physical world and machines, the digital world of software and data, and the cyber world of information relationships and interactions. Traditionally, it would be considered to be the Internet. However, that proves to be too narrow a view, as the Internet is considered to consist of the information, computers and networks that facilitate world-wide communication; the DE, on the other hand, is composed of the Internet along with people, their relationships and their interactions.

 Can an MAS be used to simulate the various components of a digital environment?

Multi-Agent Systems represent environments in which numerous autonomous agents exist and operate, each acting to achieve its own goals. Chapter 3 outlined the structure and function of various types of intelligent agents, which are capable of using sensors to perceive their environment and actuators to act upon it. This can be seen to be analogous to humans in the DE, where their interactions and actions affect others in an attempt to fulfil their needs. Intelligent agents have been used before to simulate natural organisms, their behaviours and their interactions. Agents can therefore effectively be used in the setting of an MAS to model humans, allowing for independent action, large-scale interaction and effective reproduction.

 Can an entity component system design be used effectively to represent a digital environment and its constituent entities?

Chapter 5 considered programming paradigms and their relation to programming concepts. Object-Oriented Programming was defined as a popular paradigm that makes use of a number of in-demand concepts in problem solving. However, this approach suffers from problems caused by its use of inheritance. Applications can become extremely messy, bloated with code and riddled with inheritance dependencies, making them difficult to maintain. Each object is also responsible for the methods on its own data, meaning the same code is executed multiple times when numerous objects of the same class exist, creating inefficiency.

Entity component systems address this by separating data and code, allowing objects to be defined through composition, where an entity is given properties based on its constituent components. This makes it much easier to manage individual objects while also defining entity types that are reusable. The code that executes on the entities is contained in a single system which executes globally against entities that contain particular components, making it easy to maintain and more efficient, as only one source of code execution exists. The ECS approach is well suited for defining non-natural entities by merely defining or changing their components, thereby altering their properties.
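The separation described above can be sketched in a few lines. The toy below is an illustration of the ECS idea only (the prototype itself uses Unity's ECS; the component names and figures here are invented): entities are bare identifiers, components are pure data, and one system runs a single code path over every entity holding the required components.

```python
# Toy entity component system: entities are IDs, components are plain data,
# and a single system executes globally against qualifying entities.
from dataclasses import dataclass

@dataclass
class Storage:       # component: pure data, no behaviour
    capacity_gb: float
    used_gb: float = 0.0

@dataclass
class NetworkLink:   # component: pure data
    bandwidth_mbps: float

entities = {
    1: {"Storage": Storage(500), "NetworkLink": NetworkLink(100)},  # a server
    2: {"Storage": Storage(64)},                                    # an offline device
    3: {"NetworkLink": NetworkLink(1000)},                          # a router
}

def data_transfer_system(entities, seconds: float):
    """One system, one code path: runs against every entity that has BOTH
    a Storage and a NetworkLink component, ignoring all others."""
    for eid, comps in entities.items():
        if "Storage" in comps and "NetworkLink" in comps:
            s, n = comps["Storage"], comps["NetworkLink"]
            received_gb = n.bandwidth_mbps * seconds / 8000  # Mb -> GB
            s.used_gb = min(s.capacity_gb, s.used_gb + received_gb)

data_transfer_system(entities, seconds=60)
print(entities[1]["Storage"].used_gb)  # only entity 1 has both components
```

Adding a NetworkLink component to entity 2 would make it a new archetype that the system picks up automatically, with no class hierarchy to modify.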


 Can agents be designed making use of AI and ML principles so that, along with defined heuristics, they are able to mimic the evolution of components in a digital environment?

Intelligent agents can be designed to make use of performance measures, heuristics and goals that drive the actions they take in an environment. By altering the agents’ properties and actions through their performance measures or goals, evolution of the agent can be simulated. Machine learning algorithms can be used in conjunction with historical and generated data to predict future values for agents’ properties and those of other entities in the environment. By introducing the predicted values into the entities and agents, the effect of change can be created, ultimately mimicking evolution. Considering that non-natural entities are defined by their components, changing and advancing the components and their capabilities is akin to evolving the entities themselves, as over time their components are upgraded and replaced.

 Can ML, in particular ELM, be used to predict and drive changes in a digital environment to accurately reflect its evolution?

Historical data can be used with ML algorithms to learn what future values could look like. By making use of ML algorithms to predict the future values of key components and entities, changes can be made to the digital environment and its entities which would represent their change over time. In the real world, change is incremental and constant, and by introducing this into the digital environment through ML predictions and then affecting the changes through constraints and events (as discussed in Chapters 10 through 12), an accurate evolution of the digital environment can be achieved.
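A hedged sketch of this loop follows (the metric name, historical figures and linear model are all invented for illustration; the thesis's engine itself uses ELMs): a simple model is fitted to historical values of one metric, and the predicted next value becomes a constraint bound for the simulation step.

```python
# Illustrative prediction-to-constraint loop with hypothetical data.
import numpy as np

# Hypothetical historical data: connected devices (billions) per year.
years = np.array([2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019], dtype=float)
devices = np.array([8.7, 11.2, 14.4, 15.4, 17.7, 20.4, 23.1, 26.7])

# Least-squares linear trend (a stand-in for the ELM-based predictive
# modelling engine).
coeffs = np.polyfit(years, devices, deg=1)
predicted_2020 = float(np.polyval(coeffs, 2020))

# The prediction becomes a constraint: the simulated environment may not
# grow its device population past the predicted bound this step.
def apply_growth(current_devices: float, growth: float, bound: float) -> float:
    return min(current_devices + growth, bound)

next_step = apply_growth(26.7, 5.0, predicted_2020)
print(round(predicted_2020, 1), round(next_step, 1))
```

Repeating this each simulation step produces the incremental, constant change described above, with the ML prediction keeping growth anchored to the historical trend.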

 How can existing AI algorithms be used to evolve entities in a digital environment in a manner analogous with how non-natural entities evolve?

As covered in Chapters 9, 10, 11 and 12, with the combination of an ECS, ML algorithms, constraints and events, changes to large numbers of entities in the digital environment can be carried out. With natural entities the changes revolve around behaviour and interaction, while with non-natural entities changes take the form of property alteration. Increasing the computational power and storage capabilities of a computer can be seen as evolving the computer; with incremental changes to differing properties occurring over time, the computer can evolve to be capable of much more than it previously was. In the real world, this is driven by technological advancements, replacement of components and human requirements for performance from the machines. In the DEEv-MoS model, components are replaced through altering property values through events that occur within the bounds of constraints that are predicted by ML.
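The property-alteration mechanism can be sketched as follows (an illustration under stated assumptions, not the thesis implementation; the entity fields, the scaling factor and the ceiling value are invented): a technological event upgrades one property on every affected entity, clipped to a ceiling supplied by the predictive engine.

```python
# Illustrative event-driven evolution of non-natural entities: an event
# scales a component property, bounded by an ML-predicted constraint.
from dataclasses import dataclass

@dataclass
class Device:
    storage_gb: float
    cpu_ghz: float

def tech_event(devices, prop: str, factor: float, predicted_ceiling: float):
    """Population-wide upgrade event: scale one property on each device,
    never exceeding the ceiling the predictive engine supplied."""
    for d in devices:
        setattr(d, prop, min(getattr(d, prop) * factor, predicted_ceiling))

fleet = [Device(256, 2.4), Device(512, 3.0), Device(1024, 3.6)]
# Event: a storage technology advance (x1.5), ceiling predicted at 1200 GB.
tech_event(fleet, "storage_gb", factor=1.5, predicted_ceiling=1200.0)
print([d.storage_gb for d in fleet])  # the largest device hits the ceiling
```

Over many simulated steps, repeated events of this kind produce the incremental component replacement that constitutes evolution for non-natural entities.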


By addressing each of the secondary research questions, aspects of evolution and prediction of the DE were highlighted. In doing so, the evolution of a given digital environment can be predicted and measured through the growth of the environment’s components, data and interactions.

The next section will provide a critical evaluation of the thesis and the DEEv-MoS model.

15.3 Critical evaluation

In this thesis, research was conducted with the aim of addressing the problem of predicting and simulating the evolution of the DE. Due to its complex nature and the multiple facets of the world that it covers, this can prove to be a very difficult task. Numerous entities and their interactions need to be modelled, along with how they change and grow over time. To address this problem, the DEEv-MoS model was proposed.

The DEEv-MoS model comprises a number of components, each responsible for addressing particular aspects of the problem domain so as to create a solution that is as accurate and efficient as possible. By making use of intelligent agents, entity component system design and Machine Learning algorithms in a Multi-Agent Environment, the entities of the DE were successfully modelled in the DEEP prototype system, as described in Chapter 13. This prototype was tested and the results, presented in Chapter 14, proved to be positive.

15.3.1 Critique of the model

The DEEv-MoS model and its prototype were developed using a finite set of digital entities that were modelled from the DE. The entities and their properties were selected based on their perceived importance in the growth and change of the DE over time, as well as their involvement in interactions with other entities. The test results, though positive, only indicate that the DEEv-MoS model is effective at simulating the evolution of a small-scale digital environment with those entities. Further entities and factors would still need to be assessed in order to determine the model’s application on a much larger scale. Treating digital environments as subsets of the DE as a whole is a limitation of the proposed model, and only really holds for certain types of linearly composable systems. This ties into the view of the DE as a complex system exhibiting emergence, where ultimately the whole is more than the sum of its parts.

The generation and behaviour of agents in the model has a large impact on the effectiveness of the simulation. The currently used method of random agent generation within the environment’s constraints, though less resource-intensive than a more comprehensive approach, could lead to the loss of factors of influence in the overall evolution of the digital environment as a whole.

As humans are the key driving factor in the growth and change of the DE, a greater emphasis could be placed on modelling them, their relationships and their interactions more accurately. In the DEEv-MoS model they are one of many entities that enjoy a shared level of impact on the digital environment as a whole, rather than being the focal point.

Machine learning algorithms are capable of learning within a problem domain in order to perform predictions or identify objects or scenarios. The application of ML in the DEEv-MoS model is contained to the prediction of key digital environment factors’ values. However, it could be applied to create more intelligent, accurate agents and possibly to identify scenarios where particular events are likely to occur. This would lead to more realistic simulations of a digital environment’s evolution over longer periods of time. The use of an NN in the model also limits its ability to effectively predict dynamically changing environments or emergence in complex systems. Almost all supervised learning approaches suffer from this, though it can be alleviated to some degree with modifications, as discussed in section 8.1.

The DEEv-MoS model recognises that entities in the DE come in many forms and that even entities of the same type can differ greatly in functionality and characteristics. The use of an entity component system design and architecture for the running of the simulations provides the benefits of improved performance and enhanced prototype maintainability. Adjustments to entity archetypes and even system code can be carried out in a simple manner requiring changes in only a single location. By adding components to entities, advanced versions of the entities or even completely new entity archetypes can be created. The ECS design allows for flexibility and greater future configurability and creativity in entity creation (for example, a new entity type representing a technology paradigm shift). The ability of the prototype to create new entity types on the fly is an important one for long-term simulations that span decades, and is covered in the following section as part of future research.

The DEEv-MoS prototype allows for adjustments to be made to important initialisation properties of the simulation, including the starting time period, initial populations of agents and devices and the behaviour constraints on entities. By adjusting and optimising these properties, more accurate simulations can be run for specific time periods.

The DEEv-MoS model and its prototype are capable of effectively modelling a number of digital entities in a digital environment over a short span of time. In order for the model and prototype to perform better at a larger scale and over larger time periods, future research needs to be conducted into a number of areas.

In the recorded results, as with the development of the DEEv-MoS model, emphasis was placed on the measurement of evolution in relation to communication network and data volume growth. This reduces the dimensions modelled and evaluated in understanding the DE’s evolution, and would need to be expanded on in future to create a more complete and accurate model. The DEEv-MoS model has, however, illustrated that part of the evolution of the DE can be modelled effectively while focussing on two main dimensions.


The areas for future research are discussed in the following section.

15.3.2 Future research

As discussed in the previous two sections, the DEEv-MoS model is by no means perfect or fully comprehensive. In its current state it provides a strong base from which improvements can be made in various areas through future research. This is the case with the majority of research undertakings, and the specific areas for future research are discussed in the sections below.

15.3.2.1 ECS customisation

Entity component systems have been shown to be effective for the purposes of simulations in Chapters 5 and 9, with their data-driven approach to solution programming enjoying great success in complex environments, such as those in video games. ECS approaches allow for greater flexibility in object creation and alteration through the use of components and systems.

In the Unity game development platform described in Chapter 13, a large emphasis is placed on visualisation, which is necessary for games but also useful for simulations. Visualising a simulation allows for easier comprehension and a more intuitive way of watching it unfold. However, visualisation adds a large overhead which could be avoided in order to run larger simulations.

By creating a custom ECS platform, unneeded features can be discarded that would only impact performance, while considerations could be made specifically for the purpose of simulating the DE.

15.3.2.2 Agent behaviour

Human interaction with the DE in the real world is complex and dictated by the various wants and needs of the individual. Changes also occur at the level of society as a whole, altering the general interaction of people with each other and with technology. In order to create a truly accurate simulation of the DE’s evolution, the entity central to all the change and growth, the person, must be modelled with greater accuracy.

The dynamics of social relations and the need to communicate and be part of something drive people’s data behaviours, causing them to create, consume, collaborate or communicate at varying frequencies. By capturing this behaviour more effectively through complex goals, performance measures and heuristics, and allowing agents to be influenced by those in their social circles and immediate vicinity, agents could be created that truly mimic the behaviour of people in the real world.
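One possible direction can be sketched as follows (the four data behaviours come from the text above; the preference weighting and the social-influence rule are invented assumptions, not a proposal from the thesis): each agent holds preference weights over its data behaviours and gradually drifts toward the preferences of the agents in its social circle.

```python
# Sketch of socially influenced agent behaviour: agents pick data behaviours
# by weighted preference, and preferences drift toward the social circle's.
import random

ACTIONS = ["create", "consume", "collaborate", "communicate"]

class Agent:
    def __init__(self, rng: random.Random):
        w = [rng.random() for _ in ACTIONS]
        total = sum(w)
        self.prefs = [x / total for x in w]   # normalised action preferences

    def act(self, rng: random.Random) -> str:
        return rng.choices(ACTIONS, weights=self.prefs)[0]

    def socialise(self, circle: list["Agent"], influence: float = 0.1):
        """Nudge each preference toward the social circle's average."""
        for i in range(len(ACTIONS)):
            avg = sum(a.prefs[i] for a in circle) / len(circle)
            self.prefs[i] += influence * (avg - self.prefs[i])

rng = random.Random(42)
agents = [Agent(rng) for _ in range(10)]
for _ in range(50):                            # repeated influence rounds
    for a in agents:
        a.socialise([x for x in agents if x is not a])
spread = max(a.prefs[0] for a in agents) - min(a.prefs[0] for a in agents)
print(spread)  # preferences converge within the circle
```

Richer versions would restrict circles to network neighbours and let goals and performance measures, rather than a fixed influence rate, shape the drift.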

15.3.2.3 Predictive modelling

The use of ML to predict future values of simple factors in the DE allows the prototype system’s simulations to be guided in the right direction. However, it misses the nuances of predicting and identifying specific rare scenarios over longer periods of time.


The growth rates of various entities or their properties serve the purpose of driving the growth and change of the given digital environment; however, they fail to account for events that occur throughout history and change a specific field forever. These paradigm shifts can be equated to new types of devices, capabilities or technologies that did not exist before and which, when introduced, have a very large impact on the future of the world (for example, the Internet).

Supervised learning approaches will generally struggle with predicting emergence in a system such as the DE, as discussed in section 8.1. To better deal with the emergent properties of complex systems such as the DE, unsupervised learning approaches may perform better; however, the modelling of the environment would then need to be more complete, very strictly controlled, and compartmentalised.

15.3.2.4 Digital Environment entity modelling

In Chapters 7 through 12, the DEEv-MoS model was defined, discussing its various components, their functions and how they interact with one another. In the DEEv-MoS digital environment component, a number of entities of the DE are chosen to be modelled. The list of entities is far from exhaustive, but it covers the main entities involved in the interactions and changes in the DE.

Even though the chosen entities are key factors in the evolution of the DE, there are secondary entities and factors that could be considered in order to form a more holistic, and possibly more accurate, simulation. Chapter 2 described the complexity of the DE and indicated that it includes all entities from the physical, digital and cyber worlds, many of which were not considered for the model. Key entities and factors that could improve the model and prototype include but are not limited to:

 electricity,
 cost (of hardware and infrastructure),
 network switches,
 data transfer and communication protocols, and
 online communities.

The DEEv-MoS model ultimately focuses on modelling the evolution of the DE in terms of communication network and data volume growth, which is an incomplete view of the evolution of the DE in the real world. Further research could model and evaluate the influence of the other factors involved, including those listed above.

The introduction of the above could create an entirely new dynamic in the simulation and lead to different, and possibly better, results.


15.3.2.5 Testing and evaluation metrics

The metrics chosen for evaluation in Chapter 14 are a small set that does not measure exact parity between the simulation and the real world, but rather measures properties of the resultant simulated environment that should be of a similar form to those in the real world. By measuring the properties of the simulation network, a sufficient degree of similarity can be determined based on the nature of the real-world counterpart, rather than looking for exact matches across all aspects.

Greater comparability could, however, be achieved through the use of more metrics tied to known behaviours and resultant states in the DE. These could include network data size and distribution, agent community size, social circle size, and a number of others.
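As a sketch of how such a similarity-of-form metric could work, the following Python fragment (illustrative only; the degree sequences and function names are hypothetical, not part of the DEEP prototype) compares the degree distributions of a simulated and an observed network using the total variation distance:

```python
from collections import Counter

def degree_histogram(degrees):
    """Normalised histogram of node degrees: degree -> fraction of nodes."""
    counts = Counter(degrees)
    n = len(degrees)
    return {deg: c / n for deg, c in counts.items()}

def total_variation(hist_a, hist_b):
    """Total variation distance between two degree distributions.

    0.0 means the distributions are identical, 1.0 means they are disjoint;
    this measures similarity of form rather than exact parity.
    """
    support = set(hist_a) | set(hist_b)
    return 0.5 * sum(abs(hist_a.get(d, 0.0) - hist_b.get(d, 0.0)) for d in support)

# Hypothetical degree sequences for a simulated and a real-world network.
simulated = [1, 2, 2, 3, 3, 3, 4, 5]
observed = [1, 1, 2, 3, 3, 4, 4, 5]
distance = total_variation(degree_histogram(simulated), degree_histogram(observed))
```

A small distance indicates that the simulated network has a degree structure of a similar form to its real-world counterpart, even when the individual nodes and edges differ entirely.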

15.3.2.6 Agent reproduction

In the DEEv-MoS model, agents represent the people in the DE, who are the basis for its existence and drive the interactions and changes that affect it on technological, population and social levels. Agents are generated based on a population growth constraint determined by the ML engine; they are not created through sexual reproduction, as is the case in the real world, but rather within constraints set by the system.

By simulating sexual reproduction, agents could be created in a way that more closely resembles reproduction in the real world, with parents passing their genes to their offspring, thereby dictating trait inheritance as in human evolution. This may impact the simulated evolution of the DE through more accurate modelling of behaviours, their inheritance from parents, and the influence of the world local to the new agents.

15.3.2.7 ECS and ML performance and optimisation

For the development of the DEEv-MoS prototype system, an existing development platform, Unity, was used in order to accelerate development and focus on the core of the model and its functionality. Even though this game development platform is designed for ECS-based game designs, it provides many features, and much overhead, that are not necessary for the prototype. Through more efficient development practices with Unity, greater performance could be gained, allowing larger simulations to be run with more factors considered.

Another approach could be to create a bare-bones ECS platform designed purely for the DEEv-MoS model, optimising its performance and removing unneeded overhead. This would, however, be a very large undertaking on its own.

The predictive modelling engine makes use of well-known and widely used ML algorithms to predict simple values for key factors in the growth and change of the DE. The predicted factors may be too simplistic in nature to effectively model accurate evolution, where more complex multi-level factors would need to be considered. Along with factor selection, data availability for training purposes and algorithm selection can play a large role in affecting the outcome of the prototype system. A more in-depth and comprehensive coverage of ML algorithms, along with data and hyperparameters, could result in significant improvements in accuracy.

15.3.2.8 Scalability

At present the DEEv-MoS prototype system is only capable of simulating a digital environment consisting of a few thousand entities. To effectively predict and simulate the DE, billions of entities would need to be simulated, placing tremendous strain on the system.

Improvements to the scalability of the simulations are tied to two main factors: the hardware on which the simulation is run and the optimisation of the prototype itself. By making use of more powerful hardware and optimising the prototype through code improvements and the removal of live visualisation, much larger digital networks could be simulated and tested. Parallelisation of the tasks and calculations that occur in the DEEP prototype would also yield greater returns in simulation performance.
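As an illustration of the kind of task parallelisation suggested here, the sketch below (Python for brevity; the DEEP prototype itself is built on Unity, and `update_agent` and `parallel_tick` are hypothetical stand-ins, not names from the prototype) spreads per-agent updates across a pool of workers:

```python
from concurrent.futures import ThreadPoolExecutor

def update_agent(state: int) -> int:
    # Stand-in for one agent's per-tick computation (data creation,
    # communication, movement, etc. in the real simulation).
    return state + 1

def parallel_tick(states, workers=4):
    """Update all agents for one simulation tick using a pool of workers.

    Each agent's update is independent, so the population can be
    partitioned across workers and the results recombined in order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(update_agent, states))

new_states = parallel_tick(list(range(1000)))
```

For CPU-bound per-agent work in Python, a process pool (`concurrent.futures.ProcessPoolExecutor`) would be the usual substitute, at the cost of serialising agent state between processes.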

If these improvements can be introduced, the long-term accuracy of the model and its simulations could be dramatically increased.

15.4 Conclusion

In this thesis the problem of simulating and predicting the evolution of the digital environment was addressed through the creation of the DE Evolution Modelling and Simulation model, or DEEv-MoS for short. The DEEv-MoS model combines three well-defined and commonly used research areas to create an effective tool for modelling digital environment evolution over time: intelligent agents, entity component systems and machine learning.

The DEEv-MoS model takes into consideration key entities from the DE and models the interaction between them, their growth and their change, simulating how the DE evolves over time through the evolution of its constituents.

Even though the shortcomings and areas for improvement are acknowledged, the DEEP prototype has produced positive results that indicate that it is successful at accomplishing its intended purpose.


16 Appendix A: Evolution

In the natural world the concept of evolution is well defined and thoroughly explored, with scientists theorising on how evolution functions from as early as 1809, when Jean Lamarck, the great French naturalist, proposed a theory in which offspring inherit traits that their parents acquired through adaptation over their lifetimes. Later, Charles Darwin would develop his famous theory of evolution based on natural selection (1859), which is today known as Darwinian evolution4 (Shoham & Leyton-Brown, 2008; Millstein, 2019).

Originally, these scientists had no understanding of the biological structures that allow organisms' traits to be modified and inherited. They were, however, on the right track, and years later, in 1953, Watson and Crick discovered the structure of the DNA molecule and the alphabet it consists of: adenine, guanine, thymine and cytosine. Even though the particulars of how DNA mutation and inheritance work are extremely complex and involve probabilistic rules, the high-level process and its constituent steps were clearly understood: initialisation, selection, crossover, mutation and replacement (Wooldridge, 2009; Millstein, 2019).

These fundamental steps describe the process of evolution as we understand it today, determining what needs to occur for a pool of parent entities to generate offspring that, over a period of time, are better suited to a particular environment or task. Over many generations of evolution, the general population can come to possess highly valued traits that were once uncommon, through the process of natural selection. Natural selection is the process by which genes of value for surviving in an environment propagate to later generations due to individuals with less valued or useless genes dying off, thus creating a pool of individuals that, through sexual reproduction, can pass on the valued genes and also develop new traits that may be of further benefit (Yan et al., 2016).

With the dawn of modern computing, experts began investigating ways to apply computational power to existing theoretical and scientific concepts. This occurred across scientific fields such as engineering, mathematics, physics and more. Through this process the field of evolutionary computation arose, defining computational methods that attempt to mimic the biological evolutionary process for the purpose of solving problems (Holland, 1975; Wooldridge, 2009). It quickly became apparent that these methods had great application in optimisation, search and learning problems and, if implemented correctly, could also be used for the simulation and prediction of outcomes.

When considering the field of evolutionary computation and evolutionary algorithms, the most prominent algorithm type that is used is the genetic algorithm (GA) (Holland, 1984).

4 Recent advances in epigenetics have shown that both Darwinian and Lamarckian evolution are valid to some extent. It was discovered that an offspring's methylation states, which govern how genes are expressed, can be directly inherited from its parents (Skinner, 2016).


GAs model the evolution of a population of individuals based on sexual reproduction, where genes from both parents are used in creating their offspring’s genes (Goodman, 2012). The following subsection takes a more in-depth look at genetic algorithms, how they function, how they can be represented and what applications they have.

16.1.1 Genetic algorithms

Genetic algorithms are biologically inspired methods that were created to help solve an array of problems in the evolutionary computation space. Originally used by evolutionary biologists to develop better models of natural evolution, they were later adopted by computer scientists and engineers for function optimisation and search problems, as GAs have strong applications across large problem domains, including learning and adaptation (Shoham & Leyton-Brown, 2008; Hassan et al., 2019).

GAs encode the problem space into some form of fixed-length string representation, the earliest and most common being binary strings. This allows for simple reproductive transformations through the recombination of parent genes in some form of crossover (Wooldridge, 2009; Çınaroğlu & Bodur, 2018). The high-level process of GAs follows that of evolutionary algorithms in general: generating an initial population, selecting parents that will reproduce, performing some form of recombination and mutation to create offspring, assessing the fitness of each individual in the population, and killing off individuals with low fitness (Goodman, 2012).
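As an illustration of this string encoding, the sketch below (illustrative Python; the function names are not taken from any specific GA library) applies one-point crossover to two fixed-length binary strings and bit-flip mutation to the result:

```python
import random

def one_point_crossover(parent_a: str, parent_b: str, rng: random.Random):
    """Recombine two equal-length binary strings at a random cut point."""
    assert len(parent_a) == len(parent_b)
    cut = rng.randrange(1, len(parent_a))  # cut point strictly inside the string
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

def bit_flip_mutation(genome: str, rate: float, rng: random.Random) -> str:
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in genome
    )

rng = random.Random(42)
c1, c2 = one_point_crossover("11111111", "00000000", rng)
mutant = bit_flip_mutation(c1, rate=0.1, rng=rng)
```

Each child inherits a prefix from one parent and a suffix from the other, and the low mutation rate introduces the small random changes that keep the population diverse.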

The 'fitness' of any given individual in a population refers to the quality of that individual in the given solution space, and can be referred to as the 'objective fitness' of the individual if the measure of quality is an objective one (Shoham & Leyton-Brown, 2008; Chen, Zhang, Liu & Pan, 2018). By specifying and measuring fitness, it becomes easier to pick parents that may lead to better solutions and, ultimately, to eliminate individuals of poor quality.

In GAs the initial population needs to be represented as fixed-length strings (normally binary) and can be randomly generated to cover the solution space in a uniform probability distribution. Thereafter, the fitness of individuals is assessed and the fittest individuals reproduce more than less fit individuals, i.e. the level of reproduction increases proportionally with the fitness of the individual (Shoham & Leyton-Brown, 2008; Hassan et al., 2019). When parents are selected to reproduce, a recombination of their string representations is used to generate a number of offspring (which can be equal to or more than the number of parents), which then undergo randomised mutation to introduce small changes into the population. Once new offspring are generated, the fitness of all individuals is assessed and the least fit individuals are terminated, so as to keep the population size in check and allow later generations to converge on a higher fitness level (Goodman, 2012). Figure 16.1 illustrates the general iterative GA process.


[Figure: flowchart of the iterative GA process: start; initial population generation; fitness evaluation; selection process (natural selection); reproduction via crossover and mutation; replacement of new individuals into the population; termination check; end.]

Figure 16.1 The iterative genetic algorithm process (Chen, Zhang, Liu & Pan, 2018).
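The iterative process shown in Figure 16.1 can be sketched as a short program. The following is a minimal illustrative GA in Python for the toy 'OneMax' objective (maximising the number of 1-bits in a binary genome); all parameter values are chosen arbitrarily for the example and are not drawn from any cited work:

```python
import random

rng = random.Random(0)
GENOME_LEN, POP_SIZE, GENERATIONS, MUTATION_RATE = 20, 30, 40, 0.02

def fitness(genome):
    # Toy objective ("OneMax"): count of 1-bits, standing in for any
    # problem-specific objective fitness measure.
    return genome.count("1")

def select_parent(population):
    # Fitness-proportionate (roulette-wheel) selection: fitter individuals
    # are chosen to reproduce more often.
    total = sum(fitness(g) for g in population)
    pick = rng.uniform(0, total)
    running = 0.0
    for genome in population:
        running += fitness(genome)
        if running >= pick:
            return genome
    return population[-1]

def reproduce(a, b):
    cut = rng.randrange(1, GENOME_LEN)  # one-point crossover
    child = a[:cut] + b[cut:]
    return "".join(  # bit-flip mutation
        ("1" if bit == "0" else "0") if rng.random() < MUTATION_RATE else bit
        for bit in child
    )

# Initialisation: random genomes cover the solution space uniformly.
population = ["".join(rng.choice("01") for _ in range(GENOME_LEN))
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    offspring = [reproduce(select_parent(population), select_parent(population))
                 for _ in range(POP_SIZE)]
    # Replacement: keep only the POP_SIZE fittest of parents plus offspring.
    population = sorted(population + offspring, key=fitness, reverse=True)[:POP_SIZE]

best = max(population, key=fitness)
```

Because replacement keeps the fittest individuals of each generation, the population's best fitness never decreases, and over the generations it converges towards the all-ones optimum.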

The exact details of the initial population size, the number of parents chosen for reproduction, the number of offspring produced each generation, and how parents' genes are selected and offspring are mutated all depend on the problem being addressed (Wooldridge, 2009; Chen, Zhang, Liu & Pan, 2018). In considering the problem, the solution space can be defined, along with the fitness function used to determine individual fitness and the method of parent gene recombination (Goodman, 2012).

Simple GAs became powerful tools for search and learning problems. However, they suffered when used for optimisation problems due to their difficulty in converging on an optimum. GAs were able to quickly locate the area in which a global optimum existed but were slow to actually converge, which led to a number of new solutions being developed to address this, including alternative fitness selection mechanisms, dynamically scaled fitness functions, and specialised representations (Shoham & Leyton-Brown, 2008; Hassan et al., 2019).

GAs provide the ability to tackle multiple problem types across a wide domain of problems, making use of the natural evolutionary process to find solutions to problems or to generationally evolve a population for some form of learning. This is an important and powerful tool in the modern computationally driven world, where multiple fields make use of computation to optimise and test models and theories.


References

Abelson, H. & Sussman, G.J. (1985). Structure and Interpretation of Computer Programs. MIT Press.

Abhigna, P., Jerritta, S., Srinivasan, R. & Rajendran, V. (2017). Analysis of feed forward and recurrent neural networks in predicting the significant wave height at the moored buoys in Bay of Bengal. 2017 International Conference on Communication and Signal Processing (ICCSP), Chennai, 2017, pp. 1856-1860. Accessed 18 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8286717&isnumber=8286342

Abu-Taieh, E. (2019). Introductory Chapter: Simulation and Modeling, Simulation Modelling Practice and Theory. IntechOpen. Accessed 2 June 2020 from: https://www.intechopen.com/books/simulation-modelling-practice-and-theory/introductory-chapter-simulation-and-modeling

Aggarwal, C.C. (2018). Neural Networks and Deep Learning. Accessed 27 October 2019 from: https://rd.springer.com/book/10.1007/978-3-319-94463-0

Albrecht, S.V. & Stone, P. (2018). Autonomous agents modelling other agents: A comprehensive survey and open problems. Elsevier, Artificial Intelligence. Accessed 13 January 2020 from: https://www.sciencedirect.com/science/article/abs/pii/S0004370218300249?via%3Dihub

Allardice, R. (2019). Network Effects and Metcalfe’s Law. Available from: https://medium.com/@projectubu/network-effects-and-metcalfes-law-b4a4e8ff5767

Alvarado, A.C., Santacruz, E.V. & Zúñiga, M.G. (2017). Construction of a basic intelligent agent. 2017 Intelligent Systems Conference (IntelliSys), London, 2017, pp. 341-345. Accessed 12 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8324315&isnumber=8324208

Aráuz, J. (2018). Smart Cities and the Dire Need for a Course Correction. 2018 IEEE International Smart Cities Conference (ISC2), Kansas City, MO, USA, 2018, pp. 1-6. Accessed 15 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8656829&isnumber=8656657

Arzhakov, A.V. (2018). Usage of game theory in the internet wide scan. 2018 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus), Moscow, 2018, pp. 5-8. Accessed 13 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8316856&isnumber=8316851

Beckert, B. (2005). Introduction to Artificial Intelligence: Intelligent Agents. Universität Koblenz-Landau. Accessed 12 March 2019 from: https://formal.iti.kit.edu/~beckert/teaching/Einfuehrung-KI-SS2003/folien02.pdf


Benny, D. & Soumya, K.R. (2015). Feed-forward neural network processing speed analysis and an experimental evaluation of Neural Network Frameworks. 2015 IEEE 9th International Conference on Intelligent Systems and Control (ISCO), Coimbatore, 2015, pp. 1-5. Accessed 19 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7282337&isnumber=7282219

Bergman, M.M. (2008). Advances in mixed methods research: Theories and applications. Sage Publications.

Bilas, S. (2002). A Data-Driven Game Object System. GDC 2. Slides. Accessed 10 January 2018, from: http://gamedevs.org/uploads/data-driven-game-object-system.pdf

Black, W. (2019). What was the Black Death. Live Science. Accessed 10 January 2020 from: https://www.livescience.com/what-was-the-black-death.html

Boursinos, D. & Koutsoukos, X. (2020). Assurance Monitoring of Cyber-Physical Systems with Machine Learning Components. Cornell University. Accessed on 2 June 2020 from: https://arxiv.org/abs/2001.05014

Brazier, F.M.T., Jonker, C.M., Treur, J. & Wijngaards, N.J.E. (1999). Deliberate evolution in multi-agent systems (extended abstract). In Proceedings of the third annual conference on Autonomous Agents (AGENTS '99), Oren Etzioni, Jörg P. Müller, and Jeffrey M. Bradshaw (Eds.). ACM, New York, NY, USA, 356-357. Accessed on 10 June 2019 from: https://0-dl-acm-org.ujlink.uj.ac.za/citation.cfm?id=301232

Bringmann, B., Berlingerio, M., Bonchi, F. & Gionis, A. (2010). Learning and Predicting the Evolution of Social Networks. DOI: 10.1109/MIS.2010.91. Accessed 7 October 2018, from: https://ieeexplore.ieee.org/document/5552587

Briscoe, B., Odlyzko, A. & Tilly, B. (2006). Metcalfe's Law is Wrong. IEEE Spectrum. Accessed 12 December 2018 from: https://spectrum.ieee.org/computing/networks/metcalfes-law-is-wrong

Brownlee, J. (2019). What is Deep Learning. Machine Learning Mastery. Accessed 10 January 2019 from: https://machinelearningmastery.com/what-is-deep-learning/

Bruckman, A., Erickson, T., Kellogg, W., Sproull, L. & Wellman, B. (1999). Research issues in the design of online communities. In CHI '99 Extended Abstracts on Human Factors in Computing Systems (CHI EA '99). ACM, New York, NY, USA, 166-166. Accessed 24 February 2019 from: https://0-doi-org.ujlink.uj.ac.za/10.1145/632716.632817

Bugayenko, Y. (2017). Elegant Objects. Accessed 11 January 2020 from: https://www.yegor256.com/elegant-objects.html

Buttfield-Addison, P., Geldard, M. & Nugent, T. (2019). Entity Component Systems and You: They’re not just for game developers. O’Reilly Software Architecture Conference, 2019. Accessed 8 January 2020 from: https://conferences.oreilly.com/software-architecture/sa-ny-2019/public/schedule/detail/71964


Caini, M. (2019). ECS Back and Forth. Accessed 9 January 2020 from: https://skypjack.github.io/2019-02-14-ecs-baf-part-1/

Case, R. (2019). Machine Learning, Part 4: Reinforcement Learning. Towards Data Science. Accessed 12 January 2020 from: https://towardsdatascience.com/machine-learning-part-4-reinforcement-learning-43070cbd83ab

Cerf, V., Dalal, Y. & Sunshine, C. (1974). Specification of Internet Transmission Control Program. Accessed 12 January 2020 from: https://tools.ietf.org/html/rfc675

Chaffey, D. (2018). Mobile Marketing Statistics compilation. Accessed 6 October 2018, from: https://www.smartinsights.com/mobile-marketing/mobile-marketing-analytics/mobile-marketing-statistics/

Chayko, M. (2017). Superconnected: The Internet, Digital Media, and Techno-Social Life. Sage Publications.

Chen, C., Du, Y., Chen, S. & Wang, P. (2018). Partitioning and Placing Virtual Machine Clusters on Cloud Environment. 2018 1st International Cognitive Cities Conference (IC3), Okinawa, 2018, pp. 268-270. Accessed 15 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8567224&isnumber=8567146

Chen, J., Zhang, D., Liu, D. & Pan, Z. (2018). A Network Selection Algorithm Based on Improved Genetic Algorithm. 2018 IEEE 18th International Conference on Communication Technology (ICCT), Chongqing, 2018, pp. 209-214. Accessed 13 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8600265&isnumber=8599878

Cherry, S. (2004). Edholm’s Law of Bandwidth. Accessed 17 December 2019 from: https://spectrum.ieee.org/telecom/wireless/edholms-law-of-bandwidth

Çınaroğlu, S. & Bodur, S. (2018). A new hybrid approach based on genetic algorithm for minimum vertex cover. 2018 Innovations in Intelligent Systems and Applications (INISTA), Thessaloniki, 2018, pp. 1-5. Accessed 13 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8466307&isnumber=8466260

Cui, Y., Médard, M., Yeh, E., Leith, D., Lai, F. & Duffy, K.R. (2017). A Linear Network Code Construction for General Integer Connections Based on the Constraint Satisfaction Problem. IEEE/ACM Transactions on Networking, vol. 25, no. 6, pp. 3441-3454, Dec. 2017. Accessed 13 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8061034&isnumber=8214923

Creswell, J.W. (2014). Research Design: Qualitative, Quantitative, and Mixed Methods Approaches (4th ed.). Sage Publications.

Council of Europe. (2016). The digital environment. Accessed 2 July 2018, from: https://www.coe.int/en/web/children/the-digital-environment


DailyHistory.org (2018). How did universities develop? Accessed 14 December 2019 from: https://dailyhistory.org/How_did_universities_develop%3F

Deziel, C. (2018). How to Calculate Delta Between Two Numbers. Accessed 13 January 2020 from: https://sciencing.com/calculate-midpoint-between-two-numbers-2807.html

Dickheiser, M. (2006). Game Programming Gems 6. Charles River Media, Inc. Accessed 12 November 2019 from: https://dl.acm.org/doi/book/10.5555/1121740

Di Sciascio, C., Sabol, V. & Veas, E. (2017). Supporting Exploratory Search with a Visual User-Driven Approach. ACM Trans. Interact. Intell. Syst. 7, 4, Article 18 (December 2017), 35 pages. Accessed 16 January 2019 from: https://0-doi-org.ujlink.uj.ac.za/10.1145/3009976

Donzellini, G., Oneto, L., Ponta, D. & Anguita, D. (2019). Introduction to Digital Systems Design. Springer.

Elhishi, S., Abu-Elkheir, M. & Elfetouh, A.A. (2019) Perspectives on the evolution of online communities, Behaviour & Information Technology, 38:6, 592-608. Accessed 24 February 2019 from: https://doi.org/10.1080/0144929X.2018.1546901

Elshaikh, E.M. (2016). Social, political, and environmental characteristics of early civilizations. Accessed 13 January 2019 from: https://www.khanacademy.org/humanities/world-history/world-history-beginnings/birth-agriculture-neolithic-revolution/a/why-did-human-societies-get-more-complex

El-Taweel, N.A. & Farag, H.E.Z. (2018). Voltage Regulation in Islanded Microgrids Using Distributed Constraint Satisfaction. IEEE Transactions on Smart Grid, vol. 9, no. 3, pp. 1613-1625, May 2018. Accessed 12 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7524681&isnumber=8341523

Epskamp, S., Cramer, A.O.J., Waldorp, L.J, Schmittmann, V.D. & Borsboom, D. (2012). qgraph: Network Visualizations of Relationships in Psychometric Data. Accessed 21 January 2020 from: https://www.jstatsoft.org/article/view/v048i04

Ertuğrul, O.F., Tekin, R. & Kaya, Y. (2017). Randomized feed-forward artificial neural networks in estimating short-term power load of a small house: A case study. 2017 International Artificial Intelligence and Data Processing Symposium (IDAP), Malatya, 2017, pp. 1-5. Accessed 17 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8090344&isnumber=8090153

Foote, K. (2019). A Brief History of Machine Learning. Dataversity. Accessed 20 January 2020 from: https://www.dataversity.net/a-brief-history-of-machine-learning/

Frömming, U., Köhn, S., Fox, S., et al. (2017). Digital Environments and the Future of Ethnography: An Introduction. Digital Environments (pp. 13-22). Bielefeld: transcript Verlag. Accessed 24 July 2018 from: https://www.degruyter.com/view/books/9783839434970/9783839434970-002/9783839434970-002.xml


Gabrielli, M. & Martini, S. (2010). Programming Languages: Principles and Paradigms. Springer.

Garro, A., Mühlhäuser, M., Tundis, A., Mariani, S., Omicini, A. & Vizzari, G. (2018). Intelligent Agents and Environment. Reference Module in Life Sciences. Accessed 12 January 2020 from: https://www.researchgate.net/publication/323988067_Intelligent_Agents_and_Environment

Goodfellow, I., Bengio, Y. & Courville, A. (2016). Deep Learning. MIT Press. Accessed 15 June 2019, from: http://www.deeplearningbook.org/

Goodman, E.D. (2012). Introduction to genetic algorithms. In Proceedings of the 14th annual conference companion on Genetic and evolutionary computation (GECCO '12), Terence Soule (Ed.). ACM, New York, NY, USA, 671-692. Accessed 12 July 2019, from: https://0-dl-acm-org.ujlink.uj.ac.za/citation.cfm?id=2330911

Gordon, R. (2003). Convergence Defined. USC Annenberg Online Journalism Review. Accessed 24 January 2019 from: http://www.ojr.org/ojr/business/1068686368.php

Guo, X., Dutta, R.G. & Jin, Y. (2017). Eliminating the Hardware-Software Boundary: A Proof-Carrying Approach for Trust Evaluation on Computer Systems. IEEE Transactions on Information Forensics and Security, vol. 12, no. 2, pp. 405-417, Feb. 2017. Accessed 15 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7723866&isnumber=7750660

Gupta, T. (2017). Deep Learning: Feedforward Neural Network. Towards Data Science. Accessed 16 January 2020 from: https://towardsdatascience.com/deep-learning-feedforward-neural-network-26a6705dbdc7

Harkonen, T. (2019). Advantages and Implementation of Entity-Component-Systems. Bachelor of Science Thesis, Tampere University. April 2019. Accessed 10 January 2020 from: https://trepo.tuni.fi/bitstream/handle/123456789/27593/H%C3%A4rk%C3%B6nen.pdf?sequence=4&isAllowed=y

Haubelt, C., Teich, J., Richter, K. & Ernst, R. (2002). System design for flexibility, in Proc. Conf. Des. Autom. Test Eur., Washington, DC, 2002, pp. 854–861.

Halinka, A., Rzepka, P. & Szablicki, M. (2015). Agent model of multi-agent system for area power system protection. 2015 Modern Electric Power Systems (MEPS), Wroclaw, 2015, pp. 1-4. Accessed 12 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7477185&isnumber=7477147

Hassan, M.u., Ali, S. & Mahmood, K. (2019). Genetic Algorithm VS Simulated Evolution: A Comparative Study of Evolutionary Optimization Techniques for Object Recognition. 2019 International Conference on Computer and Information Sciences (ICCIS), Sakaka, Saudi Arabia, 2019, pp. 1-4. Accessed 13 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8716445&isnumber=8716377

Hecht, J. (2016). Is Keck’s Law Coming to an End? Accessed 15 December 2019 from: https://spectrum.ieee.org/semiconductors/optoelectronics/is-kecks-law-coming-to-an-end

Heller, M. (2020). Deep learning vs. machine learning: Understand the differences. InfoWorld. Accessed 20 January 2020 from: https://www.infoworld.com/article/3512245/deep-learning-vs-machine-learning-understand-the-differences.html

Hevner, A.R., March, S.T., Park, J. & Ram, S. (2004). Design Science in Information Systems Research. Management Information Systems Quarterly, 28(1), 75-105. Accessed 12 May 2020 from: https://www.researchgate.net/publication/201168946_Design_Science_in_Information_Systems_Research

Hintze, D., Hintze, P., Findling, R.D. & Mayrhofer, R. (2017). A Large-Scale, Long-Term Analysis of Mobile Device Usage Characteristics. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1, 2, Article 13 (June 2017), 21 pages. Accessed 8 February 2019: https://0-doi-org.ujlink.uj.ac.za/10.1145/3090078

Holland, J.H. (1975). Adaptation in Natural and Artificial Systems. MIT Press. Accessed 13 January 2020 from: https://mitpress.mit.edu/books/adaptation-natural-and-artificial-systems

Holland, J.H. (1984). Genetic Algorithms and Adaptation. In: Selfridge O.G., Rissland E.L., Arbib M.A. (eds) Adaptive Control of Ill-Defined Systems. NATO Conference Series (II Systems Science), vol 16. Springer, Boston, MA. Accessed 13 January 2020 from: https://doi.org/10.1007/978-1-4684-8941-5_21

Horvath, I. & Gerritsen, B.H.M. (2012). Cyber-Physical Systems: Concepts, Technologies and Implementation Principles. TMCE 2012. Accessed on 2 June 2020 from: https://www.researchgate.net/publication/229441298_CYBER-PHYSICAL_SYSTEMS_CONCEPTS_TECHNOLOGIES_AND_IMPLEMENTATION_PRINCIPLES

Huang, G.B., Zhu, Q.U. & Siew, C.K. (2006). Extreme learning machine: Theory and applications. Accessed 21 June 2018, from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.217.3692

Huang, G.B. & Chen, L. (2008). Enhanced random search based incremental extreme learning machine. Accessed 20 January 2020 from: https://www.ntu.edu.sg/home/egbhuang/pdf/EI-ELM.pdf

Humphries, M.D. & Gurney, K. (2008). Network ‘Small-World-Ness’: A Quantitative Method for Determining Canonical Network Equivalence. Accessed 18 January 2020 from: https://journals.plos.org/plosone/article/authors?id=10.1371/journal.pone.0002051

Ishida, T. (2017). Digital City, Smart City and Beyond. In Proceedings of the 26th International Conference on World Wide Web Companion (WWW '17 Companion). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1151-1152. Accessed 6 February 2019 from: https://0-doi-org.ujlink.uj.ac.za/10.1145/3041021.3054710

Jain, P., Sharma, A. & Ahuja, L. (2018). The Impact of Agile Software Development Process on the Quality of Software Product. 2018 7th International Conference on Reliability, Infocom Technologies and Optimization (Trends and Future Directions) (ICRITO), Noida, India, 2018, pp. 812-815. Accessed 15 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8748529&isnumber=8748266

Jain, S., Mohan, G. & Sinha, A. (2017). Network diffusion for information propagation in online social communities. 2017 Tenth International Conference on Contemporary Computing (IC3), Noida, 2017, pp. 1-3. Accessed 15 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8284358&isnumber=8284279

Jiaramaneepinit, B. & Nuthong, C. (2018). Extended Extreme Learning Machine: A Novel Framework for Neural Network. 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Miyazaki, Japan, 2018, pp. 1629-1634. Accessed 20 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8616278&isnumber=8615655

Johnson, C. (2006). What are Emergent Properties and How Do They Affect the Engineering of Complex Systems? Reliability Engineering & System Safety, 91(12). Accessed 30 May 2020 from: https://www.researchgate.net/publication/228357823_What_are_Emergent_Properties_and_How_Do_They_Affect_the_Engineering_of_Complex_Systems

Juliani, A. (2017). Introducing: Unity Machine Learning Agents Toolkit. Unity 3D. Accessed 16 January 2020 from: https://blogs.unity3d.com/2017/09/19/introducing-unity-machine- learning-agents/?_ga=2.245254405.1888743624.1580168489-1232502507.1564838018

Khan, N.N. (2018). History and Evolution of Technology. Accessed 20 May 2019 from: https://nation.com.pk/23-Jul-2018/history-and-evolution-of-technolo- gy#:~:targetText=The%20history%20of%20technology%20is,advancement%20and%20chang es%20around%20us.

Khosrow-Pour, M. (2017). Encyclopedia of Information Science and Technology, Fourth Edition. Accessed January 26 2019 from: https://www.igi-global.com/book/encyclopedia- information-science-technology-fourth/173015

Klein, S.T. (2016). Basic Concepts in Data Structures. Cambridge University Press (2016).

Krishnamurthi, S. & Fisler, K. Programming Paradigms and Beyond. Accessed 20 January 2020 from: https://cs.brown.edu/~sk/Publications/Papers/Published/kf-prog-paradigms-and-beyond/paper.pdf

Kuhrmann, M., Diebold, P., Munch, J., Tell, P., Trektere, K., McCaffery, F., Garousi, V., Felderer, M., Linssen, O., Hanser, E. & Prause, C.R. (2019). Hybrid Software Development Approaches in Practice: A European Perspective. IEEE Software, vol. 36, no. 4, pp. 20-31, July-Aug. 2019. Accessed 14 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8254323&isnumber=8738080

Kumar, U., Kim, J. & Helmy, A. (2013). Comparing wireless network usage: laptop vs smart-phones. In Proceedings of the 19th annual international conference on Mobile computing & networking (MobiCom '13). ACM, New York, NY, USA, 243-246. Accessed 8 February 2019 from: http://0-dx.doi.org.ujlink.uj.ac.za/10.1145/2500423.2504586

Kumar, A.H. & Suresh, Y. (2016). Multilayer feed forward neural network to predict the speed of wind. 2016 International Conference on Computation System and Information Technology for Sustainable Solutions (CSITSS), Bangalore, 2016, pp. 285-290. Accessed 17 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7779372&isnumber=7779360

Kumar, N. (2019). Deep Learning: Feedforward Neural Networks Explained. Hackernoon.com. Accessed 18 January 2020 from: https://hackernoon.com/deep-learning-feedforward-neural-networks-explained-c34ae3f084f1

Kumara, V. (2019). A quick primer on feedforward neural networks. Builtin.com. Accessed 18 January 2020 from: https://builtin.com/data-science/feedforward-neural-network-intro

Laifa, M., Akrouf, S. & Maamri, R. (2015). An Overview of Forgiveness in The Digital Environment. In Proceedings of the International Conference on Intelligent Information Processing, Security and Advanced Communication (IPAC '15), ACM, New York, NY, USA, Article 38, 5 pages. DOI: 10.1145/2816839.2816891. Accessed 10 October 2018 from: https://0-dl-acm-org.ujlink.uj.ac.za/citation.cfm?id=2816891

Landy, D., Allen, C. & Zednik, C. (2014). A perceptual account of symbolic reasoning. Accessed 13 January 2020 from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4001060/

Lee, H. (2018). Research on the impact of technology taxonomy for the tracking of technology convergence. 2018 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, 2018, pp. 1452-1456. Accessed 14 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8539349&isnumber=8539346

Lee, H. & Zo, H. (2016). R&D allies: How they impact technology convergence in the area of ICT. 2016 International Conference on Information and Communication Technology Convergence (ICTC), Jeju, 2016, pp. 340-343. Accessed 14 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7763492&isnumber=7763035

Lehmann, T.C., Rolfsen, J.A. & Clark, T.D. (2015). Predicting the trajectory of the evolving international cyber regime: Simulating the growth of a social network. Elsevier. Accessed 19 February 2019 from: http://www.sciencedirect.com/science/article/pii/S0378873315000040


Leiner, B.M., Cerf, V.G., Clark, D.D., Kahn, R.E., Kleinrock, L., Lynch, D.C., Postel, J., Roberts, L.G. & Wolff, S. (1997). Brief History of the Internet. Accessed 24 March 2019 from: https://www.internetsociety.org/internet/history-internet/brief-history-internet/

Li, M. & Zhang, X. (2016). A Modified more Rapid Sequential Extreme Learning Machine. 2016 8th International Conference on Computational Intelligence and Communication Networks (CICN), Tehri, 2016, pp. 336-340. Accessed 20 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8082662&isnumber=8082586

Lima, G.L.B., Ferreira, G.A.L., Saotome, O., Cunha, A.M.d. & Dias, L.A.V. (2015). Hardware Development: Agile and Co-Design. 2015 12th International Conference on Information Technology - New Generations, Las Vegas, NV, 2015, pp. 784-787. Accessed 10 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7113581&isnumber=7113432

Linturi, R., Koivunen, M. & Sulkanen, J. (2000). Helsinki Arena 2000 - Augmenting a real city to a virtual one. Digital Cities: Experiences, Technologies and Future Perspectives, Springer-Verlag, pp. 83-96, 2000. Accessed 23 February 2019 from: http://www.linturi.fi/HelsinkiArena2000/

Liskov, B. (1987). Data abstraction and Hierarchy. In Addendum to the proceedings on Object-oriented programming systems, languages and applications. ACM, New York, NY, USA, 17-34. DOI: http://dx.doi.org/10.1145/62138.62141. Accessed 15 August 2019, from: https://dl.acm.org/citation.cfm?id=62141

Liu, D., Xu, X. & Long, Y. (2017). On member search engine selection using artificial neural network in meta search engine. 2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS), Wuhan, 2017, pp. 865-868. Accessed 14 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7960113&isnumber=7959951

Loeffler, J. (2018). No More Transistors: The End of Moore’s Law. Accessed 19 December 2019 from: https://interestingengineering.com/no-more-transistors-the-end-of-moores-law

Lord, R. (2012). Why use an Entity Component System architecture for game development? Available from: https://www.richardlord.net/blog/ecs/why-use-an-entity-framework.html

Lu, M. & Liu, L. (2019). Synchronization of a Class of Nonlinear Multi-Agent Systems. 2019 3rd International Symposium on Autonomous Systems (ISAS), Shanghai, China, 2019, pp. 429-433. Accessed 13 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8757768&isnumber=8757696

Majeed, A. (2017). Technology Diffusion and Virtualisation of Virtual Communities. In Proceedings of the 8th International Conference on E-Education, E-Business, E-Management and E-Learning (IC4E '17). ACM, New York, NY, USA, 81-85. Accessed 15 January 2019 from: https://0-doi-org.ujlink.uj.ac.za/10.1145/3026480.3026494


Mao, L., Li, Y. & Mao, Y. (2018). Improved Extreme Learning Machine Based on Artificial Bee Colony Algorithm. 2018 17th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), Wuxi, 2018, pp. 178-180. Accessed 20 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8572551&isnumber=8572498

Martin, A. (2007). Entity Systems are the future of MMOG development. Accessed 19 August 2019 from: http://t-machine.org/index.php/2007/09/03/entity-systems-are-the-future-of-mmog-development-part-1/

Martin, R.C. (2009). Clean Code: A Handbook of Agile Software Craftsmanship. Prentice Hall (2009).

Mikhajlov, L. & Sekerinski, E. (1997). The Fragile Base Class Problem and Its Solution. Accessed 15 January 2020 from: https://www.researchgate.net/publication/2827765_The_Fragile_Base_Class_Problem_and_Its_Solution

Millstein, R.L. (2019). Evolution. The Stanford Encyclopedia of Philosophy (Summer 2019 Edition), Edward N. Zalta (ed.). Accessed 13 January 2020 from: https://plato.stanford.edu/archives/sum2019/entries/evolution/

Mitchell, T.M. (1997) Machine Learning. McGraw-Hill, Inc., New York.

Moore, G.E. (1965). Cramming more components onto integrated circuits. Electronics, Volume 38, Number 8. Accessed 20 December 2019, from: https://www.cs.utexas.edu/~fussell/courses/cs352h/papers/moore.pdf

Newton, J. (2018). Evolutionary Game Theory: A Renaissance. Institute of Economic Research, Kyoto University, Japan. Accessed 13 January 2020 from: https://www.mdpi.com/2073-4336/9/2/31

Nielsen, M.A. (2015). Neural Networks and Deep Learning. Determination Press.

Nystrom, R. (2014). Game Programming Patterns. Genever Benning. Accessed 13 January 2020 from: http://gameprogrammingpatterns.com/

Olivier, M.S. (2009). Information Technology Research: A Practical Guide for Computer Science and Informatics (3rd ed.). Pretoria: Van Schaik.

Page, L. (1997). PageRank: Bringing Order to the Web. Accessed 12 January 2020 from: http://www-diglib.stanford.edu/diglib/WP/PUBLIC/DOC159.html

Pahwa, K. & Agarwal, N. (2019). Stock Market Analysis using Supervised Machine Learning. 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 2019, pp. 197-200. Accessed 14 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8862225&isnumber=8862171


Pan, S. & Duraisamy, K. (2018). Long-Time Predictive Modelling of Nonlinear Dynamical Systems Using Neural Networks. Complex Algorithms for Data-Driven Model Learning in Science and Engineering, volume 2018. Accessed 2 June 2020 from: https://www.hindawi.com/journals/complexity/2018/4801012/

Pascual-Garcia, A. (2018). A constructive approach to the epistemological problem of emergence in complex systems. Accessed 2 June 2020 from: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0206489

Patel, K. & Mehta, A. (2018). Discrete-time Sliding Mode Control for Leader Following Discrete-time Multi-Agent System. IECON 2018 - 44th Annual Conference of the IEEE Industrial Electronics Society, Washington, DC, 2018, pp. 2288-2292. Accessed 13 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8591273&isnumber=8591058

Patnam, A.B. & El Taeib, T. (2016). Usage of computer networking in the real time network laboratory. 2016 IEEE Long Island Systems, Applications and Technology Conference (LISAT), Farmingdale, NY, 2016, pp. 1-6. Accessed 11 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7494158&isnumber=7494094

Peffers, K., Rothenberger, M., Tuunanen, T. & Chatterjee, S. (2007). A Design Science Research Methodology for Information Systems Research. Journal of Management Information Systems, 24(3), 45-77. Accessed on 12 May 2020 from: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.535.7773&rep=rep1&type=pdf

Peffers, K., Rothenberger, M., Tuunanen, T. & Vaezi, R. (2012). Design science research evaluation. In: Design Science Research in Information Systems: Advances in Theory and Practice, pp. 398-410, Springer Berlin Heidelberg. Accessed 5 October 2018 from: http://scholar.google.co.za/scholar_url?url=http://www.sirel.fi/ttt/Downloads/Design%2520Science%2520Research%2520Methodology%25202008.pdf&hl=en&sa=X&scisig=AAGBfm02C4lHRCZBUnA2MohcWYT1tYdM7g&nossl=1&oi=scholarr

Qiu, L. & Li, K. (2016). The Research of Intelligent Agent System Architecture Based on Cloud Computing. 2016 12th International Conference on Computational Intelligence and Security (CIS), Wuxi, 2016, pp. 693-696. Accessed 12 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7820558&isnumber=7820391

Rafiquzzaman, M. (2014). Fundamentals of Digital Logic and Microcontrollers. 6th Ed. Wiley.

Ray, S. (2019). A Quick Review of Machine Learning Algorithms. 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 2019, pp. 35-39. Accessed 14 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8862451&isnumber=8862171


Roser, M., Ritchie, H. & Ortiz-Ospina, E. (2019). Internet. Published online at OurWorldInData.org. Accessed 6 September 2019, from: https://ourworldindata.org/internet

Roser, M., Ritchie, H. & Ortiz-Ospina, E. (2019). World Population Growth. Published online at OurWorldInData.org. Accessed 6 September 2019, from: https://ourworldindata.org/world-population-growth

Russell, S.J. & Norvig, P. (2010). Artificial Intelligence: A Modern Approach (3rd ed.). Pearson Education.

Saganowski, S. (2015). Predicting community evolution in social networks. 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, 2015, pp. 924-925. Accessed 15 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7403656&isnumber=7403513

Santos, E. & Zhao, Y. (2017). Automatic Emergence Detection in Complex Systems. Complexity, volume 2017. Accessed 30 May 2020 from: https://www.hindawi.com/journals/complexity/2017/3460919/

Shalev-Shwartz, S. & Ben-David, S. (2014). Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, New York, NY, USA.

Sharma, K. & Nandal, R. (2019). A Literature Study On Machine Learning Fusion With IOT. 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 2019, pp. 1440-1445. Accessed 14 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8862656&isnumber=8862508

Shi, Y., Sagduyu, Y. & Grushin, A. (2017). How to steal a machine learning classifier with deep learning. 2017 IEEE International Symposium on Technologies for Homeland Security (HST), Waltham, MA, 2017, pp. 1-5. Accessed 14 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7943475&isnumber=7943437

Shoham, Y. & Leyton-Brown, K. (2008). Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. Cambridge University Press, New York, NY, USA.

Skinner, M. (2009). Unified theory of evolution. Aeon. Accessed 03 January 2020 from: https://aeon.co/essays/on-epigenetics-we-need-both-darwin-s-and-lamarck-s-theories

Skoudis, E. (2009). Evolutionary trends in cyberspace. Cyberpower and National Security, pp.147-170. Accessed 13 March 2019 from: http://ctnsp.dodlive.mil/files/2014/03/Cyberpower-I-Chap-06.pdf

Smola, A. & Vishwanathan, S.V.N. (2008). Introduction to Machine Learning. Cambridge University Press (2008).

Sönmez, Y., Tuncer, T., Gökal, H. & Avcı, E. (2018). Phishing web sites features classification based on extreme learning machine. 2018 6th International Symposium on Digital Forensic and Security (ISDFS), Antalya, 2018, pp. 1-5. Accessed 20 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8355342&isnumber=8355307

Srivastava, P. & Hopwood, N. (2009). A Practical Iterative Framework for Qualitative Data Analysis. International Journal of Qualitative Methods, 8, 76-84. DOI: 10.1177/160940690900800107. Accessed 8 October 2018 from: https://www.researchgate.net/publication/215472971_A_Practical_Iterative_Framework_for_Qualitative_Data_Analysis

Suarez, F.J., Nuno, P., Granda, J.C. & Garcia, D.F. (2015). Chapter 7: Computer networks performance modeling and simulation. Modeling and Simulation of Computer Networks and Systems: Methodologies and Applications. Accessed 30 May 2020 from: https://www.sciencedirect.com/science/article/pii/B9780128008874000079

Svozil, D., Kvasnicka, V. & Pospichal, J. (1997). Introduction to multi-layer feed-forward neural networks. Chemometrics and Intelligent Laboratory Systems. Volume 39, Issue 1, 1997, Pages 43-62. Accessed 19 January 2020 from: https://www.sciencedirect.com/science/article/abs/pii/S0169743997000610

Teich, J. (2012). Hardware/Software Codesign: The Past, the Present, and Predicting the Future, in Proceedings of the IEEE, vol. 100, no. Special Centennial Issue, pp. 1411-1430, 13 May 2012. doi: 10.1109/JPROC.2011.2182009. Accessed 15 November 2018 from: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6172642&isnumber=6259910

Telesford, Q.K., Joyce, K.E., Hayasaka, S., Burdette, J.H. & Laurienti, P.J. (2011). The Ubiquity of Small-World Networks. Accessed 19 February 2019 from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3604768/

Tenenbaum, J., Gan, C., Yi, K., Torralba, A. & Kohli, P. (2018). Neural-Symbolic VQA: Disentangling Reasoning from Vision and Language Understanding. Advances in Neural Information Processing Systems 31 (NIPS 2018). Accessed 13 January 2020 from: http://papers.nips.cc/paper/7381-neural-symbolic-vqa-disentangling-reasoning-from-vision- and-language-understanding

The R Foundation (2020). What is R? Accessed 15 September 2020 from: https://www.r-project.org/about.html

The Tran, V., Eklund, P. & Cook, C. (2013). Evolutionary simulation for a public transit digital ecosystem: a case study. In Proceedings of the Fifth International Conference on Management of Emergent Digital EcoSystems (MEDES '13). ACM, New York, NY, USA, 25-32. DOI: 10.1145/2536146.2536155. Accessed 10 October 2018 from: https://0-dl-acm-org.ujlink.uj.ac.za/citation.cfm?id=2536155

Tucker, A.B. & Noonan, R. (2007). Programming Languages: Principles and Paradigms. McGraw-Hill (2007).

Unity Technologies (2019). Systems. Accessed 6 October 2019 from: https://docs.unity3d.com/Packages/[email protected]/manual/scripting-systems.html


Van Roy, P. (2009). Programming Paradigms for Dummies: What Every Programmer Should Know. Accessed 16 August 2019 from: https://www.researchgate.net/publication/241111987_Programming_Paradigms_for_Dummies_What_Every_Programmer_Should_Know

Van Veen, F. & Leijnen, S. (2019). The Neural Network Zoo. Accessed 13 January 2020 from: https://www.asimovinstitute.org/neural-network-zoo/

Victor, A., Viorica, S., Munteanu, S., Dimitrie, B., Dmitri, C., Ana, N. & Sergiu, D. (2018). Multi-agent cognitive system for optimal solution search. 2018 International Conference on Development and Application Systems (DAS), Suceava, 2018, pp. 53-56. Accessed 12 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8396070&isnumber=8396056

Walter, C. (2005). Kryder’s Law. Scientific American, vol 293. Accessed 11 January 2020 from: https://www.scientificamerican.com/article/kryders-law/

Weiss, G. (Ed.). (2013). Multiagent Systems: A Modern Approach to Distributed Artificial Intelligence. MIT Press, Cambridge, MA, USA.

West, M. (2007). Evolve Your Hierarchy. Accessed 12 December 2019 from: http://cowboyprogramming.com/2007/01/05/evolve-your-heirachy/

Weyns, D. & Michel, F. (2014). Agent Environments for Multi-agent Systems: A Research Roadmap. 4th International Workshop on Agent Environments for Multi-agent Systems, IV. Accessed 11 January 2020 from: https://doi.org/10.1007/978-3-319-23850-0_1

Winch, C., Todd, M., Baker, I., Blain, J. & Smith, K. (2005). Methodologies. Accessed 7 October 2018, from: http://www.socscidiss.bham.ac.uk/methodologies.html

Wisskirchen, P. (1996). Object-oriented and Mixed Programming Paradigms. Springer (1996).

Wittrock, B. (2001). Disciplines, History of, in the Social Sciences. International Encyclopedia of the Social & Behavioral Sciences, 2001, pp. 3721-3728. Accessed 14 November 2018 from: https://doi.org/10.1016/B0-08-043076-7/00059-0

Wooldridge, M. (2009). An Introduction to Multiagent Systems (2nd ed.). Wiley Publishing.

Worldometers (2019). World Population. Accessed 19 December 2019 from: https://www.worldometers.info/world-population/

Wu, T. & Chen, L. (2016). Predicting the evolution of complex networks via local information. DOI: 10.1016/j.physa.2016.08.013. Accessed 7 October 2018, from: https://www.sciencedirect.com/science/article/pii/S0378437116305295?via%3Dihub

Yan, J., Zhang, C. & Meng, Y. (2016). An all-in-one model: Computer simulation of population genetics and evolution under Hardy-Weinberg conditions. 2016 2nd IEEE International Conference on Computer and Communications (ICCC), Chengdu, 2016, pp. 1331-1334. Accessed 13 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7924920&isnumber=7924647

Yiu, T. (2019). Understanding Neural Networks. Towards Data Science. Accessed 9 January 2020 from: https://towardsdatascience.com/understanding-neural-networks-19020b758230

Yu, H., Yang, X., Zheng, S. & Sun, C. (2019). Active Learning From Imbalanced Data: A Solution of Online Weighted Extreme Learning Machine. IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 4, pp. 1088-1103, April 2019. Accessed 20 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=8443399&isnumber=8668600

Zhang, L. & Shang, J. (2016). Understanding the Educational Values of Video Games from the Perspective of Situated Learning Theory and Game Theory. 2016 International Conference on Educational Innovation through Technology (EITT), Tainan, 2016, pp. 76-80. Accessed 13 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7839497&isnumber=7839474

Zhang, P., Zhao, S. & Wang, X. (2015). The failure analysis of extreme learning machine on big data and the counter measure. 2015 International Conference on Machine Learning and Cybernetics (ICMLC), Guangzhou, 2015, pp. 849-853. Accessed 20 January 2020 from: http://0-ieeexplore.ieee.org.ujlink.uj.ac.za/stamp/stamp.jsp?tp=&arnumber=7340664&isnumber=7340593

Zimmermann, K.A. (2017). History of Computers: A Brief Timeline. Live Science. Accessed 28 November 2019 from: https://www.livescience.com/20718-computer-history.html
