
Advanced Deterministic Optimization Algorithm for Deep Learning Artificial Neural Networks

Preprint · December 2019 DOI: 10.13140/RG.2.2.33006.77127


Jamilu Auwalu Adamu Mathematics Programme, 118 National Mathematical Centre, 904105, FCT-Abuja, Nigeria Correspondence: Mathematics Programme Building, 118 National Mathematical Centre, Small Sheda, Kwali, FCT- Abuja, Nigeria. Tel: +2348038679094. E-mail: [email protected]

Received: November 26, 2019 Accepted: December 10, 2019 Online Published: XX, 2019

Abstract

The existing choices of Activation Functions for a deep learning neural network are based largely on personal human judgment, bias, experience and limited quantitative analysis; thus, they are neither generated from the training or testing data nor do they emanate from the referenced AI-ML-Purified Data Set. In my previous paper, Jameel’s ANNAF Stochastic Criterion and Lemma for selecting stochastic activation functions were proposed; the objective of this paper is to propose Definite Rules, not Trial and Error, called “Jameel’s ANNAF Deterministic Criterion and Lemma” for the choice of advanced optimized Activation Functions. This is the first paper that applies the proposed Jameel’s ANNAF Deterministic Criterion to about Two Thousand Two Hundred and Twenty-Four (2,224) Advanced Activation Functions (mostly Deterministic) EMANATED from our AI SAMPLE DATA for the successful conduct of a Deep Learning Artificial Neural Network. THREE of these were rated excellent Activation Functions for a “Temperature vs Conductance” Deep Learning Artificial Neural Network. However, one can still find more candidates among the remaining 2,221 Activation Functions using the proposed criterion. The bottom line is that Advanced Deep Learning Artificial Neural Networks depend on the AI DATA, TIME CHANGE and the AREA OF APPLICATION.

Keywords: Jameel’s ANNAF Deterministic Criterion, AI-ML-Purified Data, Activation Functions, TableCurve 2D, Derivative Calculator, Criterion

1. Introduction

Casper Hansen (2019) says “Better optimized neural network; choose the right activation function, and your neural network can perform vastly better”.

Artist Hans Hofmann wrote, “The ability to simplify means to eliminate the unnecessary so that the necessary may speak.”

Taking a close look at the existing set of Activation Functions and the Deep Learning Neural Network structure, it is a system made up of both probabilistic and non-probabilistic (deterministic) functions (please see https://en.wikipedia.org/wiki/Activation_function). According to the current beliefs and practice among academia, decision-makers, and professionals, one can use both probabilistic and deterministic Activation Functions in a Neural Network system, as reflected in the differing opinions of ResearchGate members as at 6th June, 2019:

“Right now I am using sigmoidal function as an activation function for the last layer and it is giving me output in the range of 0 to 1 which is obvious. So my question is whether I should use another function as an activation function in the last layer?”. Responses: “the most appropriate activation function for the output neuron(s) of a feedforward neural network used for regression problems (as in your application) is a linear activation, even if you first normalize your data.”, “Yes you can use a linear function as activation function of the last layer”, “The most exact and accurate prediction of neural networks is made using tan-sigmoid function for hidden layer neurons and purelin function for output layer”, “You should normalize your dataset in [-1,1] range first. Then, for function approximation (as in your case) I agree with Ali Naderi and you better use tansig (for hidden layers) and purelin (for output layer). However, for classification tasks you better use tansig everywhere (for hidden as well as output layers)”, “regarding the activation function of hidden layer it is any sigmoid function except in the case of some bottleneck neurons in which case a hidden layer neuron has a linear function...thanks and regards...”, “You should use purelin Linear transformation function on the last layer of your network.”, “It is better to use sigmoid activation function both for the hidden layer and last layer neuron in order to get accurate results”, “if the input and output mapping is nonlinear, then use logistic function at the output layer, and you can still use linear activation function or logistic function at the hidden layer”, “This depends on the task, regression or classification, tansig or sigmoid”, “All the answers are great”, “In terms of using NNs for prediction, you have to use linear activation function for (only) the output layer. When you normalize your data into [0, 1] and then use sigmoid function, the accuracy may decrease” — as of June 6th, 2019.

Also: “Now with the above transformations a ReLU activation function should never be able to fit a x² curve. It can approximate, but as the input grows the error of that approximated function will also grow exponentially, right? Now x² is a simple curve. How can ReLU perform better for real data which will be way more complicated than x²?” (Professionals’ discussion forum, Data Science on StackExchange (2018)). In the discussion above, the ReLU activation function was assumed to fit the curve x², a non-linear deterministic (non-probabilistic) function, in a Neural Network. The dilemma here is which scientific method or criterion was applied to fit x² with a ReLU activation function.
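To make the forum’s point concrete, below is a minimal numpy sketch (my own illustration, not taken from the cited discussion; the knot placement and grids are arbitrary choices). It fits a small piecewise-linear, ReLU-style basis to y = x² on [-1, 1] by least squares and then evaluates the fit outside that range, where any fixed piecewise-linear model extrapolates linearly and its gap to x² keeps growing.

import numpy as np

# Fit y = x^2 on [-1, 1] with a small piecewise-linear (ReLU) basis, then
# evaluate outside the training range, where the extrapolation is linear
# and the error against x^2 keeps growing.
knots = np.linspace(-1.0, 1.0, 9)          # hinge locations (arbitrary choice)
x_train = np.linspace(-1.0, 1.0, 200)
y_train = x_train ** 2

def relu_features(x):
    # Design matrix: bias, x, and one ReLU hinge max(0, x - k) per knot.
    cols = [np.ones_like(x), x] + [np.maximum(0.0, x - k) for k in knots]
    return np.column_stack(cols)

coef, *_ = np.linalg.lstsq(relu_features(x_train), y_train, rcond=None)

for x0 in (0.5, 1.0, 2.0, 5.0, 10.0):
    pred = relu_features(np.array([x0])) @ coef
    print(f"x = {x0:5.1f}   ReLU-basis fit = {pred[0]:10.3f}   true x^2 = {x0**2:10.3f}")

Inside [-1, 1] the fit stays close; at x = 5 or x = 10 it falls far below the true curve, which is exactly the extrapolation behaviour the question describes.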

Currently, artificial intelligence neural network algorithms, particularly Activation Functions, have been accused of lacking transparency, regulation and supervision, of operating in secrecy inside a Black Box, and of being difficult to explain, with many human biases and questionable final rankings full of bad recommendations; they have also been accused of being used to determine the next US President and of exposing children to unsolicited sexual video content. The objective of this paper is to propose Definite Rules, not Trial and Error, called “Jameel’s ANNAF Deterministic Criterion and Lemma” for the choice of advanced optimized Activation Functions of a Deep Learning Artificial Neural Network. The paper starts with the Introduction and the Materials and Methods; new research findings and a Lemma are then proposed. The paper is crowned with concluding remarks.

2. Materials and Methods

2.1 Materials

2.1.1 Basic Definitions

Deterministic (non-probabilistic): A deterministic model is one in which every set of variable states is uniquely determined by parameters in the model and by sets of previous states of these variables; therefore, a deterministic model always performs the same way for a given set of initial conditions.

Probabilistic (stochastic): In a stochastic model, randomness is present, and variable states are not described by unique values, but rather by probability distributions.

Stochastic Neural Network: Stochastic neural networks are a type of artificial neural network built by introducing random variations into the network, either by giving the network's neurons stochastic activation functions or by giving them stochastic weights. An example of a neural network using stochastic transfer functions is a Boltzmann machine. Each neuron is binary valued, and the chance of it firing depends on the other neurons in the network. Stochastic neural networks have found applications in Risk Management, Oncology, Bioinformatics, and other similar fields.

Deterministic Neural Network: A deterministic system is a non-probabilistic system. A. M. Abdallah (2018) defined a Deterministic Neural Network as follows: “If the activation value exceeds the threshold, there is a probability associated with firing. That is, there is a probability of the neuron not firing even if it exceeds the threshold. If the probability is one then that update is Deterministic”.


Curve fitting: Curve fitting is one of the most powerful and most widely used analysis tools in Origin. Curve fitting examines the relationship between one or more predictors (independent variables) and a response variable (dependent variable), with the goal of defining a "best fit" model of the relationship.

Origin provides tools for linear, polynomial, and nonlinear curve fitting along with validation and goodness-of-fit tests. You can summarize and present your results with customized fitting reports.

Rank Models Tool: The Rank Models tool lets you fit multiple functions to a dataset, and then reports the best fitting model. Results are ranked by Akaike and Bayesian Information Criterion scores.
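As a concrete illustration of how such scores can be computed, the short Python sketch below (my own example, not Origin's implementation; the data and the two candidate models are hypothetical, and Origin's exact formulas may differ) ranks a quadratic and a cubic fit by the Gaussian-error forms of AIC and BIC, where lower is better.

import numpy as np

def aic_bic(y, y_pred, n_params):
    # Gaussian-error AIC/BIC based on the residual sum of squares.
    n = len(y)
    rss = np.sum((y - y_pred) ** 2)
    aic = n * np.log(rss / n) + 2 * n_params
    bic = n * np.log(rss / n) + n_params * np.log(n)
    return aic, bic

# Hypothetical data: roughly quadratic with noise.
x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 0.5 * x ** 2 + np.random.default_rng(1).normal(0.0, 0.5, 50)

for degree in (2, 3):
    coeffs = np.polyfit(x, y, degree)
    aic, bic = aic_bic(y, np.polyval(coeffs, x), degree + 1)
    print(f"degree {degree}: AIC = {aic:.2f}, BIC = {bic:.2f}")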

2.1.2 Software and Online Resource Materials

Two of the fundamental pillars of the proposed AI-ML-Deterministic Activation Functions Selection Criterion, to be presented under the paper’s methodology, are that the Activation Functions shall EMANATE from the AI-ML-Purified Data Set under consideration, and that curve fitting for the Best Fitted Deterministic Function shall be carried out. The paper uses the RESEARCHGATE DISCUSSION FORUM to adequately provide advanced materials with which to frequently and successfully perform Deterministic Best Fitting selection. The DISCUSSION goes as follows:

Gajendra Pal Singh Raghaya (11th November, 2012) posted: “We are using TableCurve2D for fitting our data. Problem with this software: it is windows based and ... We need a free software equivalent to TableCurve2D (I mean similar functions) which can be run in command mode. I will highly appreciate if some one suggests free software which takes my data and fits it in a large number of equations by regression or non-regression. Finally it gives me the equation in which my data fits best”. Note that a direct extract from the TableCurve2D official site (http://www.sigmaplot.co.uk/products/tablecurve2d/tablecurve2d.php) states that it covers the following TOPICS:

- Quickly Find the Best Equations that Describe Your Data
- Automation Takes the Trial and Error Out of Curve Fitting
- Fit User Defined Equations
- Accurately Extrapolate Any Data Set
- Graphically Review Curve Fit Results
- Compare Models Using Meaningful Numeric Information
- Effectively Manage Complex Data Sets
- Precisely Model Exotic Data Sets
- Flexible Output Options
- Maximize Your Productivity with Automation

Quickly Find the Best Equations that Describe Your Data

TableCurve 2D® gives engineers and researchers the power to find the ideal model for even the most complex data by putting thousands of equations at their fingertips. TableCurve 2D's built-in library includes a wide array of linear and nonlinear models for any application, including equations that may never have been considered, from simple linear equations to high-order Chebyshev polynomials. TableCurve 2D is the automatic choice for curve fitting and data modeling for critical research. It has a robust fitting capability for nonlinear fitting that effectively copes with outliers and a wide dynamic Y data range.

Robin Patrick Mooney (2nd February, 2014), “I can recommend Curve Expert; http://www.curveexpert.net/. The free basic version is free and very easy to use. Very helpful tool!” Lambro Angelo Johnson (11th November, 2012), “If you have a model you could try DynaFit (http://www.biokin.com/dynafit/). It uses numerical integration and requires some effort to master its programming but examples can be copied and modified to suit. It is available without charge to university researchers”. Giuseppe D’Auria (11th November, 2012), “My experience is that almost all mathematics could be done using "R"; it is open source, it is command-line or script based, scalable and there are a lot of packages.....take a look at http://cran.r-project.org/ and http://www.bioconductor.org/”. George E Johnson (11th November, 2012), “PROAST is a great bit of software designed by Wout Slob in the RIVM. The European food standards agency use it for these purposes and it is updated regularly and heavily validated. http://www.rivm.nl/en/Library/Scientific/Models/PROAST”. Fabio Mariotti (11th November, 2012), “Hi, here is a list of linux alternatives to Originlab which I guess is a pretty similar software to what you use: http://alternativeto.net/software/originlab/?platform=linux”. Nicolas Urbina-Cardona (11th November, 2012), “Curve expert could be helpful: http://www.curveexpert.net/”. Aybek V. Khodiev (9th September, 2014), “You can use free trial of "Eureqa" software by Nutonian (google it). It was free sometime ago, but still free 30 days trial”. M. A. Aghajani (2nd February, 2015), “I can recommend TableCurve 2D and CurveExpert Professional. SigmaPlot 13 now is working well. Its model library is very full and it is possible to add and edit models”.

Danny Kowerko (3rd March, 2015), “I recently tried http://www.mycurvefit.com where you just copy paste your data and then you can even quickly define your own equations...http://www.mycurvefit.com”. Gerro J. Prinsloo (6th June, 2015), “We use linear and non-linear mathematical curve fitting in intelligent power systems, especially in the parametric modelling of thermodynamic and electrical concentrated solar power systems, smart microgrid power optimization and scheduling optimization. In our desktop experiments, we tried LabFit but found it very difficult to use. The software is unable to read multicolumn data from csv and excel files for example. Then learned about Datafit by Oakdale Engineering and found it much easier to use. When you use it, select "all models" in the curve fitting strategy then it ranks the solutions and polynomials in terms of best fit hierarchy options. One can then plot each curve in a different color in overlay mode and see/judge the curve fit for each solution and parameter set. The options for DataFit with non linear curves is however limited and we struggled for example to model simple daily energy and electrical load profile curves (used in load forecasting archetypes) with the model options available in the menus. Low price end, but good to see trial version first. Simfit is another, free open-source option for Windows and Linux used in simulation curve fitting with plotting. Like Data-fit, the library of models allows for user-defined equations to be added to the model set”. Adam Majewski (11th November, 2015), “https://github.com/zunzun/pyeq2. Zun zun. I have used its online version but now it is not working, but here is a git repository. There is also the Zunzun.com discussion group on Google Groups: https://groups.google.com/forum/#!forum/zunzun_dot_com”.

Afshin Arjhangmehr (8th August, 2016), “I should recommend qtiplot which is really handy in curve fitting (but limited to 2-column data)”. Ziad Boutaanios (9th September, 2016), “www.mycurvefit.com is pretty good but you're limited to 20 analyses or so. If you want more you have to subscribe. I use QtiPlot, has some pretty good built-in functions and allows for plugins and user-defined fit functions. It's a pretty powerful plotter too and it runs very fast. Most Linux distributions already have it as an installable package from repositories”. Tom Calvin O Haver (9th September, 2016), “Checkout https://terpconnect.umd.edu/~toh/spectrum/InteractivePeakFitter.htm”. Abdullah AL-Numan (1st January, 2017), “Data master 2003 is a good free software for fitting and data acquisition, allowing constrained nonlinear fitting, and user defined models with limited statistical evaluation of fits”. Kheir-Eddine Ramdane (2nd February, 2017), “Origin software is one of the best for all kinds of fittings, etc., http://www.lightstone.co.jp/lsieng.html”. Sajeewa W Dewage (2nd February, 2017), “Curve Expert is the one I use for curve-fitting. It is very straightforward and does a very good job at fitting the data. It has helped me a lot in my research. It has versions for all platforms as well. The basic suite is free”. Tomasz Cepowski (4th April, 2017), “I can recommend you ndcurvemaster: www.ndcurvemaster.eu (for Win and Mac), for auto fitting of an unlimited number of input variables: x1, x2, x3,..., xn and their combinations: x1*x2, x1*x3, x2*x3,..., xn-1*xn”. Daniel Paton (9th September, 2017), “Zunzun website is active again. This is the best... However you can use the libraries in Python on your computer, installing the appropriate packages in Linux...”. Rahal Widanagemage (12th December, 2017), “my suggestion is to use myassays.com, https://www.myassays.com/four-parameter-logistic-curve.assay”.

Toon Smetsers (1st January, 2018), “Try AssayFit Pro, it is a free and online solution and can be used in Excel, R and python too. http://www.assayfit.com”. Jose Bartheld (1st January, 2018), “Curve Expert Pro for curve fitting and data analysis!! & It is a cross-platform software. https://www.curveexpert.net/products/curveexpert-professional/”. Robert Zamenhof (3rd March, 2018), “I hate sounding mercenary, but for $10 you can download my very flexible but VERY easy-to-use polynomial curve-fitting program from the website given below. PolyFit 1.1 does linear, quadratic, cubic, & exponential regression, and also does quadratic and cubic regression for TWO independent variables (i.e., a 'family' of curves). Input data are imported from EXCEL, results are tabulated and plotted and can be exported to EXCEL. 'Goodness-of-fit' parameters are calculated. SDs or CVs for the input data can also be entered. https://www.drsimplescience.com/fitting-tool.html”. Easwaran Krishnan (5th May, 2018), “I would recommend you to use curve expert free version. It works well in Mac, Windows and Linux”.

Tomasz Cepowski (5th May, 2018), “I would like to announce that I have created and published a new free version of NdCurveMaster software for one independent variable. Please feel free to check it out, any feedback is welcome. Click on https://www.ndcurvemaster.com and go to download, it is called NdCurveMaster 2D. It is for Mac and PC”. Karim Kamal (12th December, 2018), “I recommend mathematica”. Emad Elhout (2nd February, 2019), “I recommend the datafit program”. Navid Farhoudi (2nd February, 2019), “If you don't mind writing some lines of script, python and R are good choices for doing this task”. Kishan Singh Rawat (7th July, 2019), “I recommend Sigmaplot 12.0, it is free for one month”. Anton Vrdoljak (7th July, 2019), “GeoGebra (free, multi-platform software) is dynamic mathematics software for all levels of education that brings together geometry, algebra, spreadsheets, graphing, and calculus in one easy-to-use package. Curve Fitting is supported and very easy to figure...”. Reza Toorajipour (8th August, 2019), “Eureqa is a software that creates formulas for several variables in large sets of data”. Ahmad Bagheri (9th September, 2019), “I recommend LABFIT, it is free for one month (or more) and very user friendly. This software fits the response (y) with 6 independent variables (X1, X2 and ...) and 10 adjustable parameters (maximum)”.

2.1.3 Software Recommendations

I am pleased to recommend the following, especially for 2D and 3D Data Sets. One can try LabFit. According to extracts from their official site (http://www.labfit.net/), LabFit can perform Curve Fitting (nonlinear regression - least squares method, Levenberg-Marquardt algorithm -, almost 500 functions in the library with one and two independent variables, a functions finder, and an option that lets you write your own fitting function with up to 150 characters, 6 independent variables and 10 parameters).

LAB Fit has a menu for curve fitting and the main programs of this menu use nonlinear regression. LAB Fit fits functions of one and several independent variables (from 1 up to 6), admitting uncertainties in the dependent variable (Y) and also in the independent variables (X). In the case of uncertainties in X and in Y, a pre-fit is made without considering the uncertainties in X, which are later transferred to Y by error propagation. In the LAB Fit library there are more than 200 functions with 1 independent variable and almost 280 functions with 2 independent variables. A functions-finder program is also available to the user. If necessary, there is an option so that the user can write their own fit function. Once the fit parameters are determined, it is possible to extrapolate the fit function and, for the 2D and 3D cases, the graph of the obtained function is shown. For the 2D case, beyond the extrapolation possibility, the user can even include an error bar and a confidence band in the graph. LAB Fit also has a menu destined for the treatment of similar data, non-similar data and error propagation.

The second recommended software is DATAFIT (http://www.oakdaleengr.com/) by Oakdale Engineering. According to the extracts from their official site, DataFitX version 2.0 is a COM component (in-process DLL) that allows you to perform nonlinear curve fitting or cubic spline interpolation from within your program with only a few simple lines of code. It is the same powerful curve fitting engine as used and verified in the DataFit software. It can be used with Visual Basic, Visual Basic .NET, Visual C++, Visual C++ .NET, Visual C# .NET, Delphi, Excel, Access, VBScript, VBA enabled applications or any other development environment that supports COM. A fully featured evaluation version is available free for 30 days.

The third recommended software is SIMFIT (https://www.simfit.org.uk/user_defined_models.html). According to the extracts from their official site, the SIMFIT package has a comprehensive library of models for simulating or fitting, which will cover most situations. However, you may have to construct your own user-defined models in cases involving procedures like the following ones.

- Systems of nonlinear differential equations and Jacobians
- Sets of linked or independent nonlinear equations
- Evaluation of special functions
- Probability integrals and inverses
- Integrating one or more functions of one or several variables
- Locating roots of nonlinear equations in one or several variables
- Constrained nonlinear optimization using partial derivatives
- Evaluating or fitting convolution integrals
- Models with swap over points or discontinuities

SIMFIT plot: 3D plot for z = f(x, y).

The fourth recommended software is POLYFIT. According to the extracts from their official site, PolyFit can fit polynomial functions to linear, quadratic, cubic, or exponential data using the form Y = f(X), and it can also fit a family of curves using the form Y = f(X, Z).

The fifth recommendation: Benjamin M. Bolker et al. (2013) state in their paper that “R is convenient and (relatively) easy to learn, AD Model Builder is fast and robust but comes with a steep learning curve, while BUGS provides the greatest flexibility at the price of speed”.

The sixth recommended powerful fitting software is ORIGINLAB (https://www.originlab.com/). According to the extracts from their official site, ORIGINLAB provides various tools for linear, polynomial and nonlinear curve and surface fitting. Fitting routines use state-of-the-art algorithms. You can select from close to 200 built-in fitting functions arranged in categories, create your own fitting function using the Fitting Function Builder wizard, and fit with explicit and implicit functions. Please, see below:


Figure 1: Print screen of OriginLab showing Curve Fitting and Peak Analysis functions. Source: originlab.com (2019)

The seventh recommended software is XLSTAT. According to the extracts from their official site (https://www.xlstat.com/en/solutions/features/nonlinear-regression-genfit), XLSTAT is a tool to fit data to any linear or non-linear function. When to use nonlinear regression: nonlinear regression is used to model complex phenomena which cannot be handled by the linear model. XLSTAT provides preprogrammed functions from which the user may be able to select the model which describes the phenomenon to be modeled. The user is also free to write other nonlinear functions.

Options for nonlinear regression in XLSTAT: when the required model is not available, the user can define a new model and add it to their personal library. To improve the speed and reliability of the calculations, it is recommended to add the derivatives of the function with respect to each of the parameters of the model. Please, see below:


Figure 2: Print screen of XLSTAT showing available Curve Fitting Equations. Source: XLSTAT (2019)

2.2 Methods

2.2.1 Existing Activation Functions and their Relationships with AI-ML-Purified Data Set

The existing Activation Functions have no correlation whatsoever with the AI-ML-Purified Data Set under consideration; in fact, their choice has no definite Rule of Thumb, it is just Trial and Error, as shown in the figure below:


Figure 3: Activation Functions in ANN. Source: Google Images (2019)


2.2.2 Which are the Competent and Eligible Deterministic Activation Functions for a Successful Neural Network?

Figure 4: Question of Competent and Eligible Activation Functions. Source: Google Images (2019)

ACTIVATION FUNCTIONS allow an Artificial Neural Network to learn complex functional mappings from DATA and make sense of something complicated and Non-linear to produce a meaningful output signal. Thus, they should not be a Trial-and-error or Black-Box assumption.

Biologically, a Neuron performs three basic functions, namely: receive signals (or information); integrate incoming signals (to determine whether or not the information should be passed along); and communicate signals to target cells (other neurons or muscles or glands). However, this cannot be done successfully without the action of a Non-linear Function (the Brain of the Neuron) residing in a Neuron of the Human Brain, EMANATED from the incoming signals (information). This Non-linear Function residing in a Neuron of the Human Brain, EMANATED from the incoming signals, is what we call the Activation Function of a Deep Learning Artificial Neural Network.

The Author PROPOSES that the Competent and Eligible Activation Functions for the successful conduct of Artificial Neural Networks are the Activation Functions EMANATED from the AI-ML-Purified Data Set under consideration that satisfy AI-ML-Jameel’s Stochastic or Deterministic Criterion, because of the following SCIENTIFIC FACTS:

(1) They EMANATED from the referenced AI-ML-Purified Data Set and satisfy AI-ML-Jameel’s Stochastic and/or Deterministic Criterion;
(2) They have a very strong (if not perfect) CORRELATION with the referenced AI-ML-Purified Data Set. A link between the Data Set and the Activation Functions MUST be strongly established, since an Artificial Neural Network uses past historical data to predict the future of a given task with the aid of machines;
(3) They relate better to the referenced AI-ML-Purified Data Set than the existing Assumed, Trial-and-error Activation Functions;
(4) They indeed describe the distribution of our referenced AI-ML-Purified Data Set, which is a listing or function showing all the possible values (or intervals) of the data and how often they occur;
(5) They represent real, virtual and un-virtual information about our referenced AI-ML-Purified Data Set;


(6) They indeed capture the Symmetric, Left Skewed, Right Skewed, Mesokurtic, Leptokurtic, and Platykurtic properties of our referenced AI-ML-Purified Data Set;
(7) They contain real, virtual and un-virtual information related to Measures of Variability (the range, inter-quartile range, and standard deviation) and Measures of Central Tendency (Mean, Mode and Median, Minimum and Maximum) of our referenced AI-ML-Purified Data Set;
(8) They capture real, virtual and un-virtual information about the Correlation (autocorrelation) among the elements of our referenced AI-ML-Purified Data Set;
(9) In the case of a Bivariate AI-ML-Data Set, they capture real, virtual and un-virtual information about Measures of Association (Covariance and Correlation) of our referenced AI-ML-Purified Data Set;
(10) They capture real, virtual and un-virtual information about whether or not the parameters of our referenced AI-ML-Purified Data Set are constant over time; and,
(11) They also capture the presence of outliers in our referenced AI-ML-Purified Data Set under consideration.

Referenced AI-ML-Purified Data Set means our referenced Artificial Neural Network Data Set shall possess the following QUALITIES:
a. Accuracy and Precision
b. Legitimacy and Validity
c. Reliability and Consistency
d. Timeliness and Relevance
e. Completeness and Comprehensiveness
f. Availability and Accessibility
g. Granularity and Uniqueness

Therefore, the author proposes that the practice of a “TRIAL AND ERROR” choice of Assumed Activation Functions should be abandoned; instead, the choice of Activation Functions should follow “DEFINITE RULES”. Thus, the Author proposes “JAMEEL’S ANNAF DETERMINISTIC CRITERION (2019)” as follows:

2.2.3 Proposed Jameel’s ANNAF Deterministic Criterion

Having gathered all the available Software and Online Resource Materials for the goodness-of-fit of Non-linear Deterministic Functions, as well as the SCIENTIFIC FACTS above, we are now ready to propose the AI-ML-Deterministic Activation Functions Selection Criterion.

ANNAF means Artificial Neural Network Activation Functions.

A Neural Network that requires DETERMINISTIC ACTIVATION FUNCTIONS shall use functions that satisfy the following proposed criterion:

(i) The function f(x) shall EMANATE from the referenced AI-ML-Purified Data Set. The essence of requiring the function f(x) to EMANATE from the referenced AI-ML-Purified Data is to build incredible and sophisticated Activation Function(s) that have the BEST MATCH AND/OR TUNE with the set of referenced AI-ML-Purified Data, since a neural network is a system made to learn a function from data. The Activation Functions obtained from the referenced AI-ML-Purified Data can be used to build an extraordinary Neural Network Artificial Intelligence System that may defeat Human Intelligence if trained well;

(ii) A curve fitting for the Best Fitted Deterministic Function shall be carried out; the selected function f(x) is the one whose:


(a) Rank is Unity (1)

(b) Fit Standard Error is smaller than any other on the list;

(iii) The function f(x) shall be Nonlinear;
(iv) The function f(x) shall have a Range;
(v) The function f(x) shall be Continuously Differentiable;
(vi) The function f(x) shall be Monotonic;
(vii) The function f(x) shall be a Smooth Function with a Monotonic Derivative;
(viii) The function f(x) shall Approximate the Identity near the Origin.

If these fail, discard the 1st-rated function f(x) and repeat (i) to (viii) until a qualified Deterministic Activation Function EMANATED from our referenced AI-ML-Purified Data is obtained.

NOTE: A Deep Learning Artificial Neural Network's Hidden and Output Layers consist of at least one, two or more Best-fitted Activation Functions EMANATED from our AI Data Set. Therefore, the RANK: UNITY (ONE) in (ii)(a) and the Fit Standard Error in (ii)(b) of the criterion mean that, when a function whose real "Rank = 1" is chosen and it satisfies (i) to (viii), the next function on the list, whose real "Rank = 2", assumes the "new Rank = 1" and is tested against all eight (8) axioms, and so on, until we have the required number of BEST (EXCELLENT) Activation Functions needed to carry out our Deep Learning Artificial Neural Network.
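As a rough illustration only, the sketch below shows one way axioms (iii) to (viii) and the rank-walking NOTE above could be screened numerically in Python. The candidate list, grid, tolerances and thresholds are my own assumptions and merely stand in for the ranked output of a curve-fitting tool; passing this screen is a heuristic signal, not a formal proof of the criterion.

import numpy as np

def passes_deterministic_screen(f, x, tol=1e-6):
    # Heuristic numeric screen of a fitted candidate f against axioms (iii)-(viii) on a grid x.
    y = f(x)
    if not np.all(np.isfinite(y)):                        # must be defined (have a range) on the grid
        return False
    d1 = np.gradient(y, x)                                # first derivative (finite differences)
    d2 = np.gradient(d1, x)                               # second derivative
    resid = y - np.polyval(np.polyfit(x, y, 1), x)
    nonlinear = np.max(np.abs(resid)) > tol               # (iii) not just a straight line
    monotonic = np.all(d1 >= -tol) or np.all(d1 <= tol)   # (vi) non-decreasing or non-increasing
    mono_deriv = np.all(d2 >= -tol) or np.all(d2 <= tol)  # (vii) smooth with a monotonic derivative
    slope0 = (f(np.array([1e-3]))[0] - f(np.array([-1e-3]))[0]) / 2e-3
    near_identity = abs(f(np.array([0.0]))[0]) < 1e-3 and abs(slope0 - 1.0) < 0.1   # (viii)
    return nonlinear and monotonic and mono_deriv and near_identity

# Hypothetical ranked candidates, stand-ins for a curve-fitting tool's output;
# walk down the ranking until enough functions pass, as described in the NOTE.
candidates = [np.tanh, np.expm1, lambda z: z ** 3]
grid = np.linspace(-2.0, 2.0, 401)
needed, selected = 1, []
for rank, f in enumerate(candidates, start=1):
    if passes_deterministic_screen(f, grid):
        selected.append(rank)
    if len(selected) == needed:
        break
print("ranks passing the screen:", selected)

In this toy run the first-ranked candidate fails the monotonic-derivative check on the chosen grid, so the screen moves on to the next-ranked candidate, which mirrors the rank-walking described in the NOTE.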

2.2.4 Proposed Jameel’s Deterministic Lemma:

All the TOP-RANKED Nonlinear, Monotonic, Continuously Differentiable Deterministic Functions EMANATED from the referenced AI-ML-Purified Data that satisfy the Proposed Jameel’s ANNAF Criterion are EXCELLENT DETERMINISTIC ACTIVATION FUNCTIONS for performing well-informed Forward and Backward Propagations of an Artificial Neural Network.

4. Results

4.1 Software and AI Data

The paper employed the “TABLECURVE 2D CURVE FITTING” software from “SYSTAT”. The software automatically fits 3,665 BUILT-IN EQUATIONS FROM ALL DISCIPLINES to discover the ideal MODELS that describe the Data: “TableCurve 2D is the first and only program that completely eliminates endless ‘TRIAL and ERROR’ by automating the curve fitting process”. The software statistically RANKS the LIST of candidate equations, which can then be screened with the “Jameel’s ANNAF Deterministic Criterion of Deep Learning Artificial Neural Networks” proposed above.

Also, the paper used the SAMPLE DATA of “TEMPERATURE VS CONDUCTANCE” provided with the TABLECURVE 2D Software (source below):


Source: https://systatsoftware.com/products/tablecurve-2d/tablecurve-2d-curve-fitting/
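The sample table itself is not reproduced above. As a stand-in illustration of the fit-and-rank step that TableCurve 2D automates, the minimal Python sketch below fits two candidate model forms to made-up temperature/conductance pairs with scipy and ranks them by fit standard error; the data values, the tiny model set and the starting guesses are all hypothetical and do not represent the software's actual equation library or ranking procedure.

import numpy as np
from scipy.optimize import curve_fit

# Hypothetical stand-in for the Temperature vs Conductance sample table.
temperature = np.linspace(10.0, 100.0, 25)
conductance = 0.02 * temperature ** 1.3 + np.random.default_rng(0).normal(0.0, 0.1, 25)

def model_poly(x, a, b, c, d):
    # Low-order polynomial candidate.
    return a + b * x + c * x ** 2 + d * x ** 3

def model_power(x, a, b):
    # Simple power-law candidate.
    return a * x ** b

candidates = {
    "cubic polynomial": (model_poly, [1.0, 1.0, 1.0, 1.0]),
    "power law": (model_power, [0.1, 1.0]),
}

ranking = []
for name, (model, p0) in candidates.items():
    params, _ = curve_fit(model, temperature, conductance, p0=p0, maxfev=10000)
    residuals = conductance - model(temperature, *params)
    dof = len(temperature) - len(params)
    fit_std_err = np.sqrt(np.sum(residuals ** 2) / dof)   # one common "fit standard error"
    ranking.append((fit_std_err, name))

for rank, (err, name) in enumerate(sorted(ranking), start=1):
    print(f"rank {rank}: {name} (fit std err = {err:.4f})")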


4.2 Advanced Optimized Deterministic Temperature vs Conductance Activation Functions

Now let us find the BEST fitting Functions of the above DATA SET:


The FIRST Ranked Function is:


We can view all the FUNCTIONS that fitted the Sample DATA as follows:


The software automatically fitted and listed 2,224 Functions after 1 minute 45 seconds, as follows:


The SECOND Ranked Function is:

The List (RANKS) and NATURE of about 2153 fitted Functions:


This means we have about 2,224 ACTIVATION FUNCTIONS (mostly Deterministic), EMANATED from our SAMPLE DATA, that can serve as ACTIVATION FUNCTIONS to perform Deep Learning Artificial Intelligence on “TEMPERATURE vs CONDUCTANCE”.

Thus, these satisfy the first axiom of “Jameel’s ANNAF Deterministic Criterion”, which says: “(i) The function shall EMANATE from the referenced AI-ML-Purified Data Set”.

Now we will work on our 2,224 Activation Functions using “Jameel’s ANNAF Deterministic Criterion” until we obtain the required number of Best Activation Functions to carry out our Deep Learning Artificial Intelligence. Note that any qualified Stochastic Activation Function shall satisfy “Jameel’s ANNAF Stochastic Criterion”.

This paper will show the First Derivatives of the THREE TOP-RANKED Activation Functions as follows:

1. The Lorentzian function, given by y = (1/n) (b / ((x − a)² + b²)), where a and b are constants and x ≠ a. Note that this one was not from the TableCurve 2D Software.

2. y = a + b·x + c/x + d·x² + e/x² + f·x³ + g/x³ + h·x⁴ + i/x⁴ + j·x⁵ + k/x⁵

3. y = a + b·x + c/x + d·x² + e/x² + f·x³ + g/x³ + h·x⁴ + i/x⁴ + j·x⁵


4.3 Backward Propagation: First Derivatives of the Three Top-Ranked Activation Functions

4.3.1 First (1ST) Ranked Activation Function:

Derivative Calculator Command: (1/n)*(b/((x-a)^2+b^2))
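The derivative-calculator output pages are not reproduced here. As a small cross-check (my own sketch, treating a, b and n as symbols), sympy gives the closed-form derivative needed for backward propagation:

import sympy as sp

x, a, b, n = sp.symbols('x a b n', real=True)
# First-ranked (Lorentzian-type) function: y = (1/n) * b / ((x - a)^2 + b^2)
y = (1 / n) * b / ((x - a) ** 2 + b ** 2)
dy_dx = sp.simplify(sp.diff(y, x))
print(dy_dx)
# Mathematically: dy/dx = -2*b*(x - a) / (n * ((x - a)**2 + b**2)**2)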


4.3.2 Second (2ND) Ranked Activation Function:

Derivative Calculator Command: a+b*x+c/x+d*x^2+e/x^2+f*x^3+g/x^3+h*x^4+i/x^4+j*x^5+k/x^5


Now if a=b=c=d=e=f=g=h=i=j=k=1 then we have:
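The derivative-calculator screenshot for this case is not reproduced here; the small sympy sketch below (my own illustration of the same computation) differentiates the second-ranked form and then substitutes 1 for every coefficient:

import sympy as sp

x = sp.Symbol('x', real=True)
coeffs = sp.symbols('a b c d e f g h i j k')
a, b, c, d, e, f, g, h, i, j, k = coeffs
# Second-ranked form: y = a + b*x + c/x + d*x^2 + e/x^2 + ... + j*x^5 + k/x^5
y = (a + b*x + c/x + d*x**2 + e/x**2 + f*x**3 + g/x**3
     + h*x**4 + i/x**4 + j*x**5 + k/x**5)
dy_dx = sp.diff(y, x)
print(dy_dx.subs({s: 1 for s in coeffs}))
# With all coefficients equal to 1:
# dy/dx = 1 + 2x + 3x^2 + 4x^3 + 5x^4 - 1/x^2 - 2/x^3 - 3/x^4 - 4/x^5 - 5/x^6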


4.3.3 Third (3RD) Ranked Activation Function:

Derivative Calculator Command: a+b*x+c/x+d*x^2+e/x^2+f*x^3+g/x^3+h*x^4+i/x^4+j*x^5


Now if a=b=c=d=e=f=g=h=i=j=1 then we have:



Thanks to the “TABLECURVE 2D CURVE FITTING” software from “SYSTAT” and the “DERIVATIVE CALCULATOR”, the Three Top-Ranked Activation Functions are now DIFFERENTIABLE.

Thus, they satisfy the following AXIOMS of “Jameel’s ANNAF Deterministic Criterion”:

(i) The THREE (3) functions f(x) EMANATED from the referenced (Temperature vs Conductance) AI-ML-PURIFIED DATA SET.

(ii) The THREE (3) functions' f(x) curve fittings have:

(a) Rank = 1

(b) Fit Standard Errors smaller than any others on the list;

(iii) The functions f(x) are Nonlinear;
(iv) The functions f(x) all have Ranges;
(v) The functions f(x) are Continuously Differentiable;
(vi) The functions f(x) are Monotonic;
(vii) The functions f(x) are Smooth Functions with Monotonic Derivatives;
and so on.

Therefore, if the Three Top-Ranked Activation Functions satisfy the remaining axioms of “Jameel’s ANNAF Deterministic Criterion”, then they are ready for the successful conduct of a Deep Learning Artificial Neural Network on “TEMPERATURE vs CONDUCTANCE”; otherwise, the iterations shall be repeated until Activation Functions that satisfy the Criterion are obtained.

It is guaranteed that the first three deterministic functions that satisfy Jameel’s ANNAF Deterministic Criterion are excellent Activation Functions with which to successfully conduct the “TEMPERATURE vs CONDUCTANCE” Deep Learning Artificial Neural Network. Subsequent Functions on the list are also good Advanced Optimized Activation Functions.

The direction of future research is towards achieving SUPER-INTELLIGENT Deep Learning Artificial Neural Networks using “Jameel’s ANNAF Stochastic Criterion” and “Jameel’s ANNAF Deterministic Criterion”.

5. Conclusion

Currently, we rely heavily on Deep Learning Neural Network outputs without having a clear understanding of their capabilities or training processes. Activation Functions (Stochastic or Deterministic) are the foundation of Artificial Neural Networks; without them, there is no Neural Network. Unfortunately, the right selection of activation functions has been the subject of serious and heated debates among professionals, scientists, technologists, policy-makers, decision-makers, and even the U.S. Congress, among others. Artificial Neural Network Algorithms have been accused of lacking transparency, regulation and supervision, of operating in secrecy inside a Black Box, and of being difficult to explain, with many human biases and questionable final rankings full of bad recommendations; they have also been accused of being used to determine the next US President and of exposing children to unsolicited sexual video content.

This paper has proposed a criterion (definite rules) and a Lemma by which deterministic activation functions can be properly determined, thereby opening the “Black Box” of the Deep Learning Artificial Neural Network for the successful conduct of well-informed Artificial Intelligence and Machine Learning processes.

I strongly believe the proposed criterion and Lemma will notably and remarkably shape the way we handle Deep Learning Artificial Neural Networks, since the activation functions will EMANATE from the referenced AI-ML-Purified Data Set.


Thus, the Advanced Optimized Activation Functions (stochastic and deterministic) obtained from the referenced AI-ML-Purified Data that satisfy “Jameel’s ANNAF Stochastic and/or Deterministic Activation Functions Criterion” can be used to build extraordinary Deep Learning Artificial Neural Network Systems that may defeat Human Intelligence if properly trained.

Furthermore, this research REVEALED that the Advanced Activation Functions satisfying Jameel’s ANNAF Stochastic or Deterministic Criterion depend on the REFERENCED PURIFIED AI DATA SET, TIME CHANGE and AREA OF APPLICATION (acronym DTA), as shown in the figure below:

Figure 5: Optimized Activation Functions depend on AI DATA, TIME CHANGE & AREA OF APPLICATION. Source: The Author (2019)

This is against the traditional Trial-and-Error set of assumed Activation Functions, which are INDEPENDENT of the REFERENCED PURIFIED AI DATA SET, TIME CHANGE and AREA OF APPLICATION (DTA).


These IDEAS were SUMMARIZED in the following FIVE (5) YouTube videos:

(1) https://www.youtube.com/watch?v=nth3cJqgFts&t=5s

(2) https://www.youtube.com/watch?v=lcyR4TCOBFw

(3) https://www.youtube.com/watch?v=15NgJh71KRQ&t=3s

(4) https://www.youtube.com/watch?v=6emMNluHMZg

(5) https://www.youtube.com/watch?v=IlDTNWc7C-8

Declaration of Interest: The Author reports no conflict of Interest. The views expressed in this paper are those of the Author and not his current employer.

Acknowledgments

This research paper is a sequel to, and an extension of, my Ph.D. research findings at Ahmadu Bello University, Zaria, Nigeria. Firstly, I would like to thank the Federal Government of Nigeria, through the National Mathematical Centre, for releasing me for my Ph.D. study.

Also, I appreciate the Anonymous Reviewer(s) for their constructive criticism to improve the quality of this manuscript. My special gratitude and appreciation go to the “TABLECURVE 2D CURVE FITTING” software from “SYSTAT” and the “DERIVATIVE CALCULATOR”. These materials have tremendously and incredibly increased the speed of completion of this project work.

Finally, I thank my mum (Hajiya Hauwa Ahmad), my lovely wife Halima and my beautiful children Islam and Salman; they are the sources of my Creativity, Energy, and Aspiration.

References

TABLECURVE 2D SOFTWARE, SYSTAT (2019) available online: https://systatsoftware.com/products/tablecurve-2d/tablecurve-2d-curve-fitting/

Derivative Calculator (2019) available on: https://www.derivative-calculator.net/

Jamilu Auwalu Adamu (2019), Advanced Stochastic Optimization Algorithm for Deep Learning Artificial Neural Networks in Banking and Finance Industries, Risk and Financial Management Journal, Vol. 1, No. 1 (2019), available online: https://j.ideasspread.org/index.php/rfm/article/view/387

RESEARCH GATE DISCUSSION FORUM (2019) , available online: https://www.researchgate.net/post/Free_Software_for_Curve_fitting_or_best_fit_equation

Nair et al. (2010), Rectified Linear Units Improve Restricted Boltzmann Machines, ICML'10: Proceedings of the 27th International Conference on Machine Learning, pages 807-814, Haifa, Israel, June 21-24, 2010.

Djork-Arné Clevert, Thomas Unterthiner & Sepp Hochreiter (2016), Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), Published as a conference paper at ICLR 2016.

Klambauer et al. (2017),Self-Normalizing Neural Networks, Institute of Bioinformatics, Johannes Kepler University Linz, Austria.

Lichman, M. (2013), UCI Machine Learning Repository, URL: http://archive.ics.uci.edu/ml


Aman Dureja and Payal Pahwa (2019), Analysis of Non-Linear Activation Functions for Classification Tasks Using Convolutional Neural Networks, Recent Patents on Computer Science Journal, Volume 12 , Issue 3 , 2019, DOI : 10.2174/2213275911666181025143029

Chigozie Enyinna Nwankpa et al. (2018), Activation Functions: Comparison of Trends in Practice and Research for Deep Learning, available online: https://arxiv.org/pdf/1811.03378.pdf

Soufiane Hayou et al. (2019), On the Impact of the Activation Function on Deep Neural Networks Training, available online: https://arxiv.org/pdf/1902.06853.pdf

Schoenholz et al. (2017), Deep Neural Networks as Gaussian Processes, Published as a conference paper at ICLR 2018.


Casper Hansen (2019), “Better optimized neural network; choose the right activation function, and your neural network can perform vastly better”, available online: https://mlfromscratch.com/neural-networks-explained/#/

Artist Hans Hofmann, “The ability to simplify means to eliminate the unnecessary so that the necessary may speak.”, available online: https://www.brainyquote.com/quotes/hans_hofmann_107805

Barnaby Black et al. (2016), Complying with IFRS 9 Impairment Calculations for Retail Portfolios, Moody’s Analytics Risk Perspectives, The Convergence of Risk, Finance, and Accounting, Volume VII, June 2016.

Ben Steiner (2019), Model Risk Management for Deep Learning and Alpha Strategies, BNP Paribas Asset Management, Quant Summit 2019.

Bellotti T. and Crook J. (2012), Loss Given Default Models Incorporating Macroeconomic Variables for Credit Cards, International Journal of Forecasting, 28(1), 171-182, DOI: 10.1016/j.ijforecast.2010.08.005

Burton G. Malkiel (2009), The Clustering of Extreme Movements: Stock prices and the Weather, Princeton University, AtanuSaha, Alixpartners, Alex Grecu, Huron Consulting Group, CEPS working paper No. 186 February, 2009.


Daniel Porath (2006), Estimating Probabilities of Default for German Savings Banks and Credit Cooperatives, University of Applied Sciences, Mainz, Ander Bruchspitze 50, D – 55122 Mainz

David M. Rowe (2012), Simulating Default Probabilities in Stress Scenarios, Presented to the PRMIA Global Risk Conference, New York, NY, May 14, 2012.

David Rich (2019), Responding to the AI Challenge: Learning from Physical Industries, © 2019 The MathWorks, Inc.

Jamilu Auwalu Adamu (2018), Jameel’s Dimensional Stressed Default Probability Models are Indeed IFRS 9 Complaint Models, Journal of Economics and Management Sciences, Vol 1 No 2 (2018), pp: 104-114, 2018, DOI: https://doi.org/10.30560/jems.v1n2p102.


Jamilu Auwalu Adamu (2017), Jameel’s Criterion and Jameel’s Advanced Stressed Models: An Ideas that Lead to Non-Normal Stocks Brownian Motion Models, Noble International Journal of Business and Management Research, Vol. 01, No. 10, pp: 136-154, 2017, URL: http://napublisher.org/?ic=journals&id=2.

Jamilu Auwalu Adamu (2016), Reliable and Sophisticated Advanced Stressed Crises Compound Options Pricing Models, Management and Organizational Studies, Vol 3, No 1 (2016), pp: 39-55, 2016, DOI: https://doi.org/10.5430/mos.v3n1p39.


Jamilu Auwalu Adamu (2017), An Introduction of Jameel’s Advanced Stressed Economic and Financial Crises Models and to Dramatically Increasing Markets Confidence and Drastically Decreasing Markets Risks, International Journal of Social Science Studies , Vol 4, No 3 (2016), pp: 39-71, DOI: https://doi.org/10.11114/ijsss.v4i3.1326

Jamilu Auwalu Adamu (2015),Banking and Economic Advanced Stressed Probability of Default Models, Asian Journal of Management Sciences, 03(08), 2015, 10-18.

Jamilu A. Adamu (2015), Estimation of Probability of Default using Advanced Stressed Probability of Default Models, Ongoing Ph.D Thesis, Ahmadu Bello University (ABU), Zaria, Nigeria.

Joonho Lee et al. (2019), ProbAct: A Probabilistic Activation Function for Deep Neural Networks, Preprint. Under review.

Nassim N. Taleb (2011), The Future has Thicker Tails than the Past: Model Error as Branching Counterfactuals, presented in honor of Benoit Mandelbrot at his Scientific Memorial, Yale University, April 2011.

Nassim N. Taleb (2011), A Map and Simple Heuristic to Detect Fragility, Antifragility, and Model Error, First Version, 2011.

Nassim N. Taleb (2010), Why Did the Crisis of 2008 Happen?, Draft, 3rd Version, August 2010.

Nassim N. Taleb (2009), Errors, Robustness, and the Fourth Quadrant, New York University Polytechnic Institute and Universa Investments, United States, International Journal of Forecasting 25 (2009), 744-759.

Nassim N. Taleb (2010), Convexity, Robustness, and Model Error inside the “Black Swan Domain”, Draft Version, September 2010.

Nassim N. Taleb et al. (2009), Risk Externalities and Too Big to Fail, New York University Polytechnic Institute, 11201, New York, United States.

Nassim N. Taleb (2012), The Illusion of Thin Tails under Aggregation, NYU-Poly, January 2012.

Nassim N. Taleb (2007), Black Swans and the Domains of Statistics, The American Statistician, August 2007, Vol. 61, No. 3.

Mohit Goyal et al. (2019), Learning Activation Functions: A new paradigm for understanding Neural Networks, Proceedings of Machine Learning Research 101:1-18, 2019.

Mandelbrot, B. B. & Van Ness, J. W. (1968), Fractional Brownian Motions, Fractional Noises and Applications, SIAM Review, 10, 1968, 422-437.

Onali, E. & Ginesti, G. (2014), Pre-adoption Market Reaction to IFRS 9: A Cross-country Event-study, Journal of Accounting and Public Policy, 33(6), 628-637.

Peter Martey Addo et al. (2018), Credit Risk Analysis using Machine and Deep Learning Models, Risks Journal, Risks 2018, 6, 38; doi:10.3390/risks6020038.

Ram Ananth et al. (2019), Opening the “Black Box”: The Path to Deployment of AI Models in Banking, White Paper, DataRobot and REPLY AVANTAGE.


Reney D. Estember and Michael R. Marana (2016), Forecasting of Stock Prices using Brownian Motion - Monte Carlo Simulation, Proceedings of the 2016 International Conference on Industrial Engineering and Operations Management, Kuala Lumpur, Malaysia, March 8-10, 2016.

Sebastian Urban (2017), Neural Network Architectures and Activation Functions: A Gaussian Process Approach, Technical University of Munich, 2017.

Sebastian Raschka (2018), STAT 479: Machine Learning Lecture Notes, http://stat.wisc.edu/~sraschka/teaching/stat479fs2018/

Sven-Patrik Hallsjo, Machine Learning, Deep Learning, Experimental Particle Physics, University of Glasgow.

Steven R. Dunbar, Stochastic Processes and Advanced Mathematical Finance: The Definition of Brownian Motion and the Wiener Process, Department of Mathematics, 203 Avery Hall, University of Nebraska-Lincoln, Lincoln, NE 68588-0130.

Sreedhar T. Bharath et al. (2004), Forecasting Default with the KMV-Merton Model, University of Michigan, Ann Arbor, MI 48109.

Soufiane Hayou et al. (2019), On the Impact of the Activation Function on Deep Neural Networks Training, Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019.

Tidaruk Areerak (2014), Mathematical Model of Stock Prices via a Fractional Brownian Motion Model with Adaptive Parameters.

Ton Dieker (2004), Simulation of Fractional Brownian Motion, Thesis, University of Twente, Department of Mathematical Sciences, P.O. Box 217, 7500 AE Enschede, Netherlands.

Wenyu Zhang (2015), Introduction to Ito's Lemma, Lecture Note, Cornell University, Department of Statistical Sciences, May 6, 2015.

https://www.stoodnt.com/blog/scopes-of-machine-learning-and-artificial-intelligence-in-banking-financial-services-ml-ai-the-future-of-fintechs/

https://medium.com/datadriveninvestor/neural-networks-activation-functions-e371202b56ff

https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/

http://www.datastuff.tech/machine-learning/why-do-neural-networks-need-an-activation-function/

https://medium.com/the-theory-of-everything/understanding-activation-functions-in-neural-networks-9491262884e0

https://www.youthkiawaaz.com/2019/07/future-of-artificial-intelligence-in-banks/

https://news.efinancialcareers.com/uk-en/328299/ai-in-trading-buy-side

https://ai.stackexchange.com/questions/7609/is-nassim-taleb-right-about-ai-not-being-able-to-accurately-predict-certain-type/7610

https://towardsdatascience.com/adam-latest-trends-in-deep-learning-optimization-6be9a291375c

The U.S. Hearing (Technology Companies & Algorithms), YouTube: https://www.youtube.com/watch?v=vtw4e68CkwU

https://www.commerce.senate.gov/2019/6/optimizing-for-engagement-understanding-the-use-of-persuasive-technology-on-internet-platforms

https://ai.stackexchange.com/questions/7088/how-to-choose-an-activation-function

https://mlfromscratch.com/activation-functions-explained/#/

https://github.com/nadavo/mood

Backward propagation and Activation Functions:


https://www.youtube.com/watch?v=q555kfIFUCM

https://www.youtube.com/watch?v=-7scQpJT7uo

https://towardsdatascience.com/analyzing-different-types-of-activation-functions-in-neural-networks-which-one-to-prefer-e11649256209

http://vision.stanford.edu/teaching/cs231n-demos/linear-classify/

http://vision.stanford.edu/teaching/cs231n-demos/knn/

https://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.41357&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false

