Time series data analysis using probabilistic and neural network

Chiu-Che Tseng1

1Department of Computer Science and Information Systems Texas A&M University – Commerce, TX 75429, U.S.A.

Abstract We explored several ways to reduce the complexity of the investment decision deliberation that might cause the investors to lose money under urgent decision support system is situations, and, at the same time, to provide the always a popular topic in providing the human user highest quality investment recommendations possible. with an optimized decision recommendation when For portfolio management, Strong [16], Reilly operating under uncertainty in complex environments. and Norton [13] and Jones [9] brought several The particular focus of our discussion is to compare traditional portfolio management strategies. And different methods of artificial intelligence decision Sycara, et al. [17] focused on using distributed agents support systems in the investment domain – the goal to manage investment portfolios. Their system of investment decision-making is to select an optimal deployed a group of agents with different portfolio that satisfies the investor’s objective, or, in functionality and coordinated them under case-based other words, to maximize the investment returns situations. They modeled the user, task and situation under the constraints given by investors. In this study as different cases, so their system activated the we applied several artificial intelligence systems like distributed agents for information gathering, filtering Influence Diagram (a special type of Bayesian and processing based on the given case. Their network) and Neural Network to get experimental approach mainly focused on portfolio monitoring comparison analysis to facilitate users to intelligently issues and has no mechanism to deal with uncertainty. select the best portfolio. Our systems on the other hand react to the real-time market situation and gathers the relevant information 1. Introduction as needed. John, et al. [8] made extensive research on stock selections by applying induction rules into data The investment domain, like many other domains, is a mining applications. Other related research on dynamically changing, stochastic and unpredictable portfolio selection problems has received environment. Take the stock market as an example; considerable attention in both financial and statistics there are more than two thousand stocks available for literature; see Cover [3]and Cover, et al, [4]. a portfolio manager or individual investor to select. In the field of model refinement, there are several This posts a problem of filtering all those available approaches. The value of modeling was first stocks to find the ones that are worth investment. addressed by Watson and Brown [22] and Nickerson There are also vast amounts of information available and Boyd [11]. Chang and Fung [2] considered the that will affect the market to some degree. problem of dynamically refining and coarsening of For all these problems artificial intelligence the state variables in Bayesian networks. However, decision support systems are always the solutions. the value and cost of performing the operations were The decision support systems are to provide the not addressed. Control of reasoning and rational investor with the best decision support under time decision making under resource constraints, using constraints. For this purpose, we propose several analyses of the expected value of computation and artificial intelligence systems like using the Influence consideration of decisions on the use of alternative Diagram as the framework and Neural Network to strategies and allocations of effort, has been explored advice users build their own highly successful by Horvitz [6] and Russell and Wefald [14] Poh and investment portfolios. Horvitz [12] explored the concept of expected value The structure of the paper is as follows. In section of refinement and applied it to structural, conceptual 2, we introduced some related works on structures of and quantitative refinements. Their work concentrated investment decisions for portfolio managements. In on providing a computational method for the criteria section 3 and 4, we described the frameworks of to perform the refinement. However, their work did influence diagram, decision tree and neural networks. not address the need of a guided algorithm to perform In section 5, we specified our experimental settings. the node refinement throughout the network. In our In section 6, we showed our experimental results and previous work Tseng, et al, [20], we used guided explanations. And in section 7, we concluded our methods to perform the conceptual refinement for our paper.. models. We see significant performance improvement of the models after applying the refinement algorithm. 2. Related Work And Tseng [19,20] made extensive researches on applying influence diagram into portfolio selections and comparing our intelligence decision support system with other artificial intelligence systems, such characteristic is how to model the user’s preference as C5.0, and made conclusions that our system will into a set of values for the utility node. Different handle better than C5.0 in the dynamic environment. fields of research have suggested different approaches Starzyk, et al, [15] also brought Self-Organizing for this problem. Some suggest learning from the Learning Array system into economical and financial user’s behavior, while some suggest obtaining data applications by comparing our influence diagrams from a user survey and some simply query the expert decision support system. and assign subjective values. Besides there is few work on the comparison of The third characteristic to consider is the different intelligence decision support systems. In our availability of the knowledge about the structure, paper you will see how different systems work probabilistic knowledge for the prior and the differently. conditional probabilities. There are many variables in a specific problem domain and there might exist 3. Influence Diagram several concepts in the problem domain that are observationally equivalent, which means they are not An influence diagram is a special type of Bayesian distinguishable even with infinite data. To find out network (Figure 1), one that contains the decision which of those are relevant to the problem and the node and the utility node to provide a decision casual relationships among them present a challenge recommendation from the model. Influence diagrams to the knowledge engineer. There has been much are directed acyclic graphs with three types of research and many tools devoted to the learning of the nodes—chance nodes, decision nodes and utility model structure from the data. [4] For the probability nodes. Chance nodes, usually shown as ovals, distribution for the node, there are two methods to represent random variables in the environment. obtain the probabilities. First, the probability Decision nodes, usually shown as squares, represent distributions can be based on frequency by obtaining the choices available to the decision-maker. Utility the data from gathered statistics. The second method nodes, usually of diamond or flattened hexagon shape, is to obtain the probability distributions through represent the usefulness of the consequences of the knowledge acquisition sessions from the domain decisions measured on a numerical utility scale. The experts, who convey their subjective beliefs. In both arcs in the graph have different meanings based on cases, the probabilities can be refined through a their destinations. Dependency arcs are the arcs that feedback mechanism. Finally, the size, topology and point to utility or chance nodes representing connectivity of the model should also be considered. probability or functional dependence. Informational Applying good knowledge engineering techniques [7] arcs are the arcs that point to the decision nodes throughout the construction of the model will help implying that the pointing nodes will be known to the keep the network manageable. decision-maker before the decision is made. 4. Neural Network

A neural network is a powerful data modeling tool that is able to capture and represent complex input/output relationships. The motivation for the development of neural network technology stemmed from the desire to develop an artificial system that could perform "intelligent" tasks similar to those performed by the human brain. Neural networks Figure 1. A simple influence diagram resemble the human brain in the following two ways:

When using an influence diagram for decision 1. A neural network acquires knowledge through support problems, there are some fundamental learning. characteristics of the influence diagram that one must 2. A neural network's knowledge is stored within take into consideration. These characteristics inter-neuron connection strengths known as influence the data requirements and the choice of the synaptic weights. appropriate influence method. The first characteristic is the granularity of the values for each node. This The true power and advantage of neural networks characteristic affects the memory requirement for lies in their ability to represent both linear and non- storing the probabilities and the computational time linear relationships and in their ability to learn these required for updating the probabilities. The more relationships directly from the data being modeled. values within each node, the larger the memory The most common neural network model is the required and the longer it will take to propagate the multilayer perceptron (MLP). This type of neural probability update. The second characteristic is the network is known as a supervised network because it integration of the user’s preference into the utility requires a desired output in order to learn. The goal of node. This characteristic will affect the decision this type of network is to create a model that correctly outcome of the model. Given different preferences maps the input to the output using historical data so among users, the model might return a different that the model can then be used to produce the output decision recommendation. Another issue of this

when the desired output is unknown. A graphical collected from the Compustat database from the representation of an MLP is shown below in Figure 2. period of 1993 to 2002. There are total 500 testing data sets, which are the same categories as the training data except they are from the period of 2003. We conducted extensive experiments and found that the one with the best result was using the following parameters:  Number of hidden layers: 1  Input layer 1: Gaussian, 338 neurons  Output layer: Sigmoid  Learning rate: Exponential, 1.0

Figure 2. A MLP neural network model 6. Results In neural networks, an activation function is the function that describes the output behaviour of a The experiment results are summarized in table 1 neuron. The most common activation functions are shown the 2003 performance of all three artificial sigmoid and Gaussian functions. intelligence techniques that are used to select stocks 1 from S&P 500. The following three subsection sdescribes the detail result from each architecture.  Sigmoid function f (µi ) = 1+ e "µi ! 2 2 "µi !  Gaussian function f (µi ) = ce 6.1. Influence Diagram Model

On the 2003 test data, the influence diagram portfolio 5. Experiment Settings obtains its maximum average annual return performance when the portfolio contains 134 5.1. Influence Diagram Model companies out of the 500. The 134 companies produced an average one-year total return of 26.32%, Our influence diagram model of the investment while the average return of all the 500 companies was domain consists of a number of stocks for the 19.27%. investors to construct an investment portfolio. The goal is to maximize the profit from the investment 6.2 . Neural Network Model portfolio. The experiments are written in C++ and built on top of the Netica Belief network package, Unlike our first model, this portfolio obtains its running on a LINUX platform. maximum one year total return performance when the In the experiments we ran, we selected eight portfolio contains 298 companies out of the 500 financial ratio data from the S&P 500 companies as companies on the 2003 test data. This portfolio only the input factors to the system. The training data is produces an average one year total return of 12.649, collected from the Compustat database from the which is even lower than the average of the period of 1993 to 2002. To test the performance, we S&P500’s return, which is 19.268. The root mean used the date from the year 2003 and let the system square error is 0.5032, which is too high to get any made the decision recommendation on which of the reliable results. S&P 500 companies to be included in the investment It is true that the quality of the forecast is highly portfolio. dependent on the network parameters. But after carefully analyzed our data, I found that there exist 5.2. Neural Network Model some cross-sectional problems. The 500 companies are chosen from different industries fields and each Before a neural network can be trained with these has its own economic cycles. For example, all the data, they must be normalised, or called in pre- local “dot com” companies were in their recession processing phase. The following rule has been used cycles at the beginning of this century, while the for normalisation derived from Angstenberger (1996): Chinese companies were in their booming cycles, like 1. Multiplication of the data by the factor: 0.8 ; SOHU and SINA. Besides, the “size effect” [13] is max! min also a critical factor affecting results. The sizes of the 2. Addition of the offset 0.1 – (factor * min); companies under S&P500 index are quite different, In this way the data can be scaled to the range which make their returns different between large-size from 0.1 to 0.9. The trained data can also be companies and small-size companies. All these transformed back to the original scale, due to the problems can not be easily distinguished in the linearity of the formula. general neural network structures. The experiments of neural network model are built on top of the Brainmaker Professional network package, running on a WINDOWS platform. There 6.3.Self-Organizing Learning Array are total 4483 training data sets, which are eight (SOLAR) system financial ratio data from the S&P 500 companies

Starzyk, et al [15] compared our work with their Conference on Uncertainty in Artificial Intelligence, technique in (SOLAR) system: A 1990, pp.475-482. new Self-Organizing Learning Array (SOLAR) [3] Cover, T. M. “Universal portfolios” Mathematical system. SOLAR is capable of handling a wide variety Finance, vol. 1, 1991, pp. 1-29. [4] Cover, T. M. and D. H. Gluss “Empirical Bayes stock of classification problems. It has a regular array market portfolios”, Advances in Applied Mathematics, structure with sparsely interconnected computing vol. 7, 1986, pp. 170-181. elements and local learning rules. Computing [5] Friedman, N., “The Bayesian Structural EM elements choose their connections and define their Algorithm”, In the Fifteenth International Conference own functionality, while learning how to best extract on Machine Learning, 1998. information from their input data. [6] Horvitz, E. J. and M. Barry, “Display of Information SOLAR was constructed without expertise in this for Time-Critical Decision Making”, In the Eleventh field and has not been refined for this specific conference on Uncertainty in Artificial Intelligence, problem. 9 individual SOLAR networks were used in 1995, pp.296-305. [7] Howard, R. A., “Influence to Relevance to Knowledge. this case, each of which yielded 17% to 84% Influence Diagrams”, Belief Nets and Decision independently, which still shows higher volatility and Analysis, 1990, pp. 3-23. stochasticity. However, its high average yield made [8] John, G.H., Miller, P. and Kerber R., “Stock Selection itself worth further research on portfolio selections, Using Rule Induction”, IEEE Expert: Intelligent though, in Adult Income Classification case [15], Systems and Their Application, 1996, pp. 52-58. Bayes algorithms outperformed SOLAR slightly. [9] Jones, C.P., Investments: Analysis & Management. John Wiley & Sons Inc., New York, NY.,2003. Table 1. Result from three architectures. [10] Laskey, K. B. and S. M. Mahoney, “Network Average one year total fragments: Representing knowledge for constructing return probabilistic models”, In the Conference on Uncertainty in Artificial Intelligence, 1997, pp. 334- Influence Diagram 25.32% 341. Feed-forward Neural 12.649% [11] Nickerson, R. C. and D. W. Boyd, “The use and value Network of models in ”, Operation Research, SOLAR 30.1% vol. 28, 1980. [12] Poh, K. L. and E. Horvitz, “Reasoning about the Value of Decision Model Refinement: Methods and 7. Conclusion Application”, In the Ninth Conference on Uncertainty in Artificial Intelligence, 1993, pp. 174-182. We conducted some performance analysis with [13] Reilly, F.K. and Norton, E.A., Investments, Chapter 10. our systems and compared ours with other South-Western, Mason, OH., 2003. intelligence system. Our decision support system uses [14] Russell, S. J. and E. H. Wefald, “Principles of the influence diagram as the decision model; the metareasoning”, Artificial Intelligence, 49: 1991, pp. structural information of the influence diagram plays 361-395. [15] Starzyk, J. A., Zhen Zhu, H. He, and Zhineng Zhu, an important role on the performance of our system. “Self-Organizing Learning Array and Its Application We obtained the structural information from the to Economic and Financial Problems”, In The 3rd domain expert and the information represents what International Workshop on Computational Intelligence the expert’s opinion on the causal relationships in Economics and Finance, in Conjunction with the 7th among the nodes. From the experiment results, we Joint Conference on Information Science, 2003. can see that the influence diagram system works [16] Strong, R.A., Portfolio Construction, Management better than many other artificial intelligence systems and Protection. Thomson Learning, Chapter 15, 24, 25. in a more general situation. This is due to the Cincinnati, OH., 2000. background information given by the domain expert [17] Sycara, K. P. and K. Decker, Intelligent Agents in Portfolio Management. Agent Technology, Springer- when constructing the network. Verlag, 1997. Given the above analysis, we could conclude that [18] Trippi, R.R., Turban, E., Investment Management: by using an artificial intelligence system for portfolio Decision Support and Expert Systems. South-Western, selection has performance edge over the human Boston, MA., 1990. portfolio manager and the market. The systems we [19] Tseng C.C., “Comparing Artificial Intelligence selected for this study are only one among numerous Systems for Stock Portfolio Selection”, In The 9th artificial intelligence systems available. We would International Conference of Computing in Economics like to conduct further study to better qualify and and Finance, University of Washington, Seattle, quantify various artificial intelligence systems for use Washington, 2003. [20] Tseng, C.C., P. J. Gmytrasiewicz and C. Ching, in the portfolio selection domain. “Refining Influence Diagram For Stock Portfolio Selection”, In the Seventh International Conference of 8. References the Society for Computational Economics, 2001. [21] Watson, S. R. and R. V. Brown, “The valuation [1] Angstenberger, J. Prediction of the S&P 500 Index of decision analysis”, Journal of the Royal with Neural Networks. Neural Networks and Their Statistical Society, vol. 141, Part I, 1978, pp. 69- Applications. John Wiley & Sons Ltd. New York, NY., 78. 1996. [2] Chang, K. C. and R. Fung, “Refinement and coarsening of bayesian networks” In the Sixth