Interactive Qualifying Project

Predicting Stock Prices

Student: YIYI CHEN ​ JIUCHUAN WANG

Advisor: Professor HUMI MAYER ​ ​

Term: A,B-2017,C-2018 ​

Student Signature:

Advisor Signature:

Table of Contents Chapter 1: Abstract …………………………………………………………...3 Chapter 2: Executive summary ……………………………………………...4 Chapter 3: Introduction………………………………………………………..5 Chapter 4: Background ……………………………………………………….6 Chapter 5: Research………………………………………………………....14 Chapter 6: Methodology ...……….………………………………………….21 Chapter 7: Results………………………………..…………………………...35 Chapter 8: Discussions………………………………………………………86 Chapter 9: Recommendation………………………………………………..87 Chapter 10: Conclusion……………………………………………………...88 Chapter 11: Citation…………………………………………………………..89

1 1. Abstract

The purpose of this project was to develop models for the accurate prediction of future stock prices for short periods of time. To this end, we constructed four models which were able to make these predictions with a reasonable margin of error. Using our best model, we were able to make accurate predictions for 15 out of 20 stocks for 18 days or more. These models can be used by small investors as a tool for investment decisions.

2 2. Executive Summary

This Interactive Qualifying Project is based on a three-term research about predicting the price of traded on stocks exchanges in short-term. This tool can greatly help the users to invest efficiently with precise prediction of the trend of the stocks. Nowadays, a lot of people holds some extra storage of money that they might want to invest into the market for higher or steadier rate of return, yet not all of them have the chance to invest in such resources. This stock price predicting tool can actually helped those people to get a better rate of return for their investments.

During the first term of this project, the team built the model taking in the autocorrelated previous price of twelve stocks from different sectors of the US stock market and try to make predictions over a period of 18 days. This model, however, doesn’t really give the team satisfactory results. Although the volatility of the stocks were also take in at the end of this term, which was the done by the error band model, there were still stocks whose trend couldn’t be correctly predicted at all.

During the second term, the team tried out a new way to predict the price of stocks taking in the market influence. The error band(volatility) was smaller with this model, yet the predictions were still not 100% good. The team change the time period for the prediction with same stocks. The predictions for several stocks were lying out of the error band. The good day prediction rate, which means the predictions were accurate, was around 50%.

In the third term, the team finalized the mathematical model of the price of a stock and the value of the exchange upon it by doing research and tests. The team try to find the relation between volatility of a certain stock and the accuration of prediction of the stock and test the new model with 8 more stocks (that’s 20 stocks in total). The relation between them, however, wasn’t as successful as expected. The model shall be improved base on the current one in later research. Though the result of improvement on volatility wasn’t satisfying, the model is still good enough to help the investors in stock market to make decisions.

3 3. Introduction

The goal of this Interactive Qualifying Project is to produce a model which accurately predicts the future prices of stocks traded on American exchanges. This model, which was developed and tested over the three academic terms of WPI, is based on the daily closing price of a stock, and it was a real-valued time signal system. It predicts the future prices of stocks in the market and can allow investors to make concrete investment decisions upon it. The development of such a model can be significantly beneficial to new investors who don’t know much about the stock market and desire a method to make a fortune by trading stocks.

The reason for building this IQP and putting so much effort into it is simple: a lot of people are interested in the stock market but don’t know where to start and generate profit. There are a lot of applications and platforms which offer the service of predicting stock prices and helping make decisions. Many new investors, however, don’t know where they can find such service and even though they find it, it can cost them so much that they cannot afford it. This IQP sought to apply signal analysis techniques and try to minimize the influence of the volatility of a stock to build a reliable system that would support the small investors in the stock trading market to make profit.

Over the course of this project, the team researched and implemented a variety of signal analysis techniques, which were used in the creation of several models. These models were tested on a variety of stocks, ranging widely in both value and sector of American industry. The forecasted stock price values produced by each model were compared to the real prices of the stocks in order to determine the accuracy of the prediction. The models became more and more complex as the development of the project went on, and the team took more factors into account to improve its accuracy as much as possible.

4 4. Background 4.1. Grand View Stock represents one’s ownership of a company. Modern stock market could be dated back since 17th century in Dutch. The financial innovation was a invention of transferable government bond. The invention brought new relationship into financial market. Stock market contains millions of transactions every day. Stocks are transferred between either individual investors or large traders, including banks, insurance companies and funds. Stock has been among major four ownership investment; the other three include: business, real estate and precious objects. Literally, stocks certificate shareholders a portion of a company. The net profit from stock comes from the difference between starting stock price and closing stock price. Stock price reflects the relationship between demand for that company stock and supply amount that would sell to investors. In general, stock price is driven by demand in market. In modern stock market, stock exchanges get more complicated. Stock is no longer limited to exchange shares of company with monetary help. Investment in stocks could either win or lose. Social movements, government policies, laws and company’s own development can all affect the stock prices. These factors lead to much more risks for investment. To better predict the future stock price, precise analytical model is highly needed. In the last term of this project, in order to test the scope of our prediction models, we

selected another 4 stocks in each sector. Most of the stocks within additional selection

belong to the relatively small companies (e.g. Boston Properties). The market capital of

each of them is between 10 billion dollars and 20 billion dollars.

5 4.2. Reasons for sector selection 4.2.1. Technology Sector Technology is leading industry in American economy. Every year technology sector devotes large amount of GDP to the . This sector is quite stable and also increases steadily especially in last five years. This stock sector performs amazing, which is shown in the NASDAQ index.

4.2.2. Financial Sector Technology is leading industry in American economy. Every year technology sector devotes large amount of GDP to the United States. This sector is quite stable and also increases steadily especially in last five years. This stock sector performs amazing, which is shown in the NASDAQ index.

4.3. Company Background 4.3.1. Broadcom Limited Founded in 1961, with headquarters in Singapore and San Jose, California, AVGO produces analog and semiconductors. According to the survey in 2014, it is the ninth greatest semiconductor company. And after AVGO claimed to combine with Broadcom, its total revenue in 2016 is $13.24 billion and $5.94 for annual profit.(Cite Y1)

4.3.2. Facebook,Inc The second one is is found in 2014 and has now become the most famous and widely used all around the world. Facebook is one of the most influential social media, which has 2 billion monthly users in 2017. Also Facebook became became the fastest company in Standard & Poor’s 500 index with market capture of $250 billion in 2015.(Cite Y2)

6 4.3.3. Microsoft Corporation The third stock I chose is Microsoft Corporation. Microsoft’s is famous for its software products: Microsoft Windows, Microsoft Office suite and Edge web browser. Microsoft’s revenue is $88.95 billion and has become the most profitable software maker in 2016. (Cite Y3,Y4)

4.3.4. Nvidia Corporation Founded in 1993, Nvidia is based in Santa Clara, California.Its produces GPU, semiconductors, consumer electronics and chip units. Nvidia has 10,000 employees and it is in NASDAQ 100-component. Its revenue $6.91 billion in 2016. (Cite Y5)

4.3.5. Systems, Applications & Products in Data Processing

SAP is a German software and database enterprise to manage business

operations and customer relations with over 335,000 customers. Its

revenue, according to report in 2016, is $22 billion. And SAP is DAX

component (Cite Y6)

4.3.6. Texas Instrument Incorporated

Texas Instruments manufactures semiconductors, embedded processors,

and analog electronics. Based on its sale volume, Texas Instrument has

become top semiconductors company worldwide. It belongs to NASDAQ

100 component and its revenue is $13 billion in 2016.What’s more, Texas

Instrument has more than 43,000 patents. (Cite Y7)

7 4.3.7. Consultants to Government and Industries(CGI Group Inc.) ​ ​ Consultants to Government and Industries is a Canadian IT service and IT consulting firm.The CGI Group Inc. is listed in Toronto Stock Exchange and also is a constituent of S&P 60. The market capital is 15.804 billion. In 2012, CGI became the fifth largest independent business process and IT service provider in the world, and biggest tech firm in Canada. (Cite Y12)

4.3.8. Harris Corporation

Harris Corporation belongs to telecommunications equipment industry

and also is a constituent of S&P 500. The market capital is 18.168 billion.

(Cite Y13) Harris Corporation is headquartered in Melbourne, Florida and

its annual revenue reaches 7 billion dollars.

4.3.9. IPG Photonics Corp ​ ​ IPG Photonics is a internationally operated manufacturer in fiber lasers

and fiber amplifiers. IPG Photonics Corp is headquartered in

Massachusetts. In 2016, the revenue of IPG Photonics is 1 billion, and its

market capital is 12.462 billion. (Cite Y14)

4.3.10. NetApp Inc

NetApp Inc. is an American multinational storage and data management ​ ​ ​ ​ company. It has ranked in the Fortune 500 since 2012. NetApp offers ​ ​ software, systems and services to manage and store data. And NetApp

Inc. is both components of NASDAQ and S&P 500. (Cite Y15) The market

capital is 14.986 billion. (Cite Y16)

8 4.3.11. Berkshire Hathaway Inc.

Berkshire Hathaway Inc. is an American multinational conglomerate ​ ​ ​ holding company headquartered in Omaha, Nebraska, United States. The ​ ​ ​ company wholly owns GEICO, Dairy Queen, BNSF Railway, Lubrizol, ​ ​ ​ ​ ​ ​ ​ ​ Fruit of the Loom, Helzberg Diamonds, Long & Foster, FlightSafety ​ ​ ​ ​ ​ ​ International, Pampered Chef, and NetJets, and also owns 38.6% of Pilot ​ ​ ​ ​ ​ ​ Flying J; 26.7% of the Kraft Heinz Company, and significant minority ​ ​ ​ holdings in American Express (17.15%), The Coca-Cola Company ​ ​ ​ (9.4%), Wells Fargo (9.9%), IBM (6.9%), and Apple (2.5%). Since 2016, ​ ​ ​ ​ ​ ​ the company has acquired large holdings in the major US airline carriers, and is currently the largest shareholder in United Airlines and Delta Air ​ ​ ​ Lines, and a top three shareholder in Southwest Airlines and American ​ ​ ​ ​ Airlines. Berkshire Hathaway has averaged an annual growth in book ​ value of 19.0% to its shareholders since 1965 (compared to 9.7% from the S&P 500 with dividends included for the same period), while ​ ​ employing large amounts of capital, and minimal debt. (Cite J8)

4.3.12. Boston Properties, Inc.

Boston Properties, Inc., a self-administered and self-managed American

real estate investment trust (REIT), is one of the largest owners, ​ managers and developers of Class A office properties in the United

States, with a significant presence in five markets: Boston, Los Angeles,

New York, San Francisco and Washington, DC. The Company was

founded in 1970 by Mortimer B. Zuckerman and Edward H. Linde in ​ ​ ​ ​ Boston, Massachusetts, where it maintains its headquarters. Boston ​ Properties became a public company in June 1997 and is traded on the

New York Stock Exchange under the symbol “BXP.” (Cite J9)

9 4.3.13. Alexandria Real Estate Equities, Inc.

Alexandria Real Estate Equities is a major United States real estate ​ investment trust. Most of Alexandria's properties are offices and ​ laboratories, and it focuses on renting to science and technology

companies, especially in the life sciences. As of 2016, Alexandria had 20

million square feet of space located in 199 properties. Tenants include

Pfizer, Google, Eli Lilly and GlaxoSmithKline. It has properties in various ​ ​ ​ ​ ​ ​ ​ cities throughout the United States, including Boston, San Francisco, New

York City, San Diego, Seattle, and Research Triangle Park. As of 2016, ​ ​ 41% of rental revenue was from properties in the Boston Area, 19% was

from properties in the San Francisco area, and 17% was from properties

in the San Diego area. Alexandria tries to locate its properties around

universities, as opposed to more distant suburban locations. (Cite J10)

4.3.14. Nasdaq, Inc.

Nasdaq, Inc. is an American multinational financial services corporation ​ ​ ​ ​ that owns and operates (and is listed on) the NASDAQ stock market and ​ ​ eight European stock exchanges, namely Armenian Stock Exchange, ​ ​ Copenhagen Stock Exchange, Helsinki Stock Exchange, Iceland Stock ​ ​ ​ ​ Exchange, Riga Stock Exchange, Stockholm Stock Exchange, Tallinn ​ ​ ​ ​ ​ ​ Stock Exchange, and NASDAQ OMX Vilnius. It is headquartered in New ​ ​ ​ York City, and its president and chief executive officer is Adena Friedman. ​ ​ (Cite J11)

10 4.3.15. Bank of America Corp.

Bank of America Corp. is a huge company. Every U.S. citizen has heard

of the well-deserved reputation and massive market share of the

company. By December 31, 2016, the company held 10.73% of all bank ​ deposits in the United States, and it is one of the Big Four banks in the

United States, along with Citigroup, JPMorgan Chase and Wells

Fargo—its main competitors. It has a retail banking footprint that serves

approximately 46 million consumer and small business relationships at

4,600 banking centers and 15,900 automated teller machines (ATMs).

(Cite J1)

4.3.16. BlackRock, Inc. BlackRock, Inc. is not a very famous company, but it actually is the world’s largest asset manager with $5.7 trillion in assets under management as of July 2017. BlackRock operates globally with 70 offices in 30 countries and clients in 100 countries. Due to its power, BlackRock has been called the world’s largest shadow bank. (Cite J2)

4.3.17. Visa Inc.

Visa Inc is another huge company in the financial market. Almost every

single person in this world has heard of its name. It is an American ​ multinational financial services corporation. It facilitates electronic funds ​ transfers throughout the world, most commonly through Visa-branded

credit cards and debit cards.In 2015, the Nilson Report, a publication that

tracks the credit card industry, found that Visa’s global network (known as

VisaNet) processed 100 billion transactions with a total volume of US$6.8

trillion. (Cite J5)

11 4.3.18.

Morgan Stanley is a leading global financial services firm providing ​ ​ ​ , securities, wealth management and investment

management services. With offices in more than 42 countries and more than

55,000 employees, the firm's clients include corporations, governments,

institutions and individuals. Morgan Stanley, formed by J.P.Morgan & Co.

partners Henry Sturgis Morgan(grandson of J.P. Morgan), Harold Stanley

and others, came into existence on September 16, 1935, in response to the

Glass-Steagall Act that required the splitting of commercial and investment

banking businesses. (Cite J6)

4.3.19. HSBC Holdings plc. HSBC Holdings plc. is the world’s seventh largest bank by total assets and the largest in Europe with total assets of US$2.374 trillion (as of ​ December 2016).HSBC has around 4,000 offices in 70 countries and territories across Africa, Asia, Oceania, Europe, North America and South America, and around 37 million customers. As of 2014, it was the world's sixth-largest public company, according to a composite measure by Forbes magazine. (Cite J3)

4.3.20. JPMorgan Chase & Co. JPMorgan Chase & Co. is a U.S. multinational banking and financial ​ ​ ​ services holding company headquartered in . It is the largest ​ ​ ​ ​ bank in the United States, the world’s third largest bank by total assets, with ​ ​ total assets of US$2.5 trillion, and the world's second most valuable bank by ​ ​ market capitalization, after ICBC. It is a major provider of financial services, ​ ​ ​ ​ ​ and according to Forbes magazine is the world's sixth largest public company based upon a composite ranking. (Cite J4)

12

5. Research Before making predictions on stock prices, the team has to be clear about what factors have influence on future stock price and what the team did is utilizing historical data to analyze correlated factors. By analyzing three year historical data (with time range from September 2nd 2014 to September 1st 2017, the historical data included 12 stock prices and two stock market index: NASDAQ Composite and Dow Jones Industrial Average), the team gathered following information.

5.1. Autocorrelation Autocorrelation is a mathematical approach which examines similarity of data in terms of time lag. In our analysis, autocorrelation is applied to total number of historical data in previous three years (number of effective days is 757 total days). The correlation formula states the number of lags until there is no more correlation. Under this circumstance, autocorrelation formula figures out a new time period, which provides better influence on current data. And we would use this new period for future prediction model.

As the diagram below, this is a autocorrelation plot of Broadcom Limited(AVGO). According to the diagram, last 301 days have strong correlation with current prices.

13 Figure A-2

5.2. Least-Squares Linear Regression ​ Least-Square is widely used to analyze linear regression of data. One important application is within the data fitting. The least-squares minimize the sum of squared difference between fitted line value and observed historical values. In the data fitting, there are linear least square and nonlinear least square. In our project, we simply took the linear least square.

As the diagram below, this is a Least-Squares Linear Regression plot of ​ ​ Broadcom Limited(AVGO). According to the diagram, the Least-Squares Linear ​ ​ Regression of AVGO is y = 0.2*x + 78.

Figure A-4

14

5.3. Fourier Series Fourier series is a function describing infinite cosines and sines. The fourier function is the sum of sine and cosine with different amplitude and harmonically related frequencies. As we can see from the diagram above. The red curve is the fourier of Broadcom prices. In our project, fourier function measures the difference between fitting line and observed historical data. (Cite Y10,Y11)

15

5.4. Relative Strength Index(RSI) ​ Relative Strength Index is a momentum oscillator, with range from 0 to 100. The RSI index measures change and speed of price movement, whether stock is overbought or oversold. When the RSI index is above 70, the stock is overbought; when RSI is below 30, the stock is then oversold.(Cite Y17) Moreover, RSI index can also demonstrate trend in both bull market and bear market. In a bull market, it tends to be in 40 to 90 with support zone from 40 to 50. In contrast, in a bear market the RSI is between the 10 to 60 range with ​ the 50-60 zone acting as resistance.

The formula of RSI: RSI = 100 – [100 / ( 1 + (Average of Upward Price Change / Average of Downward Price Change ) ) ]

Relative Strength Index of Broadcom Limited(AVGO)

16

5.5. Volatility Volatility is a measure of how much a signal is changing, and can be measured by calculating the standard deviation of stock price data over the course of one autocorrelation period. There’s always a little difference between the model and the actual data. To handle this little difference, which is also considered as the violation of the stock price, the team implemented a method to calculate the biggest difference between the model and the actual data. Then place this on the prediction to form an “error band”, which indicates how much the actual future price is going to violate with the prediction. In the last term, the team improved this method to implement a better volatility model with calculating the mean of the difference between the model and the actual data. Several different approaches were implemented, including calculating with maximum and minimum price in one day, closing price and opening price in one day and closing prices of two adjacent days. The basic idea is as follow:

More detailed analysis will be shown in later part of this paper.

17

5.6. Correlation coefficient Generally speaking, stock market is influenced by many different factors. In order to check the relationship among these factors, association or dependence between two or more factors is essential. In statistics, correlation measures whether there is relationship between two random variables. The existence of correlation could be identified by drawing a diagram: if there is clear pattern of trend, correlation exists between two factors. However, we also need to keep in mind that correlation does not necessarily determine one value could be calculated by a function which takes another correlated value as parameter. Correlation just shows a relationship between two factors.

Correlation coefficient , mostly known as “"Pearson's correlation coefficient”, measures the statistical relationship between two values. (Cite Y9)

18

The correlation coefficient formula takes in expected value and standard deviation of variables X, Y. The interval of correlation coefficient is (-1,1). If coefficient is 1, the correlation between two variables is perfect direct (increasing) linear relationship; while if the coefficient is -1, the correlation of two variables turn to be perfectly decreasing (inverse) relationship. And for correlation coefficient equals 0, two variables are independent. The series of graphs below represents correlation in descending order. (Cite picture)

19

6. Methodology 6.1. Model 1:Linear Regression and Fourier Series Analysis model Take Broadcom Limited as example for our first model. First step in model one: Import date and closing price of three year historical data into MATLAB. After the input, we changed date into index and plot the historical data.

Figure A-1

Secondly, our group plot autocorrelation of all the values in historical data by built-in function.

20 According to the Figure A-2, latest 301 days have strong correlation with current prices.

The reason behind relative long correlation period is that AVGO is large company and its stock price is relative stable in the long run.

Figure A-2

After we find out the autocorrelation period of each stock, we want to apply it to predict the future prices. Autocorrelation shows time period which influences latest price. In order to achieve that, only using autocorrelation is not enough. We also utilizes the

FourierThree function. As we mentioned above, FourierThree is the fitted curve about difference between linear fitted curve and actual closing stock prices.

21 For Broadcom Limited, the fitted curve of difference of historical prices and fitted linear curve is shown in A-3. We loaded Broadcom Limited with new length, according to the autocorrelation.

Figure A-3 Generally, we use autocorrelation period plus 18 days(prediction period) as variable in the FourierThree function and then combine the FourierThree with fitted line of prices during the autocorrelation period. To visualize clearly about the 18 days prediction and the relationship between prediction and late stock prices during autocorrelation period, we used “hold on” function to plot the two curves in one single diagram. What’s more, we plot each 18 days predictions and its actual prices to compare difference.

22 To examine whether Linear Regression and Fourier Series Analysis is a good prediction model, we calculated the error band: subtract linear fitted line and fourier, taking 18 days as parameter, from real historical data. This resulted in a column of numbers. After that, chose the maximum among the column of numbers as error band. In order to improve the prediction model, we plot the upper bound limit and the lower bound limit of predicted prices to show the maximum and minimum prediction could be achieved by this model.

This error boundary is calculated based on the autocorrelation period of time, since the prices of stock during this period are positively related to the current price. Therefore the error range is the double maximum error of prediction.

As we can see from the figure A-5, there are two days out of the error band. That means

16 out of 18 days are within prediction.

23 6.2. Model2: Moving Average model Moving average is a normal method to get rid of the influence of the daily violation of the stock price. The curve of the price of a stock after modifying by moving average is smoother than the original one. This helps the investors to get a bigger view of the whole trend of stock price. Moving average help us to get a steadier model, which means we could receive a smaller error band with this method. The basic equation of moving average model is as follow:

If it’s a 10-day moving average model, then n would equal to 10. If it’s a 5-day moving average model, then n equal to 5.

To calculate moving average, we used tsmovavg, which is a built-in function. The tsmovavg takes-in correlation length of closing prices, 10 days as number of previous data points used with the current data point when calculating the moving average. (Cite MATLAB)

24

After calculating the moving average, the prediction step follows Model 1(Linear Regression and Fourier Series Analysis): ● Used correlation length to fit the moving average curve ● Error band is absolute value of maximum difference between actual closing prices and future prediction based on fourier trend. ● Plot future prediction in 18 days, with error band

25 6.3. Model 3: Volatile Model Volatility is a measure of how much a signal is changing, and can be measured by calculating the standard deviation of stock price data over the course of one autocorrelation period. There’s always a little difference between the model and the actual data. To handle this little difference, which is also considered as the violation of the stock price, the team implemented a method to calculate the biggest difference between the model and the actual data. Then place this on the prediction to form an “error band”, which indicates how much the actual future price is going to violate with the prediction. In the next term, the team will improve this method to implement a better volatility model with calculating the mean of the difference between the model and the actual data.

Figure J-1 The purple line stand for the upper boundary of the band, the lower one stand for the lower boundary of the band.

26 In term C, we added several extra methods to try to determine and quantify the influence of volatility on the prediction: a. Volatility 1: difference between closing price and opening price every day To compare volatility of different stocks, we make all volatilities in percentage. In Volatility 1, our team first found out absolute difference between opening price and closing price of each stock within correlation period. After that calculated the standard deviation of difference. Then divided the standard deviation by mean of difference. So far, we calculated the volatility of each stock and put them into an array called Vol1. We used corrcoef() function to calculate the correlation between GoodDay array (about good number in 44 days) and Vol1.

27 b. Volatility 2: difference between consecutive closing prices Volatility 2 measures each stock in difference between consecutive prices. We first found out an array, containing absolute value of consecutive closing prices. Then used standard deviation function to calculate the volatilities. Volatility value equals the standard deviation divided by average price. Vol2 contains volatility 2 in technology sector(before our team added four more stocks). Similarly, we measured the correlation between volatility 2 and good day numbers.

28 c. Volatility 3: difference between max and min prices every day According to correlation period data, there is every day’s max and min prices. Our team first calculated absolute difference between max and min prices, and then found the standard deviation of it. Volatility equals standard deviation divided by average closing price. Therefore, volatility 3 is in percentage now. Last, we calculated the correlation between volatility and good day numbers.

29 d. Volatility 4: difference between max and min in everyday average price Volatility was transformed from Volatility 3. In Volatility 3, difference between max and min was divided by average of closing prices. However, here in Volatility 4, we used everyday closing price, which was in arrays.

Hence, a for loop for each stock was necessary. The for loop returned each day’s ratio of absolute difference and closing price. We wanted to research whether measuring each days volatility is much more accurate.

30

Last, calculated correlation between volatility and good day number.

31 6.4. Model 4: Correlation model In term B, the team discovered that the price of a stock is actually somehow related to the trend of whole market. In order to analyse the relationship between the stock and the whole market, we need to implement a method called “correlation”. This method, which is actually a build-in function of the MATLAB, can calculate the relation ratio between two arrays, and the ratio stands for how much one array is affected by the other. Note that the size of the two arrays needs to be the same. It is a really useful method that helped the team find out the relation between the whole market price (e.g. the Dow Jones Index) and the price of a single stock. Then the team took this ratio into account and calculated the final “correlated-with-market” price of a certain stock.

First, the team calculated the ratio between the stock price and market price by their historical data as follow:

32 Note that the size of the two array should be the same. Then the team calculated the prediction by the ratio and their FourierThree models:

33 7. Results 7.1. Linear Regression and Fourier Series Analysis model 7.1.1. Technology Sector

Figure A-5 As diagram (A-4) shown, it is predicted that in following 18 days Broadcom Limited stock will increase up to less than 260. And diagram of Y-4 shows that 16 out of 18 days are within thee error band. Because of the patterns shown in diagram A-4, the prediction for Broadcom Limited in the 18 days performs well.

34

Figure F-1 The diagram F-1 shows that we predict that during the 18 days, the prices of Facebook will drop. And the diagram F-1 shows that we make a good guess, for there are actual prices falling both above and below the prediction line. During the first 14 days predictions are close to the actual prices; however, on day 15th, there is a big drop in the actual price. In all, the prediction is good.

35

Figure M-1

As the prediction line drawn in diagram M-1, there will be a like flat and little rise perhaps in the following 18 days. In later 7 days the actual prices decrease but remain in the error band. In sum, the prediction is very much accurate because it successfully has all closing prices within prediction boundary.

36

Figure NV-1

In NV-1, it is predicted that stock prices of NVDA will increase about 10 during following 18 days. And according to our figure NV-1, the actual prices of Nvidia stock price increase in 18 days and during that period, the actual prices are mainly above prediction line, except from 10th to 15th days, the disparity between prediction and actual prices is small. Moreover, all actual prices are within error band. Then, we can conclude that this prediction is feasible.

37

Figure S-1

According to our diagram S-1, it is predicted that the stock prices of SAP will decrease from 105 to 100. However, diagram S-1 shows that only 3 out of 18 days are within error band, while the others are all above maximum prediction boundary. And this increasing pattern contradicts our prediction of decrease.

38

Figure T-1 According to prediction in T-1, prices of TXN may experience small increase. And the diagram of T-1 shows much larger increase in actual prices. This prediction is feasible and accurate because 15 days out of 18 days are within error band.

7.1.2. Financial Sector

39

As shown in diagram above, the prediction of the trend of BAC stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. The actual price lays in the lower part of the error range so I think this prediction could work for this stock. The only issue is that the result would not be that precise.

40

Figure BLK-1 As shown in diagram above, the prediction of the trend of BLK stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. The actual price lays in the upper part of the error range so I think this prediction could work for this stock, yet the result would not be perfect.

41

Figure V-1 As shown in diagram above, the prediction of the trend of V stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. It seems that the way of prediction works for this one.

42

Figure MS-1 As shown in diagram above, the prediction of the trend of MS stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. It seems that the way of prediction works for this one.

43

Figure H-1 As shown in diagram above, the prediction of the trend of HSBC stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. The actual price lays in the lower part of the error range so I think this prediction could work for this stock. Although I’m afraid that the result is not be that satisfying, this method is still working for HSBC.

44

Figure J-2 As shown in diagram above, the prediction of the trend of JPM stock price is that it would go up, and it was the real stats operated almost like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. The actual price lays in the upper part of the error range and in the end it go out of the range. This method works for JPM but only works for a short period of time.

As we can see from the 12 graphs above, SAP stock did not perform well. Only 3 out of 18 is within error band. Therefore, we need some modifications to have more percentage of good days.

45 7.2. Moving Average model 7.2.1. 5-Day Moving Average 7.2.1.1. Technology Sector

Figure A-6

From Figure A-6, 12 out of 18 days are within error band, compared with model 1 prediction on AVGO, moving average in 5 days prediction is not that appropriate.

46

Figure F-2

There are two points out of Facebook prediction. Most of prices of Facebook are still within boundary. In comparison, the original prediction of Facebook is more precise than moving average prediction.

47

Figure M-2

The prediction of Microsoft stock price is accurate, since all points in actual price are within the error band of moving average model.

48

Figure NV-2

Fifteen out of eighteen days prices are between the error band, which means that this prediction is good.

49

Figure S-2

In this figure S-2, 15 out of 18 days prices are with prediction boundary, which is a huge improvement compared with result of model 1. In model 1, only 3 out of 18 days are good days, while in this model good days number increases by 66.7%

50

Figure T-2

Error band of Texas instrument prediction is 3.2219, and 12 out of 18 days are good days. The good day number is decreasing compared with model 1 prediction on Texas Instrument.

51

7.2.1.2. Financial Sector

As you can see, in the diagram above, the prediction of the trend of BAC stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. The error band is narrower comparing to the prediction made before.

52

As shown in diagram, the prediction of the trend of BLK stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. The actual price lays in the upper part of the error range so I think this prediction could work for this stock. The result is still not perfect,since the price went out of band after 18 days.

53

As shown in diagram above, the prediction of the trend of V stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. It seems that the way of prediction works for this one.

54

As shown in diagram above, the prediction of the trend of MS stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. Some of the actual price went out of the band, but it doesn’t matter as long as the grand prediction is good.

55

As shown in diagram above, the prediction of the trend of HSBC stock price is that it would go up, and it was the real stats operated just like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. The actual price lays in the lower part of the error range and it goes out of band after around 12 days. The good day ratio of this stock is about 60%, so it’s still an acceptable result.

56

As shown in diagram above, the prediction of the trend of JPM stock price is that it would go up, and it was the real stats operated almost like the prediction. There were some fluctuation but, as you can see, there are two error boundaries lying above and below the prediction of the price trend. The fluctuation are almost in that region. The actual price lays in the upper part of the error range and in the end it go out of the range. This method works for JPM but only works for a short period of time (about 10 days).

57 7.2.2. 10-Day Moving Average As we can see from the graphs, average good day number in 5 days moving average is more than that of 10 days moving average. Therefore, we can conclude that 5 days moving average prediction is more accurate than 10 days moving average. Figures of 12 stocks are listed below.

7.2.2.1. Technology Sector

Figure A-7

58

Figure NV-3

Figure F-3

59

Figure S-3

60

Figure T-3

61 7.2.2.2. Financial Sector

62

63

64 7.3. Correlation model 7.3.1. Technology Sector

Figure A-8

All actual prices are with the error band, and percentage of error band is 7.1% of starting price in prediction.

65

Figure F-4

19 out of 22 days are good days and as we can see from the Figure F-4, earlier predictions are all within error band and only last 3 days are not that accurate, which means the prediction model is targeting the actual prices in 19 days period, last 3 days are too far to predict.

66

Figure M-4

As we can see from Figure M-4, although the error band of Microsoft price is too broad for actual prices, prediction is still accurate that all prices are within prediction boundary.

67

Figure NV-4

From Figure NV-4, this prediction is good, and first 19 days float around the prediction line, while last three days drop drastically and yet still above the lower error band.

68

Figure S-4

Only 10 days out of 22 days are good day prediction. SAP stock price is hard to predict and moving average model is the best prediction on SAP.

69

Figure T-4

There are 20 days out of 22 days within error band and the two out of boundary closing prices are still close to what we predicted. This model for Texas Instrument is accurate.

70

7.3.2. Financial Sector

71

72

73

The overall result of this method was not so acceptable. As you can see, although the error band was narrower, the good day ratio also decreased a lot. In term C,we added 4 stocks in each sector. Our team made predictions based on the Correlation Model. Our team had two aims: ● Improve the prediction model ● Expand prediction scope: including companies with both relatively large market capital and small capital.

74

Figure GIB-1

The prediction prices of Consultants to Government and Industries is ranging from 50 to 55. According to the Figure GIB-1, all 44 days closing prices were within the error band, with 100% good day prediction. However, the error band of it is 7.8124, approximately 15% in percentage. This is visible according to the figure shown above.

75

Figure HRS-1

As we can see from Figure HRS-1, the prediction of Harris Corporation worked poor within 44 days period. Only 9 out of 44 days are within boundary. The rest of actual closing prices are far higher than what we predicted.

76

Figure IPGP-1

Prediction on IPG Photonics Corporation had 26 out of 44 days within boundary. The percentage of good day number is 59%, error band is approximate 10%. The first half of days are within prediction, while the rest are above highest prediction.

77

Figure NTAP-1

The prediction in NetApp Inc. is not bad. All 44 days are within prediction error band, even though all actual prediction prices are above the prediction line.

78

79

As we can see from prediction on four stocks added later: Consultants to Government and Industries,Harris Corporation, IPG Photonics Corporation, NetApp Inc, the prediction did not perform well enough as we anticipated before we implemented the Correlation Model. Partly because, these four stocks have one common attribute: their capital is between 10 and 20 billion dollars. These four stocks’ prices fluctuated more and currently grew faster than predicted.

80 7.4. Tables 7.4.1. Linear Regression and Fourier theories model

Table 1 Table 1 describes the Good day number and error band with their corresponding percentages of Linear Regression and Fourier Series Analysis model. For each stock good day percentage, SAP prediction is out of expectation. What’s more, the error band percentage is too large: Nvidia Corporation is larger than 10%. And the average of all 12 stocks good day number has a 4.03 standard deviation. We think the our data could be modified by using moving average. The average good day number for this model is 16.08 days and average of error band 5.10%.

81

7.4.2. Moving average model

Table 2

The moving average will take average of every 5 or 10 days, by doing so the new model may have less out of boundary values. Therefore, we expect to decrease the standard deviation. From the result of Table 2, we can see that SAP has much larger good days number, while MOrgan Stanley, HSBC and JPMorgan have small good day number in 10 days moving average model. In general, 5 days moving average enables each of stocks has equal to 9 or larger good day number, though the width of error band in 5 days moving average model is larger than that of 10 days moving average model in percentage. The average good day is 11.75 out of 18, average error band is 4.02% for 5 days moving average model. We think we could have more accurate model to improve prediction.

82 7.4.3. Correlation model

Table 3

Table 4

Both Table 3 and Table 4 use the same model to predict stock price: Correlation model. The two table use different historical data. In contrast, Table 3 is based upon historical data November 2014 to November 2017 and predicts 22 days in total. Bank of America, HSBC Holding, and JPMorgan have good day numbers each below 10 out of 22 days in total. And

83 error band of 4 stocks are above 5%.The average good day number in Table 3 is 15.42 out of 22. Table 4 is based upon historical data from September 2014 to September 2017 and makes prediction of 44 days. Good day number is 32.08 out of 44 in Table 4. And the error band percentage is smaller than before. The lowest good day number percentage is 38.6%, better than in Table 3, while the largest error band percentage is 21.5%.

7.4.4. Volatility model In order to determine the reason of fluctuation in our prediction results, the team decided to build models for volatility to find the relation between volatility of a stock price and the prediction of it.

Table 5

To get a better picture and analysis of our team’s prediction model, we implemented volatility and tried to find if there was correlation between good day numbers and corresponding volatilities.

84 As we have mentioned in Methodology, Volatility 1(shown as Vol1) measured changes in difference between everyday opening and closing prices. Volatility 2(shown as Vol2) measured changes of consecutive closing prices. Volatility 3(shown as Vol3) measured changes of everyday difference between max and min prices. All volatility is in percentage, because in that way we were able to compare stocks in the same level. The volatility method was a trial. And we can see from Table 5, there was no certain correlation between every volatility and good day numbers. When we look at volatility in sectors, volatilities are above 0.5 in technology sector, while Vol2 and Vol3 in finance sector are above 0.3 and above 0.5 in Vol1. In combination, correlation between good day numbers and volatilities in 12 stocks is approximate 0.4, which means we cannot associate good day number with volatility.

Table 6

In order to improve the volatility model in Table 6, Volatility 4 is implemented. Volatility 4 is shown as vol4, which measured the difference between max and min prices of each day. Unfortunately, there is still no strong correlation between good day numbers and volatility.

85

8. Discussion 8.1. Term A During the research in term A, the team discovered that the RSI(Relative Strength Index) wasn’t really helpful for predicting the price of a stock, since it only supplies a real-time indication of whether it’s going to go up or not. More importantly, RSI is not always accurate; thus, the team decided to implement other models instead of RSI.

Then, the team found that Linear Regression and Fourier Series are good ways to modelize the prediction of the price of a stock.

Overall, the FourierThree model was a pretty accurate base model used in Term A, although the error band of it was still pretty large and needs to be improved.

8.2. Term B In term B, the team decided to consider the “market influence” on stock prices. It was done by calculating the relation between the market index (e.g. Dow Jones Index) and price of a single stock. In order to do this, the team implemented a method called “correlation coefficient”.

This method helped the team to discover the relation between the market price and price of a single stock.

8.3. Term C In term C, the team researched the reasons behind the inaccurate prediction. One possible reason was volatility. To check if volatility could determine prediction accuracy, 4 different approaches to volatility calculation were implemented. What’s more, our team also calculated correlation between good day number and different volatility.

86

9. Recommendation As we can see from results of prediction analysis, some predictions are not accurate enough, especially in small capital stocks and some of the finance sector. Therefore, we would like to give several pieces of recommendation for future research: ● Apply different prediction models for large capital and small capital stocks.

● Take in more related market information, make models more market-sensitive

● Discuss with professional people, and expand background research

From our research results, in A and B term our model mainly focused on relatively large

capital stocks; all of them were above 85 billion dollars. In C term, we took in small

capital stocks less than 20 billion dollars. In comparison, the large capital is more than 4 ​ ​ times the size of the small ones. In addition, small stocks have faster growth rates, so

the trend should be more variable. Therefore, utilizing one single prediction model is not

appropriate. Our team would recommend dividing future prediction based on stock

capital, utilizing current prediction models for large capital stocks and more stable

stocks, and improving the prediction models to make them more market sensitive. To

achieve that goal, our team recommends that GDP, government market policy,

employment rate, and inflation rate all should be considered. Our recommendation

includes inquiry towards professionals: for instance, economics professors. Predictions

should not only be based on historical data, but also include diverse analysis of

background knowledge.

87

10. Conclusion Over the course of this project, the team implemented methodology of signal analysis techniques and historical data about the stock market to implement a fully-functioning financial tool to provide investors with financial insight for proper decisions. This system made informed decisions using input from historical stock price data, historical stock exchange index values, trading volume, research into company news presence, and the opinion of market analysts. Several rounds of testing with virtual investment portfolios yielded mixed results, with a general trend of the system producing profitable outcomes for short-term investments. Though the price of a stock is often considered to be a seemingly random signal, with countless factors determining its future outcomes, the team has developed a system which is able to provide positive outcomes for even the most inexperienced of investors. Eight stocks were utilized as test cases of the predicting system. Volatility of individual stock and different sectors were also considered, and results showed that there’s no strong correlation between good days prediction number and volatility.

88

11. Bibliography 11.1. Citations from books 11.1.1. Predict Market Swings With Technical Analysis (Wiley Trading) 1st Edition by Michael McDonald 11.1.2. Stock Market Probability: Using Statistics to Predict and Optimize Investment Outcomes, Revised Edition 2nd Edition by Joseph E. Murphy

11.2. Citations for Financial Sector 11.2.1. J1: Bank of America. (2017, September 21). Retrieved September 23, 2017, from https://en.wikipedia.org/wiki/Bank_of_America ​ 11.2.2. J2: BlackRock. (2017, September 21). Retrieved September 23, 2017, ​ from https://en.wikipedia.org/wiki/BlackRock ​ 11.2.3. J3: HSBC. (2017, September 23). Retrieved September 23, 2017, from https://en.wikipedia.org/wiki/HSBC 11.2.4. J4: JPMorgan Chase. (2017, September 23). Retrieved September 23, 2017, from https://en.wikipedia.org/wiki/JPMorgan_Chase ​ 11.2.5. J5: Visa Inc. (2017, September 24). Retrieved September 24, 2017, from https://en.wikipedia.org/wiki/Visa_Inc. ​ 11.2.6. J6: Morgan Stanley. (2017, September 23). Retrieved September 24, 2017, from https://en.wikipedia.org/wiki/Morgan_Stanley ​ 11.2.7. J7: 2015–16 stock market selloff. (2017, September 08). Retrieved September 24, 2017, from https://en.wikipedia.org/wiki/2015%E2%80%9316_stock_market_selloff 11.2.8. J8: Berkshire Hathaway. (2018, February 09). Retrieved February 11, 2018, from https://en.wikipedia.org/wiki/Berkshire_Hathaway 11.2.9. J9: Boston Properties. (2018, February 10). Retrieved February 11, 2018, from https://en.wikipedia.org/wiki/Boston_Properties 11.2.10. J10: Alexandria Real Estate Equities. (2018, February 10). Retrieved February 11, 2018, from https://en.wikipedia.org/wiki/Alexandria_Real_Estate_Equities

89 11.2.11. J11: Nasdaq, Inc. (2018, February 11). Retrieved February 11, 2018, from https://en.wikipedia.org/wiki/Nasdaq,_Inc. ​

11.3. Citations for Financial Sector 11.3.1. Y1: AVGO Key Statistics | Broadcom Limited - Ordinary Sha Stock. ​ (2017, September 24). Retrieved September 24, 2017, from

https://finance.yahoo.com/quote/AVGO/key-statistics?p=AVGO

11.3.2. Y2: Facebook. (2017, September 23). Retrieved September 24, 2017,

from https://en.wikipedia.org/wiki/Facebook ​ ​ 11.3.3. Y3: MSFT Key Statistics | Microsoft Corporation Stock. (2017,

September 22). Retrieved September 24, 2017, from

https://finance.yahoo.com/quote/MSFT/key-statistics?p=MSFT

11.3.4. Y4: Microsoft. (2017, September 24). Retrieved September 24, 2017,

from https://en.wikipedia.org/wiki/Microsoft ​ ​ 11.3.5. Y5: Nvidia. (2017, September 21). Retrieved September 24, 2017, from

https://en.wikipedia.org/wiki/Nvidia

11.3.6. Y6: SAP SE. (2017, September 21). Retrieved September 24, 2017, from

https://en.wikipedia.org/wiki/SAP_SE

11.3.7. Y7: Texas Instruments. (2017, September 21). Retrieved September 24,

2017, from https://en.wikipedia.org/wiki/Texas_Instruments ​ ​ 11.3.8. Y8: 2015–16 stock market selloff. (2017, September 08). Retrieved

September 24, 2017, from

https://en.wikipedia.org/wiki/2015%E2%80%9316_stock_market_selloff

11.3.9. Y9: Correlation and dependence. (2017, December 14). Retrieved

December 18, 2017, from

https://en.wikipedia.org/wiki/Correlation_and_dependence

11.3.10. Y10: http://mathworld.wolfram.com/FourierSeries.htm ​

90 11.3.11. Y11: https://en.wikipedia.org/wiki/Fourier_series l ​ ​ ​ 11.3.12. Y12: CGI Group. (2018, February 9). In Wikipedia, The Free ​ ​ Encyclopedia. Retrieved 17:32, February 10, 2018, from ​ https://en.wikipedia.org/w/index.php?title=CGI_Group&oldid=824734028

11.3.13. Y13: HRS : Summary for Harris Corporation. (2018, February 10). ​ Retrieved February 10, 2018, from

https://finance.yahoo.com/quote/HRS?p=HRS

11.3.14. Y14: IPGP Key Statistics | IPG Photonics Corporation Stock. (2018,

February 10). Retrieved February 10, 2018, from

https://finance.yahoo.com/quote/IPGP/key-statistics?p=IPGP

11.3.15. Y15: NetApp. (2018, February 9). In Wikipedia, The Free Encyclopedia. ​ ​ Retrieved 18:04, February 10, 2018, from

https://en.wikipedia.org/w/index.php?title=NetApp&oldid=824843173

11.3.16. Y16: NTAP : Summary for NetApp, Inc. (2018, February 10). Retrieved ​ February 10, 2018, from https://finance.yahoo.com/quote/NTAP?p=NTAP ​ 11.3.17. http://www.tubematter.com/wp-content/uploads/2015/04/Correlation-coeffi cient-examples.png (picture) ​ 11.3.18. Y17: Relative Strength Indicator (RSI). (n.d.). Retrieved February 10,

2018, from

https://www.fidelity.com/learning-center/trading-investing/technical-analysi

s/technical-indicator-guide/rsi ​

91