Using News Articles to Predict Stock Price Returns
Ricardo Herrmann, Rodrigo Togneri, Luciano Tozato, Wei Lin

Knowledge Sharing Article © 2017 Dell Inc. or its subsidiaries.

Table of Contents
Overview
Introduction
    Related Work
Deep CNN model for text classification
Empirical evaluation
    Data sources
        Stock prices
        News articles
        Pre-trained word embeddings
    Results
Conclusion
References

Disclaimer: The views, processes or methodologies published in this article are those of the authors. They do not necessarily reflect Dell EMC’s views, processes or methodologies.
2017 Dell EMC Proven Professional Knowledge Sharing

Overview

The stock market can be seen as a dynamic system in which stock prices are affected by the behavior of traders trying to buy or sell stocks based on the most recent information they can get. There is a bi-directional relationship between stock prices and news articles, with a variable lag between the effects in each direction, as news influences traders’ behavior and vice versa. The collective sentiment may push stock prices up (the so-called “Bull Effect”), even creating economic bubbles, or down (the “Bear Effect”).

Our proposal measures how much of the change in stock market prices we can predict with a simple model that combines structured data (stock price time series) with unstructured data (news articles), relying solely on automatically extracted text features to infer the effect of news on the prices of the stocks they refer to. To deal with unstructured textual sources of data, we rely on recent advances in Deep Learning applied to Natural Language Processing (NLP), with a particular focus on algorithms that use the Vector Space Model (VSM) of semantics, so that we can work with a uniform representation of words, phrases and documents as numeric vectors. In our implementation, we take a declarative approach to describing numeric computation that can be efficiently executed on GPUs, using the Keras library with either the TensorFlow or the Theano engine.

Introduction

In a stock market, stocks, which are contracts that represent fractional ownership of companies’ shared value, are traded between buyers and sellers. Stock prices fluctuate according to the dynamics of supply and demand, approaching an unknown, non-stationary equilibrium. Although the Efficient Market Hypothesis (EMH) [1] establishes that stocks are always traded at their fair values, price fluctuations are not completely random in practice.
To get better returns out of their financial assets, traders can exploit the field of Technical Analysis (TA), which comprises techniques for identifying patterns. In a simplified view, technical analysis uses past information about the market to forecast the direction of the market in the short-term future. Quantitative traders rely on this mostly numeric past information, deriving metrics from, among other information, stock prices and trading volume. Qualitative trading, on the other hand, is subjective: it relies on human judgment and takes into account a multitude of information sources. Qualitative traders, or fundamentalists, rely heavily on information from financial news.

Traditionally, the sources of information handled by computers are mostly structured, and thus computers play a much bigger role in quantitative trading. However, recent advances in Natural Language Processing (NLP) make it easier to process news sources and extract metrics from unstructured data, enabling computers to also empower traders in their qualitative analyses.

Sentence classification is a well-studied problem in NLP: given a set of pairs, each comprising a source text and the class to which it belongs, the task is to correctly classify previously unseen sentences or documents into a fixed set of classes. In Machine Learning [2] nomenclature, it is an instance of a supervised learning problem. With that in mind, our hypothesis is that financial news contains textual information that can be automatically extracted using NLP techniques and then used, in a supervised learning setting, to predict whether particular news articles will push stock prices up or down.

In our experiment, we first captured a dataset of financial news, along with their publishing dates and related stock symbols. We then fetched a dataset with historical price series of the corresponding stocks.
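As a rough illustration of how these two datasets might be aligned, the sketch below pairs each article with the direction of the next trading day's price move, using the three classes (negative, neutral, positive) adopted in this article. It is only a minimal sketch under stated assumptions: the record layouts, the names `label_return` and `build_examples`, the use of close-to-close returns and the 0.5% neutral band are all hypothetical choices made for illustration, not the authors' actual pipeline.

```python
from datetime import date

# Assumption: returns within +/-0.5% are treated as "neutral".
NEUTRAL_BAND = 0.005


def label_return(ret, band=NEUTRAL_BAND):
    """Map a next-day return to one of three classes."""
    if ret > band:
        return "positive"
    if ret < -band:
        return "negative"
    return "neutral"


def build_examples(news, prices, band=NEUTRAL_BAND):
    """Pair each article's text with the label of the following day's move.

    news:   list of (publish_date, symbol, text) tuples (hypothetical layout)
    prices: dict mapping symbol -> list of (date, close), sorted by date
    """
    examples = []
    for pub_date, symbol, text in news:
        series = prices.get(symbol, [])
        # Closing prices on or before the publishing date.
        before = [close for day, close in series if day <= pub_date]
        # Closing prices strictly after the publishing date.
        after = [close for day, close in series if day > pub_date]
        if not before or not after:
            continue  # no price on one side of the article, so skip it
        base, following = before[-1], after[0]
        examples.append((text, label_return((following - base) / base, band)))
    return examples


# Toy data, invented purely for this example.
prices = {
    "XYZ": [
        (date(2017, 1, 2), 100.0),
        (date(2017, 1, 3), 102.0),
        (date(2017, 1, 4), 101.9),
    ]
}
news = [
    (date(2017, 1, 2), "XYZ", "XYZ beats earnings estimates"),
    (date(2017, 1, 3), "XYZ", "XYZ holds annual meeting"),
    (date(2017, 1, 4), "XYZ", "XYZ article with no following price"),
]
examples = build_examples(news, prices)
```

With this toy data, the first article is labelled positive (a 2% rise), the second neutral (a move of about -0.1%), and the third is dropped because no later price exists. The width of the neutral band directly controls the class balance, so in practice it would need to be chosen with the data at hand.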
We combined the information sources at hand and created a dataset for training a learning algorithm to infer the movement of prices on the following day based on the text of news from the preceding day, treating the task as sentence classification into three classes related to the price movement: negative, neutral and positive. Our model relies on modern advances in NLP, namely word vectors, convolutional features, Rectified Linear Units (ReLUs) and a deep classification neural network.

Related Work

It has already been shown [3] that, within the same time period, the number of mentions of a company in the Financial Times is correlated with the transaction volume of that company’s stock, and that the absolute return is likewise correlated with the interest in the company in the news. That study also shows that there is no statistically significant correlation when the direction of the price is taken into account. This evidence, however, considers only the number of mentions and ignores the context in which companies are mentioned.

Regarding recent work on learning vector representations of words using neural networks, we can cite Bengio et al. [4] (which establishes a Vector Space Model (VSM) for learning to predict the next word), Collobert & Weston [5, 6] (which use multitask learning for language processing predictions with Convolutional Neural Networks (CNNs)), Mnih & Hinton [7] (a hierarchical distributed language model), Turian et al. [8] (which uses semi-supervised learning to combine different word representations), Mikolov et al. [9] (the word2vec model) and Pennington et al. [10] (GloVe word vectors). Extending the same line of work to representations of sentences, we can cite Yessenalina & Cardie [11] (which models each word as a matrix and combines words using iterated matrix multiplication), Grefenstette et al.
[12] (which establishes tensor-based compositional distributional semantics of words and functions), Le & Mikolov [13] (the doc2vec model) and Kim [14] (which uses CNNs for sentence classification).

Other related applications of NLP to the stock market include the work of Fehrer [15], which aims to predict the direction of stock movements following financial disclosures and shows how deep learning (more precisely, a recursive autoencoder [16]) can outperform the accuracy of random forests by 5.66%, using a dataset of 8,359 headlines. Since financial news articles are not produced in large quantities, Quid [17] uses a CNN architecture similar to ours to classify small datasets of text containing company descriptions into “good” or “bad” quality classes. Taking a different, though also text-based, approach to the problem, Ding et al. [18] use a deep neural network and information extraction methods to obtain action-actor-object-timestamp tuples using an existing dependency parser [19].

Deep CNN model for text classification

In machine learning, the input information is described to the computer as a set of features. There are many approaches to text classification but, as in other areas of machine learning, they can be divided into two main categories according to their use of features: feature engineering and representation learning. The former relies on feature extractors built by subject matter experts, which are usually highly specific to the task at hand. The latter incorporates into the learning model some generic representation of feature extractors and treats the values of their parameters as part of the numeric optimization problem called learning, so that the algorithm “learns” how to build good features. Both strategies aim at deriving semantic features, relevant to the task, from the