An Application of Deep Reinforcement Learning to Algorithmic Trading
Thibaut Théate(a,∗), Damien Ernst(a)
(a) Montefiore Institute, University of Liège (Allée de la découverte 10, 4000 Liège, Belgium)
∗Corresponding author. Email addresses: [email protected] (Thibaut Théate), [email protected] (Damien Ernst)
arXiv:2004.06627v3 [q-fin.TR] 9 Oct 2020

Abstract

This scientific research paper presents an innovative approach based on deep reinforcement learning (DRL) to solve the algorithmic trading problem of determining the optimal trading position at any point in time during a trading activity in the stock market. It proposes a novel DRL trading policy so as to maximise the resulting Sharpe ratio performance indicator on a broad range of stock markets. Named the Trading Deep Q-Network algorithm (TDQN), this new DRL approach is inspired by the popular DQN algorithm and significantly adapted to the specific algorithmic trading problem at hand. The training of the resulting reinforcement learning (RL) agent is entirely based on the generation of artificial trajectories from a limited set of stock market historical data. In order to objectively assess the performance of trading strategies, the research paper also proposes a novel, more rigorous performance assessment methodology. Following this new performance assessment approach, promising results are reported for the TDQN algorithm.

Keywords: Artificial intelligence, deep reinforcement learning, algorithmic trading, trading policy.

1. Introduction

For the past few years, the interest in artificial intelligence (AI) has grown at a very fast pace, with numerous research papers published every year. A key element of this growing interest is the impressive success of deep learning (DL) techniques, which are based on deep neural networks (DNN), mathematical models directly inspired by the human brain structure. These specific techniques are nowadays the state of the art in many applications such as speech recognition, image classification or natural language processing. In parallel to DL, another field of research has recently gained much more attention from the research community: deep reinforcement learning (DRL). This family of techniques is concerned with the learning process of an intelligent agent (i) interacting in a sequential manner with an unknown environment, (ii) aiming to maximise its cumulative rewards and (iii) using DL techniques to generalise the information acquired from the interaction with the environment. The many recent successes of DRL techniques highlight their ability to solve complex sequential decision-making problems.

Nowadays, an emerging industry which is growing extremely fast is the financial technology industry, generally referred to by the abbreviation FinTech. The objective of FinTech is simple: to take extensive advantage of technology in order to innovate and improve activities in finance. In the coming years, the FinTech industry is expected to revolutionise the way many decision-making problems related to the financial sector are addressed, including the problems related to trading, investment, risk management, portfolio management, fraud detection and financial advising, to cite a few. Such decision-making problems are extremely difficult to solve as they generally have a sequential nature and are highly stochastic, with a partially observable and potentially adversarial environment. In particular, algorithmic trading, which is a key sector of the FinTech industry, presents particularly interesting challenges. Also called quantitative trading, algorithmic trading is the methodology of trading using computers and a specific set of mathematical rules.

The main objective of this research paper is to answer the following question: how to design a novel trading policy (algorithm) based on AI techniques that could compete with the popular algorithmic trading strategies widely adopted in practice? To answer this question, this scientific article presents and analyses a novel DRL solution to tackle the algorithmic trading problem of determining the optimal trading position (long or short) at any point in time during a trading activity in the stock market. The algorithmic solution presented in this research paper is inspired by the popular Deep Q-Network (DQN) algorithm, which has been adapted to the particular sequential decision-making problem at hand. The research question to be answered is all the more relevant as the trading environment presents very different characteristics from those of the problems already successfully solved by DRL approaches, mainly significant stochasticity and extremely poor observability.

The scientific research paper is structured as follows. First of all, a brief review of the scientific literature around the algorithmic trading field and its main AI-based contributions is presented in Section 2. Afterwards, Section 3 introduces and rigorously formalises the particular algorithmic trading problem considered. Additionally, this section makes the link with the reinforcement learning (RL) approach. Then, Section 4 covers the complete design of the TDQN trading strategy based on DRL concepts. Subsequently, Section 5 proposes a novel methodology to objectively assess the performance of trading strategies. Section 6 is concerned with the presentation and discussion of the results achieved by the TDQN trading strategy. To end this research paper, Section 7 discusses interesting leads for future work and draws meaningful conclusions.

2. Literature review

To begin this brief literature review, two facts have to be emphasised. Firstly, it is important to be aware that many sound scientific works in the field of algorithmic trading are not publicly available. As explained in Li (2017), due to the huge amount of money at stake, private FinTech firms are very unlikely to make their latest research results public. Secondly, it should be acknowledged that making a fair comparison between trading strategies is a challenging task, due to the lack of a common, well-established framework to properly evaluate their performance. Instead, authors generally define their own framework, with their evident bias. Another major problem is related to the trading costs, which are variously defined or even omitted.

First of all, most of the works in algorithmic trading are techniques developed by mathematicians, economists and traders who do not exploit AI. Typical examples of classical trading strategies are the trend following and mean reversion strategies, which are covered in detail in Chan (2009), Chan (2013) and Narang (2009). Then, the majority of works applying machine learning (ML) techniques in the algorithmic trading field focus on forecasting. If the financial market evolution is known in advance with a reasonable level of confidence, the optimal trading decisions can easily be computed. Following this approach, DL techniques have already been investigated with good results, see e.g. Arévalo et al. (2016) introducing a trading strategy based on a DNN, and especially Bao et al. (2017) using wavelet transforms, stacked autoencoders and long short-

et al. (2017) who introduced the fuzzy recurrent deep neural network structure to obtain a technical-indicator-free trading system taking advantage of fuzzy learning to reduce the time series uncertainty. One can also mention Carapuço et al. (2018) who studied the application of the deep Q-learning algorithm for trading in foreign exchange markets. Finally, there exist a few interesting works studying the application of DRL techniques to algorithmic trading in specific markets, such as in the field of energy, see e.g. Boukas et al. (2020).

To conclude this short literature review, a sensitive problem in the scientific literature is the tendency to prioritise the communication of good results or findings, sometimes at the cost of a proper scientific approach with objective criticism. Going even further, Ioannidis (2005) even states that most published research findings in certain sensitive fields are probably false. Such a concern appears to be all the more relevant in the field of financial sciences, especially when the subject directly relates to trading activities. Indeed, Bailey et al. (2014) claim that many scientific publications in finance suffer from a lack of a proper scientific approach, instead getting closer to pseudo-mathematics and financial charlatanism than rigorous sciences. Aware of these concerning tendencies, the present research paper intends to deliver an unbiased scientific evaluation of the novel DRL algorithm proposed.

3. Algorithmic trading problem formalisation

In this section, the sequential decision-making algorithmic trading problem studied in this research paper is presented in detail. Moreover, a rigorous formalisation of this particular problem is performed. Additionally, the link with the RL formalism is highlighted.

3.1. Algorithmic trading

Algorithmic trading, also called quantitative trading, is a subfield of finance, which can be viewed as the approach of automatically making trading decisions based on a set of mathematical rules computed by a machine. This commonly accepted definition is adopted in this research paper, although other definitions exist in the literature. Indeed, several authors differentiate the trading decisions (quantitative trading) from the actual trading execution (algorithmic trading). For the sake of generality, algorithmic trading and