Quantnet: Transferring Learning Across Trading Strategies
Total Page:16
File Type:pdf, Size:1020Kb
QuantNet: Transferring Learning Across Trading Strategies Adriano Koshiyama,∗ Stefano B. Blumberg, Nick Firoozye and Philip Treleaven University College London {adriano.koshiyama.15, stefano.blumberg.17, n.firoozye, p.treleaven}@ucl.ac.uk Sebastian Flennerhag University of Manchester [email protected] Abstract Systematic financial trading strategies account for over 80% of trade volume in equities and a large chunk of the foreign exchange market. In spite of the availability of data from multiple markets, current approaches in trading rely mainly on learning trading strategies per individual market. In this paper, we take a step towards developing fully end-to-end global trading strategies that leverage systematic trends to produce superior market-specific trading strategies. We introduce QuantNet: an architecture that learns market-agnostic trends and use these to learn superior market-specific trading strategies. Each market-specific model is composed of an encoder-decoder pair. The encoder transforms market-specific data into an abstract latent representation that is processed by a global model shared by all markets, while the decoder learns a market-specific trading strategy based on both local and global information from the market-specific encoder and the global model. QuantNet uses recent advances in transfer and meta-learning, where market-specific parameters are free to specialize on the problem at hand, whilst market-agnostic parameters are driven to capture signals from all markets. By integrating over idiosyncratic market data we can learn general transferable dynamics, avoiding the problem of overfitting to produce strategies with superior returns. We evaluate QuantNet on historical data across 3103 assets in 58 global equity markets. Against the top performing baseline, QuantNet yielded 51% higher Sharpe and 69% Calmar ratios. In addition we show the benefits of our approach over the non-transfer learning variant, with improvements of 15% and 41% in Sharpe and Calmar ratios. Code available in appendix. arXiv:2004.03445v2 [cs.LG] 30 Jun 2020 1 Introduction Systematic financial trading strategies account for over 80% of trade volume in equities, a large chunk of the foreign exchange market, and are responsible to risk manage approximately $500bn in assets under management [4, 53]. High-frequency trading firms and e-trading desks in investment banks use many trading strategies, ranging from simple moving-averages and rule-based systems to more recent machine learning-based models [49, 22, 71, 90, 59, 50, 46]. Despite the availability of data from multiple markets, current approaches in trading strategies rely mainly on strategies that treat the relationships between different markets separately [42, 12, 49, 22, 102, 59, 46]. By considering each market in isolation, they fail to capture inter-market dependencies ∗Corresponding author Preprint. Under review. Figure 1: QuantNet workflow: from market data to decoding/signal generation. [57, 78] like contagion effects and global macro-economic conditions that are crucial to accurately capturing market movements that allow us to develop robust market trading strategies. Furthermore, treating each market as an independent problem prevents effective use of machine learning since data scarcity will cause models to overfit before learning useful trading strategies [82, 47, 22, 59]. Commonly-used techniques in machine learning such as transfer learning [15, 74, 101, 13], and multi-task learning [15, 43, 13] could be used to handle information from multiple markets. However, combining these techniques is not immediately evident because these approaches often presume one task (market) as the main task and while others are auxiliary. When faced with several equally essential tasks, a key problem is how to assign weights to each market loss when constructing a multi- task objective. In our approach to end-to-end learning of global trading strategies, each market carries equal weight. This poses a challenge because (a) markets are non-homogeneous (e.g., size, trading days) and can cause interference during learning (e.g., errors from one market dominating others); (b) the learning problem grows in complexity with each market, necessitating larger models that often suffer from overfitting [27, 48] which is a notable problem in financial strategies [82, 47, 22, 59]. In this paper, we take a step towards overcoming these challenges and develop a full end-to-end learning system for global financial trading. We introduce QuantNet: an architecture that learns market-agnostic trends and uses them to learn superior market-specific trading strategies. Each market-specific model is composed of an encoder-decoder pair (Figure 1). The encoder transforms market-specific data into an abstract latent representation that is processed by a global model shared by all markets, while the decoder learns a trading strategy based on the processed latent code returned by the global model. QuantNet leverages recent insights from transfer and meta-learning that suggest market-specific model components benefit from having separate parameters while being constrained by conditioning on an abstract global representation [84]. Furthermore, by incorporating multiple losses into a single network, our approach increases network regularization and avoids overfitting, as in [62, 95, 13]. We evaluate QuantNet on historical data across 3103 assets in 58 global equity markets. Against the best performing baseline Cross-sectional momentum [54, 9], QuantNet yields 51% higher Sharpe and 69% Calmar ratios. Also, we show the benefits of our approach, which yields improvements of 15% and 41% in Sharpe and Calmar ratios, respectively, over the comparable non-transfer learning variant. Our key contributions are: (i) a novel architecture for transfer learning across financial trading strategies; (ii) a novel learning objective to facilitate end-to-end training of trading strategies; and (iii) demonstrate that QuantNet can achieve significant improvements across global markets with an end-to-end learning system. To the best of our knowledge, this is the first paper that studies transfer learning as a means of improving end-to-end large scale learning of trading strategies. 2 2 Related Work Trading Strategies with Machine Learning Machine learning-based trading strategies have pre- viously been explored in the setting of supervised learning [49, 2, 35, 85, 46] and reinforcement learning [76, 66, 40, 104], nowadays with a major emphasis on deep learning methods. Broadly, these works differ from our proposed method as they do not use inter-market knowledge transfer and (in the case of methods based on supervised learning) tend to forecast returns/prices rather than to generate end-to-end trading signals. We provide an in-depth treatment in Section 3.1. Transfer Learning Transfer learning is a well-established method in machine learning [15, 74, 101]. It has been used in computer vision, medicine, and natural language processing [14, 58, 79]. In financial systems, this paradigm has primarily been studied in the context of applying unstructured data, such as social media, to financial predictions [3, 50]. In a few occasions such methodologies have been applied to trading, usually combined with reinforcement learning and to a very limited pool of assets [55]. We provide a thorough review of the area in appendix D. The simplest and most common form of transfer learning pre-trains a model on a large dataset, hoping that the pre-trained model can be fine-tuned to a new task or domain at a later stage [23, 67]. While simple, this form of knowledge transfer assumes new tasks are similar to previous tasks, which can easily fail in financial trading where markets differ substantially. Our method instead relies on multi-task transfer learning [16, 8, 15, 83]. While this approach has previously been explored in a financial context [42, 12], prior works either use full parameter sharing or share all but the final layer of relatively simple models. In contrast, we introduce a novel architecture that relies on encoding market-specific data into representations that pass through a global bottleneck for knowledge transfer. We provide a detailed discussion in Section 3.2. 3 QuantNet We begin by reviewing end-to-end learning of financial trading in section 3.1 and relevant forms of transfer learning in section 3.2. We present our proposed architecture in section 3.3. 3.1 Preliminaries: Learning Trading Strategies A financial market M = (a1; : : : ; an) consists of a set of n assets aj; at each discrete time step t 1 n n we have access to a vector of excess returns rt = (rt ; : : : ; rt ) 2 R . The goal of a trading strategy f, parametrized by θ, is to map elements of a history Rm:t = (rt−m;:::; rt) into a set of trading 1 n n signals st = (st ; : : : ; st ) 2 R ; st = fθ(Rm:t). These signals constitute a market portfolio: a trader j j j would buy one unit of an asset if st = 1, sell one unit if st = −1, and close a position if st = 0; any value in between (−1; 1) implies that the trader is holding/shorting a fraction of an asset. The goal is to produce a sequence of signals that maximize risk-adjusted returns. The most common approach is n×n to model f as a moving average parametrized by weights θ = (W1;:::;Wm), Wi 2 R , where weights typically decreases exponentially or are hand-engineered and remain static [7, 4, 34]; t ! ma ma X n n st = τ(fθ (Rt−m:t)) = τ Wkrk ; τ : R ! [−1; 1] : (1) k=t−m More advanced models rely on recurrence to form an abstract representation of the history up to time t; either by using Kalman filtering, Hidden Markov Models (HMM), or Recurrent Neural Networks (RNNs) [102, 35]. For our purposes, having an abstract representation of the history will be crucial to facilitate effective knowledge transfer, as it captures each market’s dynamics, thereby allowing QuantNet to disentangle general and idiosyncratic patterns.