Learning Unfair Trading: a Market Manipulation Analysis from the Reinforcement Learning Perspective
Enrique Martínez-Miranda and Peter McBurney and Matthew J. Howard
Department of Informatics
King’s College London
fenrique.martinez miranda,peter.mcburney,[email protected]

arXiv:1511.00740v1 [q-fin.TR] 2 Nov 2015

Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Market manipulation is a strategy used by traders to alter the price of financial securities. One type of manipulation is based on the process of buying or selling assets by using several trading strategies; among them, spoofing is a popular strategy and is considered illegal by market regulators. Some promising tools have been developed to detect manipulation, but cases can still be found in the markets. In this paper we model spoofing and pinging trading, two strategies that differ in their legal background but share the same elemental concept of market manipulation. We use a reinforcement learning framework within the full and partial observability of Markov decision processes and analyse the underlying behaviour of the manipulators by finding the causes of what encourages the traders to perform fraudulent activities. This reveals procedures to counter the problem that may be helpful to market regulators, as our model predicts the activity of spoofers.

1 Introduction

Market microstructure is a branch of finance concerned with the analysis of the trading process arising from the exchange of assets under a given set of rules (O’Hara 1998). In double auction markets, this exchange of assets takes place when the buy and sell sides agree on the amount to pay/receive for the trade, but this agreement depends on the different strategies implemented by both sides. A trading strategy is itself a plan of actions designed to achieve profitable returns by buying or selling financial assets (Pardo 2008).

While trading strategies are meant to follow the well established rules of the markets, some traders prefer to misbehave and take advantage of others by manipulating the price of the assets being traded. For instance, some traders manipulate by spreading false information to other market participants or by taking actions that may affect the perceived price (Allen and Gale 1992), as in the strategy called pump and dump. Others, by contrast, prefer to take actions directly involved in the exchange of the assets by artificially inflating or deflating the price in order to obtain profits. Several manipulative strategies based on the trading process are well known in the financial argot and have names like ramping, wash trading, quote stuffing, layering and spoofing, among others. Spoofing is one of the most popular strategies; it uses non-bona fide orders to improve the price and is considered illegal by market regulators (Aktas 2013). A similar strategy used by high-frequency traders (HFTs) is called pinging, where HFTs place orders without the intention of execution, but to find liquidity not displayed in the order book (where all buy and sell orders are listed in double auction markets). Pinging has caused controversy as it can be viewed as a manipulative strategy (Scopino 2015).

Studies in the literature that analyse the problem of market manipulation have mainly focused on the development of methods for detection. However, there has been little analysis of the behaviour of market manipulators, an area that may reveal why these economic agents take such actions. Examining this behaviour might help market regulators to develop counter-measures that discourage or preclude fraudulent strategies.

We propose to model spoofing and pinging strategies in the context of portfolio growth maximisation, i.e., the expected capital appreciation over time of an investment account. We use a reinforcement learning agent that simulates the behaviour of the spoofing trader in the context of Markov decision processes (MDP), while a partially observable MDP (POMDP) is used to model the pinging trader, since the latter involves hidden state in the order book. We use a fixed environment where transitions and rewards do not change in time, but the agent has the option to transition between “two different” state representations that, combined, form the full state representation of the environment that simulates the manipulation process.

Our contribution is to show how these manipulative trading strategies can be modelled in a (PO)MDP framework and how this reveals the causes of market manipulation in terms of the incentives present in the market and the dynamics of how it operates. From this, we aim to examine two main questions: i) can spoofing and pinging, modelled by an MDP and a POMDP respectively, be optimal strategies when compared to honest behaviour while seeking growth maximisation? ii) If the manipulative strategies are optimal, which mechanisms can market regulators implement in order to discourage traders from such behaviour? The results yield recommendations to market regulators as to how to stop manipulative behaviour.

2 Related Work

Research on price manipulation has been done using several approaches. Some authors have developed analytical models to investigate manipulative strategies performed by large traders under the hypothesis of stochastic economies with finite/infinite horizon and time dependent price processes (Jarrow 1992). Others take a continuous-time economy with risky and risk free assets and different agents involved in a game where predatory trading (a trading style that takes advantage of other investors’ needs) leads to price overshooting and amplifies the selling cost and default risk of large traders (Brunnermeier and Pedersen 2005). Others consider the problem where manipulative uninformed traders can profit by selling a given firm’s stock, thus providing a starting point to restrict short selling (when traders sell a security not owned) (Goldstein and Guembel 2008).

Other researchers have focused on the application of data driven approaches with the aim of presenting empirical evidence of stock price manipulation under the assumption of the presence of arbitrageurs or information seekers acting rationally (Aggarwal and Wu 2006), or by finding unusual patterns of trading activities and systematic profitability based on market timing and liquidity performed by brokers in emerging markets (Khwaja and Mian 2005). An agency-based model is tested with empirical data where brokers manipulate the closing price to influence their customers’ perception of their performance (Hillion and Suominen 2004).

Also, behavioural stances have been mixed with theoretical and data driven approaches. An analytical framework is developed that describes trade-based manipulation as an intentional act to produce changes in the price and obtain a profit, so that one can clarify what does and does not constitute manipulation (Ledgerwood and Carpenter 2012). Evidence of trade-based manipulation and its effects on investor behaviour and market efficiency is provided, where the manipulator pretends to act as an informed trader in a way that may affect the reaction of other investors (Kong and Wang 2014).

Furthermore, discriminative models are intended to detect market manipulation based on empirical data. By using economic and statistical analysis it is possible to detect manipulation ex post, suggesting that the existence of a regulatory framework may be inefficient (Pirrong 2004). Machine learning techniques have also been applied for detection of manipulation. […] the implementation of supervised learning algorithms (Cao et al. 2014), or can be identified by modelling trading decisions as MDPs and using Apprenticeship Learning to learn the reward function (Yang et al. 2012).

Though research is extensive in the area of market manipulation, few studies develop generative models of what encourages these economic agents to follow the disruptive strategies. Furthermore, few of them provide recommendations to regulatory entities and/or firms (Rossi et al. 2015) to encourage traders to stop this harmful behaviour. Unlike the discriminative models that are intended to distinguish the manipulative behaviour from other strategies, we use the (PO)MDP approach to model spoofing/pinging as it predicts the behaviour of manipulators in terms of the market conditions, thus providing a tool that market regulators can use to counter the manipulative strategies.

3 Problem Formulation

3.1 Trading in a Bull Market

In this work we focus on modelling two trade-based market manipulation strategies as follows. Suppose there is a trader who manages an investment portfolio on behalf of a brokerage firm and has the objective of obtaining high trading profits that may produce portfolio growth in the short/medium term. Suppose the agent is trading in a futures market and the portfolio consists of two different contracts, $\alpha$ and $\beta$, with a market full of optimism so prices are rising (a situation known as a bull market). Mathematically, the capital of the investment account at a given market tick $t \in [0, T]$ (where a tick represents the execution of a new trade in the market, either from the trader or any other participant) can be written as

$$I_t = a_t + c_t, \tag{1}$$

where $a_t = a_t^{\alpha} + a_t^{\beta}$ is the capital associated with the market value of the contracts $\alpha$ and $\beta$, and $c_t$ is the cash to be used for future purchases of more contracts. The variable $a_t$ changes at every tick since the prices of the contracts are following a trend, while $c_t$ changes due to cash inflows/outflows (by the sale/purchase of contracts). The net profit of the investment over a tick window $[0, T]$ is

$$R = \sum_{t=0}^{T} G_t - \zeta_t, \tag{2}$$