vol. 20 no. 2 2019 tom 20 nr 2 2019

2019 vol. 20 no. 2 2019

2019 Editor-in-Chief of AGH University of Science and Technology Press: Jan Sas

Managerial Economics Periodical Editorial Board

Editor-in-Chief: Henryk Gurgul (AGH University of Science and Technology in Krakow, )

Associate Editors: Mariusz Kudełko (AGH University of Science and Technology in Krakow, Poland) economics, management Stefan Schleicher (University of Graz, Austria) quantitative methods in economics and management

Managing Editor: Joanna Duda (AGH University of Science and Technology in Krakow, Poland)

Technical Editor: Łukasz Lach (AGH University of Science and Technology in Krakow, Poland)

Editorial Board: Stefan Bojnec (University of Primorska, Koper, Slovenia) Gerrit Brösel (University of Hagen, Germany) Karl Farmer (University of Graz, Austria) Piotr Górski (AGH University of Science and Technology in Krakow, Poland) Ingo Klein (University of Erlangen-Nuremberg, Germany) Jessica Hastenteufel (IUBH University of Applied Sciences, Saarland University, Germany) Ulrike Leopold-Wildburger (University of Graz, Austria) Manfred Jürgen Matschke (University of Greifswald, Germany) Xenia Matschke (University of Trier, Germany) Roland Mestel (University of Graz, Austria) Stefan Palan (University of Graz, Austria) Peter Steiner (University of Graz, Austria) Paweł Zając (AGH University of Science and Technology, Poland)

Articles published in the semiannual Managerial Economics have been peer reviewed by re- viewers appointed by the Editorial Board. The procedure for reviewing articles is described at the following web address: http://www.managerial.zarz.agh.edu.pl

Language Editor: Aeddan Shaw Statistical Editor: Anna Barańska Editorial support: Magdalena Grzech Cover and title page design: Zofia Łucka DOI: https://doi.org/10.7494/manage © Wydawnictwa AGH, Kraków 2019, ISSN 1898-1143 (paper version), ISSN 2353-3617 (on-line version)

Number of copies 40. The printed version of the journal is the primary one.

Wydawnictwa AGH (AGH University of Science and Technology Press) al. Mickiewicza 30, 30-059 Kraków tel. 12 617 32 28, 12 636 40 38 e-mail: [email protected]; http://www.wydawnictwa.agh.edu.pl CONTENTS

Sylvia Endres Review of stochastic differential equations in statistical arbitrage pairs trading...... 71

Jessica Hastenteufel, Mareike Staub Current and future challenges of family businesses...... 119

Katarzyna Kowalska-Jarnot Selected methods of studying a college’s image...... 133

Jonas Rende Pairs trading with the persistence-based decomposition model...... 151

Summaries ...... 181

Instruction for authors...... 183

Double blind peer review procedure...... 187

Creating English-language versions of publications – an assignment financed by the Ministry of Science and Higher Education from resources allocated to science-propagating activities according to contract 541/P-DUN/2018

69

Managerial Economics 2019, vol. 20, No. 2, pp. 71–118 https://doi.org/10.7494/manage.2019.20.2.71

Sylvia Endres*

Review of stochastic differential equations in statistical arbitrage pairs trading

1. Introduction

Since the seminal studies of Thiele (1880), Bachelier (1900), Einstein (1905) and von Smoluchowski (1906), the use of stochastic differential equations in science, engineering and economics has expanded rapidly (Bodo et al. 1987, Sharp 1990). More recently, in statistical arbitrage pairs trading, interest in ad- vanced time-series modeling with stochastic differential equations has grown strongly, mainly due to increased activity on financial markets, the steady growth of computing power, and immense amounts of data at higher frequencies. The statistical arbitrage pairs trading strategy was introduced by Gatev et al. (1999) and Gatev et al. (2006) and consists of two time periods – formation and trading. In the formation period, pairs of strongly related stocks are formed by methods of time-series analysis. In the trading period, these pairs are monitored to detect any potential divergence in their price movements. If sufficiently large divergence occurs, the undervalued asset is bought and the overvalued asset is sold short, betting on a subsequent convergence of the two assets. By construction, the pairs trading strategy typically relies on the mean-revert- ing tendency of spreads1 and its reliable quantification. For this purpose, stochastic differential equations are utilized to model spread characteristics and explain their potential mean-reverting behavior. Specifically, stochastic spread modeling focuses on the “two main issues in implementing a pairs trading strategy” (Focardi et al. 2016): In the formation period, pairs are selected based on estimated model parameters, e.g., strength of mean-reversion, and as such, matching of pairs is

* University of Erlangen–Nürnberg, Department of Statistics and Econometrics, Lange Gasse 20, 90403 Nürnberg, Germany, e-mail: [email protected]. 1 Methods to measure the spread include the difference of (log-)prices and the difference of (cumu- lated) (log-)returns of two stocks. Only few studies use other measures of relative mispricing, e.g., the price ratio of two assets.

71 Sylvia Endres

clearly improved, rendering model-free metrics like the Euclidean distance un- necessary. In the trading period, investment decisions are determined based on predictions obtained from the stochastic spread model, replacing traditional rules of thumb by optimized signals. The majority of studies focuses on the trading period – in other words, it is assumed that pairs have already been selected and the goal is to derive optimal trading decisions. Hereby, two approaches are distinguished: In the first approach, optimized trading signals are derived using analytic frameworks with closed-form solutions. The second approach applies dynamic programming techniques to solve portfolio optimization problems (see Krauss 2017). The present survey follows this structure and hence organizes the works along two dimensions, i.e., (i) the stochastic model used to explain the spread and (ii) the trading optimization ap- proach (analytic vs. dynamic programming) based on the stochastic spread model. In total, the literature on statistical arbitrage pairs trading with stochastic differential equations consists of more than 80 studies. Across the wide range of stochastic models and their underlying mathematics as well as the diversity of statistical arbitrage frameworks, there remains lack of a holistic look at the research field from all the different angles. As of today, no academic work has yet consolidated and organized the extensive available knowledge. In the present paper, we fill this void by surveying the multitude of available references. Hereby, our contribution to research is threefold. First, we provide a comprehensive literature review that systematically categorizes the large body of work into five main strands of stochastic spread models: the Ornstein–Uhlenbeck model, extended Ornstein–Uhlenbeck models, advanced mean-reverting diffusion models, diffusion models with a non-stationary component, and other models. Table 1 provides an overview of these models, their corresponding stochastic differential equations, and representative studies per category. For each model, the approaches to derive optimized trading decisions are discussed thoroughly in consistent terminology and notation in sections 2 –6. The corresponding tables 2 – 6 summarize the main aspects per section. Second, we discuss the major papers in detail and assess the relative strengths and weak- nesses according to model and approach. Hereby, we provide valuable insights into the current state of research with its key fields and main limitations. Third, we reveal directions for further research and promising future studies within each category. The remainder of this study is structured as follows. Section 2 covers the Ornstein–Uhlenbeck model, section 3 extended Ornstein–Uhlenbeck models, section 4 advanced mean- reverting diffusion models. Diffusion models with a non-stationary component are reviewed in section 5 and other models in sec- tion 6. Section 7 concludes and summarizes the main results.

72 Review of stochastic differential equations in statistical arbitrage pairs trading

, 0 ), i ≥ t θ }

t , { L A , , Lévy process Lévy , 0 ≥ t } t Bertram (2009) Tie et al. (2017) Tie Liu et al. (2017) Bertram (2010a) Kim et al. (2008) Elliott et al. (2005) Bai and Wu (2018) Leung and Li (2016) Gregory et al. (2010) Representative study Ngo and Pham (2016) , model parameters ( θ and { Z and 0 0 ≥ Stübinger and Endres (2018) ≥ t t Avonleghi and Davison (2017) Alsayed and McGroarty (2013) } } t t ˜ no no no no no no log yes yes yes yes yes yes Norm Vola stoch. stoch. stoch. stoch. const. const. const. const. const. const. const. const. const. no no yes yes yes yes yes yes yes yes yes yes yes MR , Brownian motions (BM) { W (BM) motions Brownian , t 0 ≥ t t } t t = i t Table 1 Table t t t , R ) dW t t dW t t t dW ) dW t t s t dW t dW i + t ) dt + σ dW dt t ∈ N), Ornstein–Uhlenbeck (OU) process { L t − µ )) dt + σ dW t , stock price { S price stock , X t 0 ( r ) dt + σ ≥ X t ) dt + σ dW ) dt + Σ dW t ) dt + σ dW ) dt + σ dL ) dt + σ√ X ) dt + σ ( t , X ) dt + σ X + t t t t t t t } t 2 ) dt + σ ( t , X t − X i − X t ˜ () ( µ i qm α . The table indicates whether the model is mean-reverting (MR), whether the volatility =- = 1, ..., r

i = θ ( µ − X = A ( B − X = θ ( µ − X = θ = θ ( µ − X = θ ( µ − X = θ ( µ − X = θ ( α t , Zt ) − X = θ ( L = − tanh( c ( X t t t t t t t t t t t Stochastic differential equation dX dX dX dSt = µ Stdt + σ StdW dX dX dX dX dX dX dXt = µ ( t , X dX dX , regimes and function t ≥ 0 } t ), c i σ

) (Vola) is constant or stochastic, and whether the underlying process is normally distributed (Norm) ) (Vola) is constant or stochastic, and whether the underlying , t Σ

, σ Model ( σ ( t , X ), i µ

, B , Stochastic volatility Inhom. geometric BM Non-stationary diffusion Geometric BM Other Doubly mean-reverting Ornstein–Uhlenbeck Extended Ornstein–Uhlenbeck Multivariate Regime-switching Lévy driven Two-factor General Itô diffusion Nonlinear mean-reverting Skew mean-reverting Advanced mean-reverting diffusion Cox-Ingersoll-Ross µ ( Markov process { R Overview of stochastic models for spread { X spread for models stochastic of Overview

73 Sylvia Endres

2. Ornstein–Uhlenbeck model

The Ornstein–Uhlenbeck (OU) model in pairs trading was proposed by El- liott et al. (2005). The authors laid the foundation2 for prediction and decision making based on this model (Zeng and Lee 2014). Nowadays, the majority of pairs trading studies use the OU model to explain the spread dynamics. For stock prices S1(t) and S2(t), the spread Xt is typically calculated by Xt = S1(t) − S2(t) or 3 Xt = ln S1(t) − ln S2(t) and follows

dXt = θ(µ − Xt)dt + σdWt (1) with mean-reversion level (i.e., long-term mean, equilibrium level) µ, mean- reversion rate or speed θ, volatility σ, and Brownian motion {Wt}t≥0. The mean- reversion level µ is a key element of the pairs trading strategy – the spread reverts to this level at rate θ and strategies usually take advantage of this behavior. The half-life h measures the time taken by the process to move halfway back to its equilibrium after divergence and is calculated by ln2 h = (2) q The conditional distribution of the OU process is

2  --qqtts 2  (|XXt 0 =+xx)(  mm--),ee()1   2q  and in the limit  s2  XN m,  (3)  2q 

Transforming the OU process (1) into a dimensionless system via

Zt = (Xt − µ)√2θ/σ (see, e.g., Zeng and Lee 2014) yields the dimensionless OU process

dZt = −Ztdt + √‾2dWt (4)

2 As of 2018-09-16, there are more than 285 citations on Google Scholar for Elliott et al. (2005). 3 According to Do et al. (2006), the spread should be defined by Xt = ln(S1(t)) ln(S2(t)) to remove the implicitly assumed restriction of “return parity”. Assume that, in one unit of time, stocks 1

and 2 both return r (r ∈ ). Then, ln(S1(t + 1)) − ln(S2(t + 1)) = (ln(S1(t))+ r) − (ln(S2(t))+ r) = = ln(S1(t)) − ln(S2(t)).

74 Review of stochastic differential equations in statistical arbitrage pairs trading

with long-term mean zero. The first-passage time density in the dimensionless system is explicitly known (see, e.g., Göncü and Akyildirim 2016a)

2 ce-t  ce22- t  ft()= exp - , t > 0 (5) 0,c -23t /2  -2t  p ()12- e  ()1 - e 

Hereby, f0,c is the density of the first-passage time

τ0,c = inf{t ≥ 0, Zt = 0|Z0 = c} which is the time until Zt reaches its mean-reversion level 0 when starting in c. In the original OU process (1), the level c corresponds to µ + c(σ/√2θ) (see Göncü and Akyildirim 2016a). Pairs trading decisions from the calibrated OU model can be optimized in the formation and trading period. In the formation period, the OU model parameters can be used for pairs selection. Favorable pairs are, e.g., (i) pairs with high mean- reversion speed θ or low half-life h (see equation 2) because they revert back to their equilibrium level fast4, (ii) pairs with high volatility σ or high equilibrium standard deviation σeq (see equation 3) because they create many trading oppor- tunities. In the trading period, entry and exit decisions are optimized based on the calibrated OU model. Hereby, the basis is a rule of thumb (‘two- sigma rule’) explained by Gatev et al. (2006) – pairs are opened when the spread deviates by more than two historical standard deviations st from its moving average µt, i.e., the spread crosses µt ± 2st, and closed when it reverts back to µt. The studies surveyed in this subsection modify or replace this rule of thumb by advanced sig- nals obtained from the OU model. We discuss the three approaches of literature to optimize trading decisions: analytic (subsection 2.1), dynamic programming (subsection 2.2), and other (subsection 2.3). Table 2 provides an overview of the relevant works which are discussed in the following part.

Table 2 Ornstein–Uhlenbeck model

Approach Study 2.1. Analytic approach 2.1.1. Baseline approach Elliott et al. (2005) Generalizations of the baseline Do et al. (2006), Triantafyllopoulos and Montana approach (2011) Further analytic results Rampertshammer (2007)

4 High mean-reversion speed or low half-life are equivalent metrics for pairs selection.

75 Sylvia Endres

Table 2 cont.

Approach Study Baronyan et al. (2010), Dunis et al. (2010), Bogomolov (2011), Kim (2011), Nobrega and Applications of the baseline Oliveira (2013), Diamond (2014), Fanelli and approach Lesca (2014), Nobrega and Oliveira (2014), Kang and Leung (2017), Yang et al. (2017), Blázquez and Román (2018), Psaradellis et al. (2018) Kanamura et al. (2010), Bogomolov (2013), Further approaches Temnov (2015) 2.1.2. Optimal trading thresholds Bertram (2010b) Empirical applications of Bertram Cummins (2010), Bucca and Cummins (2011), (2010b) Cummins and Bucca (2012) Further enhancements of Bertram Zeng and Lee (2014), Göncü and Akyildirim (2010b) (2016b), Baviera and Baldi (2018) 2.2. Dynamic programming 2.2.1. Optimal investment Boguslavsky and Boguslavskaya (2004), Jurek and allocation Yang (2007), Mudchanatongsuk et al. (2008) Extension for time-dependency Charalambous et al. (2015) Liu and Timmermann (2013), Tourin and Yan Continuous-time cointegration (2013) Figuerola-Ferretti et al. (2015), Angoshtari Extending continuous-time (2016), Li and Tourin (2016), Figuerola-Ferretti cointegration et al. (2017) 2.2.2. Optimal timing of trades Zhang and Zhang (2008) Ekström et al. (2011), Song and Zhang (2013), Incorporating stop-loss limits Lindberg (2014), Kuo et al. (2015), Leung and Li (2015), Li (2015), Leung and Li (2016) Further enhancements on optimal Dourban and Yedidsion (2015), Lei and Xu timing: Finite horizon, multiple (2015), Suzuki (2016), Kitapbayev and Leung regimes, cointegration, model (2017), Yoshikawa (2017), Kitapbayev and Leung uncertainty (2018), Suzuki (2018) 2.3. Other approaches Avellaneda and Lee (2010), Yeo and Papanicolaou Principal component analysis (2017), Burks et al. (2018) Relativistic statistical arbitrage Wissner-Gross and Freer (2010) Penalized likelihood approach Zhang et al. (2018)

76 Review of stochastic differential equations in statistical arbitrage pairs trading

2.1. Analytic approach

The key idea of the analytic approach is to obtain optimized trading thresh- olds using closed form solutions.

2.1.1. Baseline approach Elliott et al. (2005) were the first authors to apply the OU model to explain the spread Xt = S1(t) −S2(t). A pairs trade is opened when the dimensionless OU process Zt (see equation 4) crosses a threshold c > 0, i.e., reaches an extreme value. To determine the optimal exit timing, the authors suggest to use the first-passage time τ0,c, since the corresponding probability density function f0,c is ˜ explicitly known. The trade is exited at a fixed time t , where f0,c(t) has a maxi- mum value and thus Zt reaches its long-term mean 0 with greatest probability. For the original OU process (see equation 1), this means that a trade is entered 1  when Xt crosses µ ± c(σ/√2θ) and exited Tt= q times later – at the most likely time at which Xt reaches its mean-reversion level µ. However, the threshold c is not further specified by the authors. Further, Cummins and Bucca (2012) point out that the lengths of the trade cycles are uncertain and not directly considered in the strategy. Do et al. (2006) generalize the approach of Elliott et al. (2005) and model the spread Xt as the difference of returns instead of prices. They additionally include a loading matrix Γ and an exogenous vector Ut for modeling the equilibrium. The modeling framework is similar to Elliott et al. (2005) and the dynamics of Xt are governed by an OU process. A pairs trade is opened when the cumulated spread crosses a certain threshold and closed when it reaches the long run level of the spread. However, no explicit threshold for market entry is given. Triantafyllopou- los and Montana (2011) extend the state-space framework of Elliott et al. (2005) for time-dependent model parameters, thus gaining flexibility and being able to quickly adapt to changes in the data. Further analytic trading frameworks based on the OU model are discussed by Rampertshammer (2007). In subsequent years, various studies optimize the strategy with the calibrated OU model based on the framework of Elliott et al. (2005). In the formation pe- riod, Dunis et al. (2010), Kim (2011) and Blázquez and Román (2018) select pairs based on estimated OU model parameters, e.g., mean-reversion speed θ, instead of simple distance metrics. In the trading period, the traditional ‘two-sigma rule’ by Gatev et al. (2006) is optimized by using the OU model parameters µ and σ instead of moving average µt and historical standard deviation st. Baronyan et al. (2010), Bogomolov (2011), Diamond (2014), and Fanelli and Lesca (2014) open positions when the spread reaches an exteme value µ ± k1σ and exit them when the spread reverts back to µ ± k2σ, choosing various values for k1 and k2. Other

77 Sylvia Endres

trading rules based on OU model parameters are constructed in Nobrega and Oliveira (2013), Nobrega and Oliveira (2014), Kang and Leung (2017), Yang et al. (2017), and Psaradellis et al. (2018). Apparently, there exist various different ap- proaches for trading, which raise the question whether there is an optimal and completely model-driven rule. Further analytic approaches are provided by Kanamura et al. (2010), Bogomo- lov (2013), and Temnov (2015). Kanamura et al. (2010) introduce a profit model for spread trading, taking advantage of the explicit first-passage time density of the OU process. Bogomolov (2013) proposes a nonparametric pairs trading ap- proach and demonstrates its theoretical profitability for the OU process. Temnov (2015) develops a trading strategy based on an explicit formula for the running maximum of an OU process stopped at its maximum drawdown.

2.1.2. Focusing on the role of time – optimal trading thresholds Analytic formulas for optimal trading are finally provided by Bertram (2010b). The spread is modeled by the OU process (1) with mean-reversion level µ = 05.

A trade is entered at Xt = a and exited at Xt = m for a < m. A trade cycle is com- pleted when the spread has reverted back to the starting value a. The time the spread needs to undergo this cycle is the total trade length τ = τm,a + τa,m with mean E[τ], variance Var[τ], and τm,a = inf{t ≥ 0|Xt ≥ m, X0 = a}. An optimal strat- egy is derived by maximizing two different objective functions – the expected return per unit of time ra(,mc,) m(,am,)c = (6) E[]t and the expected Sharpe ratio per unit of time: r m(,am,)c - f E[]t Sa(,mc,,r ) = (7) f E[]t for return r(a, m, c) = m − a − c, transaction costs c, risk-free rate rf , and return variance

σ2(a, m, c) = r(a, m, c)2 Var[1/τ] = r(a, m, c)2 Var[τ]/E[τ]3 (8)

The distribution of the first-passage time is known (see equation 5) and thus E[τ] and Var[τ] can be calculated explicitly.

5 Following Cummins and Bucca (2012), the zero-mean assumption is no issue in practice since the analytic results can be translated easily to a non-zero mean.

78 Review of stochastic differential equations in statistical arbitrage pairs trading

As such, analytic formulas for µ(a, m, c) and S(a, m, c, rf ) can be derived q()ma--c m(,am,)c = (9)   maq   q  p Erfi - Erfi         s   s  and S(a, m, c, rf ) similarly. Erfi(x) denotes the imaginary error function. The optimal analytic trading thresholds a and m are received by maximizing these functions via differentiation. For the case of maximizing the expected return per unit of time, the following entry bound a is obtained cc2q a =- - - 4 13/ 42()cc33qq+-4422sq33cc45sq22+ 6 44s 13/ ()cc33qq+ 24 22s --+43cc45qs2236 qs44

4q The optimal trading bounds are found to be symmetric around the mean- reversion level of the process, i.e., m = −a. The approach by Bertram (2010b) has three advantages: First, the trading thresholds are optimal and model-driven – classic rules of thumb are rendered unnecessary. Second, closed-form solutions allow for analytic investigations and computationally efficient frameworks even in the high-frequency context (see Krauss 2017). Third, the approach allows for consistent cross-comparison, i.e., different strategies can be compared since their deterministic returns are all normalized by the expected trade cycle time (see Cummins and Bucca 2012). For future research, Bertram (2010b) suggests to apply the method to non-Gaussian processes, e.g., processes driven by Lévy noise, although it is not clear whether comparable analytic results exist. Bucca and Cummins (2011) suggest to consider regime shifts in the long-run mean level of the spread series. The surveyed studies in subsections 3.2 and 3.3 focus on the latter two issues. Zeng and Lee (2014) criticize that fixed trading thresholds are not applicable in the long-run and propose to investigate time-dependent thresholds. Empirical applications of the approach by Bertram (2010b) are found in Cummins (2010), Bucca and Cummins (2011), and Cummins and Bucca (2012). Cummins (2010) performs a comprehensive analysis on Irish stock exchange data. Bucca and Cummins (2011) conduct a model specification analysis on Brent and TD3 data. Cummins and Bucca (2012) provide a large application on oil based markets and control for data snooping bias. Further enhancements of the approach by Bertram (2010b) are provided by Zeng and Lee (2014), Göncü and Akyildirim (2016b), and Baviera and Baldi (2018).

79 Sylvia Endres

Zeng and Lee (2014) derive optimal analytic trading thresholds for an OU process, considering first-passage times over a two-sided symmetric boundary. Göncü and Akyildirim (2016b) derive optimal trading bounds for another objective function than Bertram (2010b) – the probability of successful termination of the strategy. Baviera and Baldi (2018) extend the optimal strategy for two key elements of high-frequency trading – stop-loss and leverage.

2.2. Dynamic programming Based on the OU model to explain the spread dynamics, another stream of literature focuses on dynamic programming techniques. The key idea of the dynamic programming approach is to solve an arbitrageur’s dynamic portfolio optimization problem based on stochastic control theory. An investor with given 6 preference specification can either trade a risky arbitrage opportunity Xt or allocate capital to a risk-free asset Mt. The arbitrage opportunity, i.e., the mean-reverting spread Xt, is modeled by a stochastic differential equation. 2.2.1. Optimal investment allocation A different angle to look at pairs trading is the problem of determining op- timal portfolio holdings over time. The most cited paper7 in this domain is Mud- chanatongsuk et al. (2008). The authors assume that an investor either allocates capital to a risk-free asset Mt with rate r or to a stock pair with prices S1(t) and

S2(t) and spread Xt, governed by the OU process. The portfolio weights of stocks

1 and 2 are denoted by h1(t) and h2(t) and it is required that h1(t) = −h2(t), i.e., the pairs trading portfolio is dollar-neutral. A negative weight represents a short position in the respective stock. The portfolio value Vt follows

 dS1()t dS2()t dMt  dVtt=+Vh 1()t ht2() +   St1() St2() Mt  The objective is to maximize the expected utility from this portfolio at the final timeT . For an investor with power utility, the optimization problem has the following form:

 1 g  sup EV T  ht1()  g 

6 1 γ Different utility functions for an investor with risk aversion γ are considered: Power utility u(x) = γ x , exponential utility u(x) = e−γx, logarithmic utility u(x) = ln(x), constant relative risk aversion (CRRA) utility, constant absolute risk aversion (CARA) utility, and Epstein–Zin utility. 7 As of 2018-09-16, there are 85 citations on Google Scholar for Mudchanatongsuk et al. (2008).

80 Review of stochastic differential equations in statistical arbitrage pairs trading

subject to: V(0) = v0, X(0) = x0

dXt = θ(µ − Xt)dt + σdWt (10)

 1 2   dVtt=-Vh 1()tX((qm tt))++srhs + rd td+ s W   2   with volatility η of the dynamics of stock 2 and correlation ρ. The stochastic con- trol problem (10) is solved using dynamic programming techniques – the optimal * weight h1(t) is obtained via the Hamilton-Jacobi-Bellman (HJB) equations as

* 1  qm()x - rh 1  ht1(,xt)(= ba)(+-2xt) ++ (11) 1 - g  s2 s 2  with functions α(t) and β(t). The work of Mudchanatongsuk et al. (2008) lays the foundation for pairs trading based on stochastic control (see Lintilhac and Tourin 2017). The major contribution are optimal closed-form solutions for the portfolio holdings. However, there are two downsides associated with this ap- proach, which particularly impact practical applications. First, transaction costs are not considered in the study. Second, the optimal strategy requires infinitesi- mal rebalance (see Suzuki 2016), i.e., positions have to be adjusted constantly according to equation (11). Jurek and Yang (2007) consider a similar setting as Mudchanatongsuk et al. (2008) and focus on an investor with constant relative risk aversion (CRRA) and Epstein–Zin utility. For CRRA utility and γ = 1, they derive the optimal portfolio allocation by

*  qm()xr- x  hV(,ttx) =- - V (12)  ss22 Boguslavsky and Boguslavskaya (2004) solve the optimal investment problem for an arbitrageur with power utility and a single risky asset that follows an OU process, i.e., the risk-free rate r is assumed to be zero. Charalambous et al. (2015) provide an extension of Mudchanatongsuk et al. (2008) for time-dependent model parameters. A common characteristic of the aforementioned studies is that they assume dollar-neutrality, i.e., the investor goes long one stock and short the other in equal dollar amounts. In recent years, some studies have dropped the dollar-neutrality assumption of classic pairs trading. By modeling two cointegrated assets in a contin- uous-time setting, the amounts invested in each position of the pairs trade are de- termined separately. The most cited paper8 in this domain is Tourin and Yan (2013).

8 As of 2018-09-16, there are more than 40 citations on Google Scholar for Tourin and Yan (2013).

81 Sylvia Endres

The authors model the prices S1(t) and S2(t) of two cointegrated stocks for t ∈ [0, T] by

2  s1  dSln 11()tz=-m + dsttdd+ 11Wt()  2  2  s2  dSln 22()td=-m  t + s22dW ()t (13)  2  with co-integrating vector zt:

zt = α + ln S1(t) + β ln S2(t)

They show that zt follows the OU process

 s2   s2  dz =-m 1 + dszd++dW ()tdbm- 2 + bs dW ()t) tt 1 22 tt11  2  22     =-qm()zdtttd+ s W

2 2 222 with θ = −δ, µ = −1/δ (µ1 − σ1 /2 + β(µ2 − σ2 /2)), and ss=+1 bs1 . The investor’s portfolio Vs consists of stocks 1 and 2 with weights h1(s) and h2(s) for s ∈ [t, T ] and a risk-free asset, for which the interest rate is set to 0. The optimal strategy is specified by the following stochastic control problem

sup[Eu()VT ] (14) (,hh12)

The authors use exponential utility u with risk aversion coefficientγ > 0 since their ansatz does not work for power utility. Via the HJB equations, the optimal * * solutions (h1, h2) for problem (14) are derived as:

md+ zzmd+ 1 ds2()2 + bs2 hs* 1 1 ()Tt 1 2 ()Tt2 11= 2 - d 2 -+ 2 - gs1 gs1 4 gs1

m md+ z 1 ds2()2 + bs2 hs* =-2 bd 1 ()Tt-+b 1 2 ()Tt- 2 22 2 2 2 gs2 gs1 4 gs1

As such, the optimal holdings at any time t depend on the stock prices and the wealth at that time. Subsequently, the authors present an extension that in- corporates correlations between the two stocks and derive the optimal control pair. The simplicity of the aforementioned formulas allows for a straightforward implementation of the optimal strategy, while the limitations of this approach are pointed out by Zhengqin (2014). First, the analytic solutions only hold if the

82 Review of stochastic differential equations in statistical arbitrage pairs trading

risk-free rate of return is zero. Second, the ansatz is not applicable for other types of utility functions. Third, in real world markets where transaction costs exist, it is not possible to adjust the stock positions constantly to take the optimal holdings. Further investigation of the optimal investment problem for cointegrated assets under power utility is found in Liu and Timmermann (2013). Li and Tourin (2016) extend the model by Tourin and Yan (2013) for time- varying volatility, replacing the constant volatility coefficients in equation (13) θ1x θ2x by σ1(t, x) = σ1e and σ2(t, x) = σ2e . Further, they use power utility instead of exponential utility. However for this framework, the authors are unable to derive a fully explicit solution for the stochastic control problem. Figuerola–Ferretti et al. (2015) and Figuerola–Ferretti et al. (2017) extend upon Liu and Timmer- mann (2013), maximizing the portfolio value without explicit specification of the investor’s preferences in an utility function. Instead, they identify a replicat- ing portfolio based on option valuation and connect it to the optimal strategy. Angoshtari (2016) rests upon the work of Liu and Timmermann (2013), deriving a theoretical justification for the practice of market-neutral pairs trading based on two cointegrated assets.

2.2.2. Optimal timing of trades Besides the problem of optimal investment allocation, another stream of studies places focus on the strategy’s optimal timing, determined with dynamic programming techniques. The baseline approach by Zhang and Zhang (2008) aims at maximizing a discounted reward function by sequentially buying and selling a mean-reverting asset Xt = ln St following the OU process. A position is bought at τ1, sold at σ1, bought again at τ2, etc., leading to the following sequence of stopping times:

0 ≤ τ1 ≤ σ1 ≤ τ2 ≤ σ2 ≤ ... . (15)

If the initial net positions is flat (i = 0), the decision sequence is

Λ0 = (τ1, σ1, τ2, σ2, ...), for an initial long position (i = 1), it is Λ1 = (σ1, τ2, σ2, τ3, ...). For discount factor ρ > 0 and slippage 0 < K < 1, the authors maximize the dis- counted reward function

Vxii()= sup(Jx,)Li (16) Li for  ∞  Jx(,L )(=-Ee --rsn SK11)(-+eSrt n K ) (17) 00 ∑ sn t n   n=1 

83 Sylvia Endres

and J1(x, Λ1) similarly. The authors show that a threshold pair (x1, x2), obtained by solving two quasi-algebraic equations, leads to the optimal stopping times. The low threshold corresponds to the buy point, the high threshold to the sell point. Zhang and Zhang (2008) lay the foundation for optimal timing strategies based on stochastic control, allowing infinitesimal sequential buying and selling subject to slippage cost. However, buying and selling of positions at exactly the same time (see equation 15) is not possible in real world markets. Moreover, the authors only allow for two regimes, i.e., buy and sell, and the approach could be extended for a third regime, i.e., short the position (see Ngo and Pham 2016). Integration of stop-loss thresholds into the optimal timing strategy can be found in various studies, i.e., Ekström et al. (2011), Song and Zhang (2013), Lindberg (2014), Kuo et al. (2015), Leung and Li (2015), Li (2015), and Leung and Li (2016). In recent years, some studies present further enhancements to optimal tim- ing strategies. Dourban and Yedidsion (2015), Kitapbayev and Leung (2017) and Kitapbayev and Leung (2018) study optimal stopping problems over finite horizons. However, they are not able to derive closed-form solutions. Suzuki (2016) and Suzuki (2018) focus on optimal switching over multiple regimes: no holding of stocks, long in the first stock and short in the second, and vice-versa. Lei and Xu (2015) study the optimal timing problem for two co-integrated assets whose co- integrating vector follows the OU process. Yoshikawa (2017) incorporate model uncertainty into the optimal boundary framework with the use of relative entropy.

2.3. Other approaches

This subsection summarizes other approaches based on the OU model with limited relation to the aforementioned references, covering principal component analysis, relativistic statistical arbitrage, and the penalized likelihood approach.

Principal component analysis. Avellaneda and Lee (2010) model the spread Xt, defined by

St1() St2() X t =-ln baln - t S1()00S2() as an OU process. Based on this model, they generate trading signals for mean- reverting portfolios with the aid of Principal Components Analysis (PCA) and exchange-traded funds. Motivated by this study, Yeo and Papanicolaou (2017) analyze the risk due to mis-estimation of mean-reversion based on Principal Com- ponents. Burks et al. (2018) use the framework of Avellaneda and Lee (2010) and analyze how systemic illiquidity affects mean-reverting trading strategies.

84 Review of stochastic differential equations in statistical arbitrage pairs trading

Relativistic statistical arbitrage. Wissner-Gross and Freer (2010) describe the cointegrating linear combination between two correlated assets by the OU process and develop a relativistic generalization of statistical arbitrage trading strategies. Penalized likelihood approach. Zhang et al. (2018) construct mean-reverting portfolios that follow an OU process via penalized likelihood estimation. Simul- taneously, portfolios with desirable characteristics, i.e., high mean-reversion and low variance, are formed. The nonconvex optimization problem is solved by an algorithm based on partial minimization. Interestingly, the approach allows si- multaneous portfolio selection and OU model calibration in one step. However, the study lacks specific trading rules. Despite the obvious advantages of the OU model – simplicity, analytic trac- tability and the ability to explain the important mean-reverting property – it has several deficiencies and is not able to completely describe the reality of the spread process (see, e.g., Avonleghi and Davison 2017). The first downside is the constant volatility assumption. According to Pilipovic (2007) and Avonleghi and Davison (2017), eventful market news sometimes influence prices strongly and lead to increased volatility. The second downside is the Gaussian nature of the OU process. According to various studies (see, e.g., Bertram 2010b), financial data displays non-Gaussian behavior. The third downside is the constant equilibrium level of the OU process. In practice, a stochastic or at least time-dependent mean level is more realistic (see, e.g., Liu et al. 2017). Some of these disadvantages are compensated by extensions of the classic OU process, i.e., multivariate, Lévy driven, and regime-switching OU models.

3. Extended Ornstein–Uhlenbeck models

In recent years, extended Ornstein–Uhlenbeck models have been used to account for stylized facts of financial return series, e.g., correlations, fat tails, and regime-switches. The relevant works are summarized in Table 3.

3.1. Multivariate Ornstein–Uhlenbeck model The assumption that there exists only one mean-reverting asset is very restrictive since a robust spread construction methodology could be used for various related spreads (see Jurek and Yang 2007). Combining multiple assets assures greater diversification (Jurek and Yang 2007, Kim et al. 2008) and allows to exploit co-moving patterns (Liu et al. 2017) and common interactions (Endres and Stübinger 2019b) between spreads. In this respect, multivariate OU models are used to construct portfolios of multiple correlated spreads. We consider this framework under the multivariate pairs trading umbrella.

85 Sylvia Endres

Table 3 Extended Ornstein–Uhlenbeck models

Model Study 3.1. Multivariate Ornstein–Uhlenbeck model 3.1.1. Analytic approach Optimal trading thresholds: Decisions Rampertshammer (2007) based on first-passage times 3.1.2. Dynamic programming Optimal investment allocation: Optimal Kim et al. (2008), Chiu and Wong (2011), portfolio holdings, continuous-time Chiu and Wong (2015), Lintilhac and cointegration, enhancements for time- Tourin (2017), Yamamoto and Hibiki consistency and robustness (2017), Chiu and Wong (2018) 3.2. Lévy driven Ornstein–Uhlenbeck model 3.2.1. Analytic approach Model-driven decisions: Optimizing for- Stübinger and Endres (2018) mation and trading period Optimal trading thresholds: Decisions Göncü and Akyildirim (2016a), Endres based on first-passage time results and Stübinger (2019b) 3.2.2. Dynamic programming Optimal timing of trades: Optimal liquida- Larsson et al. (2013) tion, stop-loss thresholds 3.3. Regime-switching Ornstein–Uhlenbeck model 3.3.1. Analytic approach Optimal trading thresholds: Maximum Bai and Wu (2018) expected return per unit of time 3.3.2. Dynamic programming Optimal investment allocation: Optimal Altay et al. (2017) weights, logarithmic utility 3.3.3. Other approaches Model-based trading rules: Two-state Yang et al. (2016) model, different trading strategies Regime classification algorithm: Flexible Endres and Stübinger (2019a) number of regimes

86 Review of stochastic differential equations in statistical arbitrage pairs trading

The multivariate spread Xt = (X1, ..., XN ) consists of N correlated spreads and follows

dXt = A(B − Xt)dt + ΣdWt (18)

N×N N with A ∈ R , B ∈ R , vector {Wt}t≥0 of independent univariate standard Brown- ian motions, and positive definite covariance matrix

 ss11 1N    S =      ssNN1 N 

Table 3 provides an overview of the relevant works applying a multivariate OU model to construct portfolios of multiple correlated spreads. As in the previ- ous section, we first discuss the analytic approach before dynamic programming techniques are presented.

3.1.1. Optimal trading thresholds – analytic approach Rampertshammer (2007) provides analytic results on the OU model and cor- responding first-passage times in a multivariate framework. The author models the spreads by equation (18) and chooses B ≡ 0, so the process reverts around 0 9. Trades are opened when spreads diverge sufficiently large from their historical equilibrium, i.e., cross thresholds a or m, a < m, and closed when their equilib- rium is re-established, i.e., at time τ = inf{t ≥ 0|Xt = 0}. To determine the optimal thresholds a and m, a first-passage time problem needs to be solved. However, the author points out that it is not possible to solve the problem for the multivariate process analytically and suggests to use numerical methods. The only way would be to assume independence between the different spreads, but then the problem reduces to the modeling of separate univariate OU processes. Based on the results of Rampertshammer (2007), research should further investigate the problem of determining explicit optimal trading thresholds in the multivariate framework. Moreover, the extent of correlations among spreads in real world data should be analyzed to identify the practical relevance of multivariate modeling compared to the much simpler univariate approach.

3.1.2. Optimal investment allocation – dynamic programming Kim et al. (2008) consider a dynamic trading strategy that allocates capital over a portfolio Vt of multiple correlated spreads Xt = (X1, ..., XN ) and a risk-free asset.

9 The zero-mean assumption is no issue since the results could also be translated to a non-zero mean.

87 Sylvia Endres

The number of units of the spreads at time t is denoted by h(t) = (h1(t), ..., hN (t)).

Assuming Xt follows the multivariate OU model (18), the problem of determin- ing optimal portfolio holdings over a finite horizon for CRRA and CARA utilities is investigated:

(19) sup(Eu VXTt)(tx),==Vt() vt  h

The authors show that problem (19) can be expressed in terms of ordinary differential equations that need to be solved numerically. Kim et al. (2008) pro- vide the first study investigating the stochastic control problem in a multivariate framework. They contribute to existing literature by providing computationally tractable solutions for the optimal portfolio holdings. However, transaction costs are not included in their problem formulation. For futher research, Kim et al. (2008) raise an interesting question – it should be evaluated whether the benefit from diversification across various spreads compensates losses due to unfavorable shocks that destabilize complete multivariate portfolios. Further investigation of optimal pairs trading with multiple pairs based on a mathematical programming approach can be found in Yamamoto and Hibiki (2017). Chiu and Wong (2011) focus on optimal multivariate investment allocation over cointegrated assets, providing the first theoretical work in that context. They consider one risk-free asset and n stocks, from which N (1 ≤ N ≤ n) cointegrating vectors zt = {z1,t, ..., zN,t} are constructed:

n ztcSln ,,jN1 ..., jt,,=+abjj+=∑ ij it i=1 where Si,t is the price of asset i at time t for i = 1, ..., n. The cointegrating vectors zt follow the multivariate OU process of equation (18). The optimal portfolio weights are determined by solving the following problem:

min[Var VT ] h (20) subjectto EV[]TT= v for portfolio weights h, terminal wealth VT, and expected final wealthv T. The au- thors provide closed-form solutions of problem (20). The approach by Chiu and Wong (2011) generalizes classic pairs trading in two respects. First, the cointegrat- ing vector consists of n risky assets instead of two. Second, the dollar-neutrality assumption, i.e., the assumption that the investor trades equal amounts in the long and short positions, is dropped since the coefficients cij are determined

88 Review of stochastic differential equations in statistical arbitrage pairs trading

separately for each stock. Further research should provide an empirical appli- cation with multiple spreads based on the multivariate theoretical framework. Lintilhac and Tourin (2017) consider a similar framework as Chiu and Wong (2011). Main difference lies in the objective function: Instead of minimizing the variance of terminal wealth as in (20), they maximize the expected utility from terminal wealth for an investor with exponential utility. Analytic results for the optimal portfolio weights are provided. Chiu and Wong (2015) expand upon Chiu and Wong (2011) and investigate time-consistent portfolio selection with cointegration, providing an optimal time- consistent asset allocation policy. Chiu and Wong (2018) study robust dynamic pairs trading with cointegration subject to parameter estimation errors. To summarize the subsection on multivariate OU models, it should be pointed out that literature in this context is sparse due to analytic tractability issues (see Jurek and Yang 2007)10. Given the high practical relevance of multivariate model- ing (see Meucci 2009), future research should aim at providing further analytic results in that context.

3.2. Lévy driven Ornstein–Uhlenbeck model The normality assumption of the classic OU process is an obvious deficit since stock prices and returns exhibit fat tails and jumps (see, e.g., Cont 2001, Bertram 2009, Göncü and Akyildirim 2016a, Stübinger and Endres 2018). Replacing the

Brownian motion {Wt}t≥0 by a Lévy-process {Lt}t≥0 leads a more flexible process:

dXt = θ(µ − Xt)dt + σdLt. (21)

The model (21) includes as special cases (i) the classic OU process driven by

Brownian motion, and (ii) a jump-diffusion model, in case the Lévy process {Lt}t≥0 consists of Brownian diffusion and jumps caused by a Poisson process {Nt}t≥0 with intensity λ ≥ 0. Different variants of jump-diffusion models are applied in literature, e.g.,

dXt = θ(µ(t) − Xt)dt + σdWt + ln JtdNt (22)

1 0 There are some studies (see, e.g., D’Aspremont 2011 and Cartea and Jaimungal 2016) that construct

one mean-reverting portfolio Pt consisting of multiple stocks S1, ..., Sn. The mispricing in this case n is of the form with coefficients x , i = 1, ..., n. These works are not covered by Pxti= Sti () i ∑ i=1 our definition of a multivariate framework since the underlying OU process for portfolio Pt is still univariate.

89 Sylvia Endres

with a random variable Jt modeling the jump sizes or

dXt = −θXtdt + σdWt + dCt (23)

Nt with compound Poisson process CJtk= and a sequence of independent k=1 ∞ ∑ random variables {}Jkk=1 with common symmetric density ϕ. Table 3 summarizes the relevant works applying a Lévy driven OU model to capture the spread dynamics. To the present day, there exist only four academic studies in this context.

3.2.1. Analytic approach Model-driven decisions. Stübinger and Endres (2018) develop a high-fre- quency pairs trading framework based on a mean-reverting jump-diffusion model.

The spread Xt follows the jump-diffusion model (22) with varying jump intensity λ(t) such that

0if the observation is intraday l()t =  l otherwise (overnight, weekend)

The authors verify the existence of overnight jumps in their empirical data set and therefore consider the jump component ln Jt dNt in their model. In the formation period, pairs selection is optimized by choosing pairs with highest mean-reversion rate θ and highest jump intensity λ. A large mean-reversion rate ensures fast convergence to the equilibrium level µ(t), where pairs trading profits are taken. A high jump intensity creates sudden, large spread movements and thus many trading opportunities. In the trading period, thresholds vary around the mean-reversion level µ(t) of the process. The main contributions of this study are (i) the integration of a jump component into the classic mean-reverting model and (ii) the consideration of the model’s mean-reverting patterns and jump behavior in both the formation and the trading period. Optimal trading thresholds. Göncü and Akyildirim (2016a) as well as En- dres and Stübinger (2019b) determine optimal trading thresholds in the sense of Bertram (2010b), hereby explaining the spread dynamics by the Lévy driven OU model of equation (21). Göncü and Akyildirim (2016a) assume that the marginals of the Lévy process

{Lt}t≥0 follow the generalized hyperbolic distribution. The authors consider three cases of pairs trading strategies. We summarize them in one framework as fol- lows. A trader opens a position when the spread crosses threshold a and closes the position at threshold m, a < m.

90 Review of stochastic differential equations in statistical arbitrage pairs trading

At any time T , the return from this trade is

ma-≤, tma, T ra(,mT,)= (24)  Xa, T  Tm->t ,a with τm,a = inf{t > 0|Xt = m, X0 = a}. The investor maximizes the expected value of these trading profits

max(PTttma,,<-)(ma)(+-1 PT()ma <-)[EXTmaTt ,a > ] (25) am, {}

The authors estimate the first passage time probabilityP (τm,a < T ) via Monte Carlo simulation since there is no closed form solution for the first passage time density of the Lévy driven OU process available. Based on these Monte Carlo es- timations, they solve problem (25) and provide trading signals in form of profit maximizing thresholds. In contrast to the baseline approach by Bertram (2010b), the authors extend the objective function by considering the probability of the spread’s divergence (see equation 24). However, the authors do not consider transaction costs in their strategy. Endres and Stübinger (2019b) model the spread dynamics according to equa- tion (21). An investor opens positions at threshold a and closes at m, a < m. The authors follow Bertram (2010b) and maximize the expected return per unit of time in their strategy ra(,mc,) max am, (26) E[]tma, for return r(a, m, c) = m − a − c and transaction costs c. Decomposing the process {Lt}t≥0 into two parts – {Qt}t≥0, representing the downward jumps and the Brownian motion, and {Rt}t≥0, representing the upward jumps – there is an explicit representation of the expected first-passage time E[τm,a] available. Based on this result, problem (26) can be solved directly and the optimal trading bounds a and m are obtained. In contrast to Göncü and Akyildirim (2016a), no Monte Carlo methods are needed for solving the optimization problem.

3.2.2. Optimal timing of trades – dynamic programming Larsson et al. (2013) study the problem of optimally closing a pairs trade when the spread Xt is modeled by an OU process extended with a jump component (see equation 23). Assuming the investor has already opened a position in the spread, the authors consider the following optimal stopping problem

Vx()= sup[EXt] (27) tt≤ a

91 Sylvia Endres

for a stop-loss level a < 0 and τa = inf{t ≥ 0 : Xt ≤ a}. In case the spread falls be- low a, the position is closed in order to limit potential losses. Applying stochastic control theory, the authors analyze a numerical method for solving the free bound- ary problem (27). In the context of Lévy driven OU models, Larsson et al. (2013) provide the first study dealing with a stochastic control problem in pairs trading. The incorporation of stop-loss levels reduces the risk of model misspecification and potential failure of the optimal stopping problem (see Yoshikawa 2017). To summarize the subsection on Lévy driven OU models, it should be pointed out that analytic formulas have not yet been derived in this context. Since the process has recently attracted attention, e.g., due to the presence of jumps in high-frequency data (see Jondeau et al. 2015), future work should build upon the aforementioned studies and provide further closed-form results.

3.3. Regime-switching Ornstein–Uhlenbeck model Pairs trading strategies typically rely on the spread’s convergence to a specific mean. Switches of this mean between different levels lead to failure of traditional trading approaches in practice (see Bock and Mestel 2009). Speaking more gener- ally, it is unrealistic to assume constant model parameters in the long run. This drawback is eliminated by regime-switching models – they consider the change of parameters over time based on the theory of Markov processes. The spread in the regime-switching OU model follows

qm11()-+Xdtttds1 WR, for t = 1  dXt =  (28)  qmrr()-+Xdtrtds WRtt, for = r and switches between r different regimes (r ∈ N). The continuous-time Markov chain {Rt}t≥0 describes the regime switching behavior and allows for r different 11 states. The random variable Rt denotes the state of the process at time t . Table 3 provides an overview of the studies applying a regime-switching OU model to capture the spread dynamics. To date, there exist only four works in that context.

3.3.1. Optimal trading thresholds – analytic approach Bai and Wu (2018) address an optimal investment problem based on a regime- switching OU model with r different states. The spread Xt evolves according to

1 1 In the special case of only one state, i.e., r=1, the regime-switching model (28) reduces to the clas- sic OU process.

92 Review of stochastic differential equations in statistical arbitrage pairs trading

equation (28). The authors give a closed-form expression for the pairs trading value function, i.e., the expected return per unit of time. This is basically the same objective function as in Bertram (2010b) (see equation 6), with the difference that r possible states are considered. The authors derive analytic solutions for systems of ordinary differential equations and obtain an optimum of the value function with trading thresholds a and m, a < m. A trade is entered, when the spread Xt crosses one of the thresholds. However, explicit analytic solutions are only presented for r = 1, i.e., one regime, which is the case without any switch- ing already considered in Bertram (2010b). Further, the numerical analysis is executed only for one-state and two-state regime switching models. For future research, it would be interesting to see how the approach works for more than two regimes. Moreover, as the authors themselves point out, the pairs trading rule is static – further research could construct dynamic optimal thresholds that are more suitable for practical applications.

3.3.2. Optimal investment allocation – dynamic programming Altay et al. (2017) focus on dynamic portfolio optimization and derive optimal portfolio holdings for an investor with logarithmic utility. The spread Xt is modeled by an OU process with Markov modulated mean-reversion level, i.e., the mean level switches across r (r ∈ N) different regimes. For a dollar-neutral portfolio of two stocks with weights h1(t) = −h2(t), the authors maximize the expected utility from terminal wealth, penalized by the riskiness of the portfolio. The optimal weight is derived via pointwise maximization as 1 ()x 1 *  qmi - rh  ht1(,xi,)=  +-,,ir∈{}1 ..., 1 + e  s2 s 2  for risk factor e > 0, correlation ρ, and volatility η of stock 1. It is an interesting approach to prevent the trader from taking risky positions by penalizing the objective function by the realized volatility. Further, the important problem of a constant mean level is addressed, which is unrealistic in practice but still as- sumed by many studies (see Bock and Mestel 2009). However, there is one main downside associated with the model: In contrast to Bai and Wu (2018), only the mean level is allowed to switch between the different regimes – all other model parameters, i.e., mean-reversion speed and volatility, remain constant.

3.3.3. Other approaches This subsection discusses other approaches based on the regime-switching OU model beyond analytic and dynamic programming frameworks. Model-based trading rules. Yang et al. (2016) assume that the spread follows equation (28) and is divided into two states: Rt = “high” and Rt = “low”. All model

93 Sylvia Endres

parameters, i.e., µ, θ, and σ, vary across the two regimes. In the formation period, the authors select pairs by smallest sum of squared differences. At this point, a simple yet promising improvement would be to optimize pairs selection by choosing pairs based on model parameters instead of simple distance metrics. In the trading period, three rules are implemented – two classic benchmark rules and one rule based on the regime-switching model. In the latter case, the spread value is predicted one step ahead from the calibrated model – if it deviates by more than 1.96 standard deviations from its observed value, a position is opened. The critical value of 1.96 is chosen in accordance with the 95% confidence interval for the normal distribution. Regime classification algorithm. Endres and Stübinger (2019a) develop a regime-switching framework for Lévy driven OU models with different regimes.

They specify the spread Xt by equation (28), providing the first study that replaces the Brownian motion by the Lévy process {Lt}t≥0 in a regime-switching frame- work. As such, the model is capable of explaining fat tails and jumps within the individual regimes. All model parameters are allowed to vary across the states. Hereby, the number of states is not determined in advance. The authors develop a regime classification algorithm that allows the process to switch between a flex- ible number of regimes. This is in contrast with Yang et al. (2016), who assume r = 2 regimes in their study, as well as Altay et al. (2017) and Bai and Wu (2018), who assume a fixed number of regimes. To summarize, relatively few studies consider the change of parameters over time based on regime-switching models. It should be investigated whether switches of parameters are recognized quickly enough in these models to appropriately adjust the trading rules (see Bogomolov 2013).

4. Advanced mean-reverting diffusion models

Mean-reverting diffusion models beyond the OU process are Cox–Ingersoll– Ross (CIR), inhomogeneous geometric Browian motion (IGBM), and stochastic volatility (SV) models. These models have a drift term identical to the OU process, while the diffusion term is extended – as such, the disadvantage of constant vola- tility is overcome (see, e.g., Stehlıková 2008, Chanol et al. 2015). The relevant studies are summarized in Table 4.

4.1. Cox–Ingersoll–Ross model The CIR process overcomes the constant volatility assumption of the OU process, and still exhibits the important mean-reverting property. However in this model, analytic calculations become more challenging.

94 Review of stochastic differential equations in statistical arbitrage pairs trading

The spread follows

dXt = θ(µ − Xt)dt + σ √XtdWt (29)

with parameters µ, θ, σ > 0 and standard Brownian motion {Wt}t≥0. The diffu- sion term σ√Xt is extended compared to the OU process. In case the process approaches zero, the mean-reverting drift term dominates the decreasing diffu- sion term, ensuring nonnegativity of the process values. However, Avonleghi and Davison (2017) point out that negative price spreads are an essential modeling feature – the non-negativity assumption actually stems from modeling interest rates, volatility, and prices (see, e.g., Heston 1993) rather than spreads. In practice, the disadvantage of non-negativity might present no issue if the long-term mean of the spread is far enough from zero.

Table 4 Advanced mean-reverting diffusion models

Model Study 4.1. Cox–Ingersoll–Ross model Analytic approach Optimal trading thresholds: Maximum expected Gregory et al. (2010) return and Sharpe ratio per unit of time Dynamic programming Optimal timing of trades: Optimal starting- Li (2015), Leung and Li (2016), stopping problem, finite and infinite trading Kitapbayev and Leung (2018) horizons 4.2. Stochastic volatility model Dynamic programming Optimal timing of trades: Maximum reward Ngo and Pham (2016), Kitapbayev function, optimal switching problem and Leung (2018) 4.3. Inhomogeneous geometric Brownian motion model Analytic approach Optimal trading thresholds: Maximum expected Gregory et al. (2010) return and Sharpe ratio per unit of time Dynamic programming Optimal timing of trades: Maximum reward Ngo and Pham (2016), Kitapbayev function, optimal trading boundaries and Leung (2018)

95 Sylvia Endres

4.1.1. Optimal trading thresholds – analytic approach Gregory et al. (2010) model spread and ratio dynamics according to the CIR model of equation (29). Following Bertram (2010b), the expected trade return µ(a, m, c) and the variance of trade returns σ2(a, m, c) are determined by equations (6) and (8). With the aid of first passage time results for CIR processes, this leads

  dFF 22qm q  d  22qm q   0,,a 0,,m    22 -  2 2  +  dann s s  da  s s   mq(,am,)cr= (,am,)c /    (30)  dYY 22qm q  d  22qm q     0,,m - 0,,a    22  22    dann s s  da  s s  

2 and a return variance σ (a, m, c) derived similarly. Φ(an, b, z) and Ψ(an, b, z) denote the Kummer’s and Trinomi function (see Zhao 2009). For the spread

Xt = ln S1(t) − ln S2(t), the trade return is r(a, m, c) = m−a−c and for the ratio

St1() m a Xt = it is r˜(a, m, c) = (e −e −c). From these expressions, the authors St2() determine the optimal trading strategy by maximizing expected return and Sharpe ratio numerically, obtaining the optimum trade levels a and m. The construction has one major advantage: Traditional rules of thumb are replaced by optimal trad- ing signals. However, and in contrast to subsection 2.1.2, no closed-form expres- sions for the thresholds a and m are provided. From an empirical perspective, it would be interesting to discuss the non-negativity assumption and suitability of the CIR process for spread modeling.

4.1.2. Optimal timing of trades – dynamic programming Li (2015), Leung and Li (2016), and Kitapbayev and Leung (2018) analyze the optimal timing to open and subsequently close pairs trading positions. Capital is allocated to a mean-reverting portfolio evolving according to a CIR process (see equation 29) or a risk-free asset with rate r > 0. For an investor who already holds a position Xt, e.g., one long and one short asset, the aim is to maximize the position’s expected discounted value. As such, the investor solves the following optimal stopping problem:

-rt Vx()=-sup[Ee ()XcT sell ] (31) t∈T

If the position is closed at τ, the investor receives the value Xτ and pays transaction costs csell > 0. The entry timing is determined in the optimal starting- ‑stopping problem:

-rn Jx()=-sup[Ee ((VXnn))Xc- buy ] (32) n∈T

96 Review of stochastic differential equations in statistical arbitrage pairs trading

If the position is opened at time ν, the investor pays transaction costs cbuy > 0 plus the process value Xν and receives the expected value V (Xν) from optimally stopping the process. Li (2015) and Leung and Li (2016) give analytic expressions for the value functions (31) and (32) with regard to hypergeometric functions and characterize the corresponding optimal trading boundaries. In contrast to the majority of studies, the transaction costs cbuy and csell for entry and exit are allowed to differ. However, the trader has an unlimited amount of time for open- ing and closing positions. In practice, it might not be feasible that positions are held infinitely. This issue is only partly addressed by discounting the portfolio value with the risk-free rate r. In this respect, Kitapbayev and Leung (2018) en- hance problems (31) and (32) for finite deadlines to enter and exit the market. However and in contrast to problems with infinite trading horizon, closed-form solutions cannot be derived.

4.2. Stochastic volatility model

In the stochastic volatility (SV) model, the spread Xt follows

dXt = θ(µ − Xt)dt + σ(t, Xt)dWt (33)

with a general diffusion term σ(t, Xt). The SV model is popular because of its flexibility, resolving the issue of constant volatility of the OU model and enabling better matching to real world data (see, e.g., Ackerer et al. 2018). Further, the model includes prominent special cases, i.e., the OU model for σ(t, Xt) = σ, the CIR model for σ(t, Xt) = σ√Xt, and the IGBM model for σ(t, Xt) = σXt. Table 4 provides an overview of the relevant studies using a SV model to capture the spread dynam- ics. As opposed to the previous sections, no analytic approach has been published.

Ngo and Pham (2016) explain the spread Xt by the mean-reverting stochas- tic volatility model (see equation 33). Pairs trading is considered as a switching problem between three regimes: no holding of stocks (i = 0), one stock long the other short (i = 1), and vice-versa (i = −1). The trading strategy is modeled by

α = (τn, ιn)n≥0, consisting of a sequence of trading times (τn)n and ιn ∈ {−1, 0, 1} representing the flat, long, or short position. The investor maximizes the expected discounted cumulative gain penalized by the holding of assets over an infinite horizon:

 ∞  -rtn -rtn na()xE=-sup( egXet ,,- alt ) at dt (34) ∑ n tn n ∫0 a  n≥1  for gain function g, discount factor ρ > 0, initial spread value X0 = x, and penal- izing factor λ ≥ 0. The authors are able to directly derive the structure of the

97 Sylvia Endres

switching regions and the form of the value function ν via partial differential equations. Hereby, the solutions ψ+ and ψ− of the following system need to be determined

1 Ljq()xx=-()mj′()xx+ sj2()′′()x 2 (35) rf -=L f 0

The solutions of (35) specify the switching regions where trading deci- sions are executed. Interestingly and in contrast to the majority of studies, the optimization problem (34) is extended for no holding of stocks in addition to classic long and short positions. Hereby, the flat position is explicitly penalized with a factor λ. Kitapbayev and Leung (2018) consider a similar problem as Ngo and Pham (2016). The two main differences are (i) Ngo and Pham (2016) focus on cumulated rewards gained by multiple trades, Kitapbayev and Leung (2018) consider the discounted reward of one trade cycle, (ii) Ngo and Pham (2016) consider a problem over infinite horizon, Kitapbayev and Leung (2018) integrate a finite deadline to enter and exit the market – however, this deadline does not allow for closed-form solutions any more.

4.3. Inhomogeneous geometric Brownian motion model

In the mean-reverting IGBM model (also called GARCH diffusion, GBM with affine drift), the spread Xt evolves according to

dXt = θ(µ − Xt)dt + σXtdWt (36)

The diffusion term σXt is non-constant and depends on the process value itself. Increasing Xt leads to an increasing volatility. The model is a special case of the stochastic volatility model (see equation 33). Table 4 summarizes the relevant works applying an IGBM model to explain the spread dynamics.

4.3.1. Optimal trading thresholds – analytic approach Gregory et al. (2010) model spread and ratio dynamics by the IGBM model (see equation 36). As described in subsection 4.1.1 for the CIR process, the authors follow Bertram (2010b) and derive closed-form expressions for two pairs trading objective functions – expected return µ(a, m, c) and Sharpe ratio

S(a, m, c, rf ) of the strategy.

98 Review of stochastic differential equations in statistical arbitrage pairs trading

It is

m(,am,)c =  m a dF  22q qm   ln ln 02,,  ++ + 22 -  ra(,mc,) a m dan  s s a  (37) =   22()qs+ 2  dFY 22q qm  d  22q qm  dY  22q qm  02,, 02,, 02,,   + 22 ++ 22 -+ 22  dann s s m  da  s s m  dan  s s a  and S(a, m, c, rf ) similarly. These objective function are maximized numerically to obtain the optimum entry and exit bounds a and m. Direct optimization via differentiation has only been accomplished for the classic OU process in Bertram (2010b) yet.

4.3.2. Optimal timing of trades – dynamic programming

Ngo and Pham (2016) explain the spread dynamics Xt by the IGBM model (see equation 36). The authors identify optimal trading boundaries for a pairs trading strategy that aims at maximizing the expected reward (see equation 34). The two fundamental solutions for problem (35) in case of the IGBM model are given by

-aa c  -  c  yy+()xx= YF ab,, ,(- xx),=  ab,   x   x  with Φ and Ψ the confluent hypergeometric function of first and second kind and

sq22++42()rs +-42qq22()+ s 2q 2qm ab= ,,=+22ac+= 2s22s s2

Based on these solutions, the structure of the switching regions for the IGBM model is directly specified. Kitapbayev and Leung (2018) consider a similar prob- lem – compared to the case of a stochastic volatility model (see subsection 4.2), the optimal trading boundary problem simplifies for the special case of the IGBM model. To summarize the section on advanced mean-reverting diffusion models, there remains a lack of studies that investigate the additional benefits of the CIR, IGBM, and SV models in pairs trading. It has not been clearly studied how spread modeling as well as pairs selection and trading can be improved with these models.

5. Diffusion models with a non-stationary component

The vast majority of studies models the spread as a stationary mean-reverting process. However, spreads may also exhibit non-stationary effects (Bertram 2009,

99 Sylvia Endres

Bertram 2010a, Focardi et al. 2016, Tie et al. 2017). As such, models based on stationary time series are not suitable for reliable forecasting and decision making (Bertram 2010a). Besides extending the classic OU process by jumps (see subsec- tion 3.2) and regime-switches (see subsection 3.3), there exist other possibilities to incorporate non-stationarity, which are summarized in Table 5 and discussed in the following subsections.

Table 5 Diffusion models with a non-stationary component

Model Study 5.1. Geometric Brownian motion model Dynamic programming Optimal timing of trades: Sequential buying and selling times determined by threshold Tie et al. (2017) curves Optimal investment allocation: Optimal Fouque et al. (2016) portfolio holdings under parameter ambiguity 5.2. Two-factor model Analytic approach Optimal trading thresholds: Maximum Bertram (2010a), Gu and Steffensen expected return and Sharpe ratio per unit of (2015) time 5.3. General Itô diffusion model Analytic approach Optimal trading thresholds: Maximum expected rate of profit under drawdown Bertram (2009) constraint

5.1. Geometric Brownian motion model According to Tie et al. (2017), limiting research to mean-reverting spreads adds a strong restriction on pairs trading strategies and their application. To fulfill the mean-reversion requirement, matching of stocks is typically executed among assets of the same industry. From a practical point of view, there is a broader range of potential pairs with desirable properties beyond the mean-reverting pairs. Instead of classic spread modeling, two stocks can be modeled separately

100 Review of stochastic differential equations in statistical arbitrage pairs trading

by geometric Brownian motions (GBM), dropping the typical mean-reversion requirement. The risky assets with prices S1(t) and S2(t) follow

 dS1()t   St1() 0  m1dt   ss11 12  dW1()t    =    +    (38)  dS2()t   0 St2()m2dt  sss21 22  dW2()t  with two-dimensional standard Brownian motion W (t) = (W1(t), W2(t)). The cor- relation between S1(t) and S2(t) is regulated by σ12 = σ21. In other notation, the risky assets follow

dSi(t) = µiSi(t)dt + σiSi(t)dWi(t), i = 1, 2 (39)

with W (t) = (W1(t), W2(t)) and dW1(t)dW2(t) = ρdt, ρ ∈ [−1, 1]. Table 5 provides an overview of the studies applying a GBM model to explain the spread dynamics. As opposed to the previous sections, no analytic approach has been published.

5.1.1. Dynamic programming

Optimal timing of trades. Tie et al. (2017) explain the stock prices S1(t) and

S2(t) based on GBM models as specified in equation (38). In their pairs trading strategy, the authors maximize an overall return by sequential and simultane- ous trading of stock pairs. It is assumed that the initial position in the spread

Xt = S1(t) − S2(t) can either be flat (i = 0) or long (i = 1). For τ0 ≤ τ1 ≤ τ2 ≤ ..., pairs are sold at stopping times τ0, τ2, ... and bought at τ1, τ3, ... respectively. A trading sequence is denoted by Λ0 = (τ1, τ2, ...) for i = 0 and Λ1 = (τ0, τ1, τ2, ...) for i = 1.

For transaction costs c, initial position i = 0, discount factor ρ > 0, cb = 1 + c, and cs = 1 − c, the reward function J0(s1, s2, Λ0) is

Ee[(-rt2 cS()cS())(11ec-rt1 Sc() S ())1 sb12tt--22 {}t2<∞ bs11tt- 21 {}t1<∞

ec-rt4 ((Sc)(Se))1 -rt3 ((cS ) c S ())1 ...] +-sb14tt24 {}t4 <∞ --bs13t 23t {}t3<∞ + and for i = 1 it is J1(s1, s2, Λ1) calculated similarly. The investor solves the optimal control problem

Vsii()12,sJ==sup(ss12,,Li ), i 01, Li The authors characterize the optimal policy, i.e., the optimal entry and exit times, by threshold curves obtained with dynamic programming techniques.

These curves constitute three regions: selling zone Γ1, holding zone Γ2, and buying zone Γ3. A larger correlation between stocks 1 and 2, measured by the

101 Sylvia Endres

parameter σ12, leads to greater buying and selling zones and thus more trad- ing opportunities. Tie et al. (2017) provide a clear value-add to pairs trading research – the authors relax the the typical mean-reversion requirement of the spread. Thus, the model includes a broader range of potential pairs with desir- able properties – among them still the classic mean-reverting pairs. From a prac- tical perspective, this offers wider selection possibilities and new opportunities in pairs trading. However, there is a downside associated with the GBM model to explain the spread dynamics: As in the classic OU model, the assumption of constant volatility or correlation is not always appropriate (see, e.g., Mota and Esquivel 2016, Avonleghi and Davison 2017). For futher research, Tie et al. (2017) suggest to use more realistic models for stock dynamics, e.g., extend the GBMs with regime switching12. Optimal investment allocation. Fouque et al. (2016) study an investment allocation problem, in which an investor trades a risk-free asset Mt or two risky assets. The amounts invested in the assets are denoted by hi(t), i = 0, 1, 2. The two risky assets follow the GBM model (see equation 39) with ambiguous cor- relation ρt. The wealth of the investor is defined as Xt = h0(t) + h1(t) + h2(t) and evolves according to

dXs = [rXs + β′hs]dt + [σ1h1(s) σ2h2(s)]dWs, Xt = x

for interest rate r, h = [h1 h2]′, and excess return vector β. The investor maximizes the expected utility in the worst-case scenario

supinf Eu[(XXTt)]= x (40) r Li The authors derive an analytic solution of problem (40) including closed- form solutions for power and exponential utility. Depending on the covariance matrices between the two risky assets and the corresponding variance risk ratios, the market is in favor of classic pairs trading or directional trading. Hereby, the portfolio selection is robust to the uncertain correlation. In case of low correlation between the risky assets, spread trading is not optimal. This is in line with the findings of Tie et al. (2017) – higher correlation encourages greater pairs trading activity. Fouque et al. (2016) extend their approach to stochastic volatility models with ambiguous correlation and derive an asymptotic closed-form solution. The authors provide new directions for continuous-time pairs trading in three respects.

1 2 Göncü and Akyildirim (2014), Göncü (2015), and Göncü and Akyildirim (2017) analyze the exist- ence of statistical arbitrage based in GBM models for single stocks with extensions for jumps and multi-asset frameworks. Their results could be extended for spreads, i.e., portfolios of two stocks, one long and one short.

102 Review of stochastic differential equations in statistical arbitrage pairs trading

First, they present the first academic study that considers parameter ambiguity in their model. Second, they apply a trading strategy that comprises not only classic pairs trading but also other trading variants. Third, and in contrast to Tie et al. (2017), they extend their problem to stochastic volatility models, hereby resolving the issue of constant volatility in the GBM model.

5.2. Two-factor model According to several studies (see, e.g., Schwartz 1997, Schwartz and Smith 2000, Farkas et al. 2017), more than one factor is necessary to describe uncertain price dynamics appropriately. The two-factor model in pairs trading has been proposed by Bertram (2010a). The spread Xt is modeled as the sum of a stationary com- ponent Yt, driving mean-reversion effects in the short-run, and a non-stationary component Mt, representing the long-term behavior. Hereby, Yt is modeled by an

OU process, and Mt by an arithmetic Brownian motion. The spread follows

XYtt=+Mt

dYtt=-qsYdtd+ Wt (41)

dMtt=+ghdt dZ with Brownian motions {Wt}t≥0 and {Zt}t≥0. This non-stationary model allows for mean-reversion of the spread around a stochastic mean level. Table 5 summarizes the relevant works applying a two-factor model to capture the spread dynamics. Only two studies in this context have been published yet.

They key work is provided by Bertram (2010a) – the spread Xt is explained by the two- factor model of equation (41). The author derives the optimal ana- lytic pairs trading strategy under the framework described in subsection 2.1.2, i.e., trades are entered and exited when the spread crosses trading thresholds a and m, a < m. For zero drift, i.e., γ = 0, and independent Browian motions

{Wt}t≥0 and {Zt}t≥0, the author calculates analytic solutions for the expected return µ(a, m, c), variance σ2(a, m, c), and Sharpe ratio S(a, m, c) of the strategy. The obtained results are almost identical with those for the classic OU process. The re- turn is derived as 2q()ma--c m(,am,)c = (42)   maq   q  p Erfi - Erfi         s   s  and only differs from µ(a, m, c) for the OU model (equation 9) in a factor of 2. The only difference in the variance σ2(a, m, c) compared to the classic OU model

103 Sylvia Endres

is an additive term. In empirical examples, the author analyzes the influence of the non-stationary noise, represented by the parameter η, on the trading strategy. When maximizing the expected return, the non-stationary behavior does not have any influence on the thresholds. When maximizing the Sharpe ratio, increasingη leads to strongly widened optimal trading bands – the strategy requires a higher return per trade to compensate the increasing variance. These results emphasize two major aspects. First, the trading strategy is strongly affected by non- stationary behavior. Second, the choice of a suitable objective function is highly important. Summarizing, the model by Bertram (2010a) provides a novelty in continuous-time pairs trading – it has the advantage that non-stationary effects can be captured and the model still exhibits mean-reverting behavior. However, the risk-free rate rf in the Sharpe ratio is assumed to be zero. Further, the author assumes zero drift and uncorrelated Brownian motions when deriving analytic formulas – as such, the model properties reduce to those of the classic OU process. It would be interesting to consider the model with non-zero correlation as it is studied for commodities in Schwartz and Smith (2000). Gu and Steffensen (2015) also pro- pose a two-factor model, where the spread reverts to a stochastic mean level µ(t). However, the authors do not further specify trading rules based on this model.

5.3. General Itô diffusion model In the general Itô diffusion model, the spread follows

dXt = µ(t, Xt)dt + σ(t, Xt)dWt (43)

with drift term µ(t, Xt) and diffusion term σ(t, Xt). In this model, the spread is not necessarily mean-reverting. The powerful stochastic differential equation (43) incorporates the majority of stochastic spread models as special cases, i.e., classic OU, CIR, SV, IGBM, GBM, nonlinear mean-reverting, and skew mean-reverting model. To the present day, the only study considering a general Itô diffusion model to explain the spread dynamics is provided by Bertram (2009) (see Table 5). Bertram (2009) examines the construction of an optimal analytic trading strategy under a drawdown constraint when the spread follows a general Itô diffu- sion process (see equation 43). Positions are entered and exited when the spread crosses the boundaries a and m, a < m – an analog construction is described in subsection 2.1 for the classic OU process. The total trade time τ is the time τ1 taken from entry to exit plus the time τ2 taken to enter a new trade. The optimal strategy is determined by maximizing the expected rate of profit

∞ 1 mt(,am,)cr==(,am,)cE[/1 ](ma--c)(ft;,ma)dt (44) ∫0 t

104 Review of stochastic differential equations in statistical arbitrage pairs trading

Hereby, the density f of the total trade time needs to be specified. The cu- mulative distribution function for the first passage time has the following form

m 1 Gt[,am](,xt00)(=-1 px,,tx00td) t ∫a t for transition densities p(x, t|x0, t0) satisfying the Fokker–Planck equation 2 ∂px(,t) ∂ ∂  1 2  =- ((msxt,)px(,t)) +  (,xt)(px,)t  (45) ∂tx∂ ∂ x2  2  Since the total trading time can be expressed in terms of Fokker–Planck equa- tions, i.e., partial differential equations, the optimization problem can be solved numerically without necessity of simulation based methods. As opposed to the classic OU process, the general Itô diffusion does not allow for analytic results. The author presents two measures for the trade drawdown, i.e., the maximum negative market-to-market return during a trade, and suggests to incorporate them as constraints in the optimization problem. Summarizing, Bertram (2009) applies a powerful and flexible stochastic spread model, which includes a mul- titude of special cases. Hereby, the study reveals the following directions for further research. First, considering trade drawdown in the strategy is promising and has yet been integrated only by Bertram (2009) and Temnov (2015). Second, the author himself suggests to apply the strategy to the continuous-time random walk (CTRW), which is more realistic since it is non-Gaussian13.

6. Other models

This section summarizes other models with a limited set of supporting lit- erature. The respective studies are listed in Table 6.

6.1. Doubly mean-reverting model Liu et al. (2017) introduce a “doubly mean-reverting” model for intraday trading strategies, which is constructed as follows. In the first step, the long-term stochastic trend of the spread Xt, denoted by Lt, is modeled as a stochastic process.

Specifically, Lt follows an OU process with mean 0

L dLtL=-qsLdtLtd+ Wt (46)

1 3 Osmekhin and Déleze (2015a) and Osmekhin and Déleze (2015b) use a CTRW for spread modeling. In this way, they overcome the drawback of the normal distribution. Their model can be seen as a generalization of the GBM with fat tails.

105 Sylvia Endres

Table 6 Other models

Model Study 6.1. Doubly mean-reverting model Analytic approach Model-driven decisions: Optimizing formation and Liu et al. (2017) trading period 6.2. Nonlinear mean-reverting model Dynamic programming Optimal investment allocation: Optimal portfolio Alsayed and McGroarty (2013) holdings under logarithmic utility Other approaches Principal component analysis: Multi-factor approach Avalon et al. (2017) 6.3. Skew mean-reverting model Analytic approach Trading framework: Skewness, heavy tails, Avonleghi and Davison (2017) and nonlinear mean-reversion

In the second step, Xt is modeled via a mean-reverting process around Lt us- ing the conditional modeling technique. Specifically, the spread Xt follows

dXtt=-qs((Lt ))Xdtd+ Wt (47) LL+ with mean process Lt()= 22ii--21 and daily opening and closing values L 2 2i−2 and L2i−1 of Xt for day i, i = 1, ..., N. As such, it is assumed that during any trading day, the spread reverts to the average of the current day’s opening value and the previous day’s closing value – a plausible, yet very restrictive assumption. The model is justified from a pairs selection perspective – pairs with stable Lt and volatile Xt are suitable pairs for trading. Liu et al. (2017) optimize the pairs trading strategy based on the calibrated model of equations (46) and (47) in two respects. In the formation period, pairs are selected by high short-term and low long-term model variance. Trading can- didates thus produce many trading opportunities while being stable in the long- run. In the trading period, a pairs trade is opened, when the spread deviates by from Lt, where is the 98% percentile of the absolute daily change in Lt in the past 100 days. The position is closed when the spread reverts back to its mean level. Compared to existing studies, Liu et al. (2017) include two main improvements

106 Review of stochastic differential equations in statistical arbitrage pairs trading

in their model. First, the spread reverts around a stochastic mean instead of a constant level or a linear function of time. Second, the stochastic long-term mean is itself a mean-reverting process, which distinguishes their work from the two-factor models in subsection 5.2. For further research, there are three main issues. First, future work should calibrate the stochastic mean level to a more complex function or process, replacing the simple step function calculated from daily opening and closing values. Second, the threshold is parametric and should instead be optimized by the model. Third, the authors themselves suggest to replace the Brownian motion by a more general Lévy process.

6.2. Nonlinear mean-reverting model In a nonlinear generalization of the classic OU process (see Wong 1964,

Alsayed and McGroarty 2013), the spread Xt evolves according to q dX =- tanh((cX-+ms))dt dW (48) ttc t with parameter c > 0 modeling the nonlinearity of mean-reversion. For c → 0, the model reduces to the OU process with mean-reversion rate θ and mean level µ. The nonlinear model of equation (48) differs from the classic OU model as follows. In the classic OU model, the mean-reversion strength increases linearly when the spread deviates from its mean. Consequently, the larger the deviations are, the more attractive the investment opportunities become. This is dangerous if spreads deviate far from their mean and the mean-reverting property actually does not persist (see Avalon et al. 2017). In the nonlinear OU model instead, the mean-reversion rate increases less when the spread deviates from its mean. Consequently, large deviations become more risky and less interesting. Table 6 provides an overview of the relevant works applying a nonlinear mean-reverting model to capture the spread dynamics. As opposed to the previ- ous sections, no analytic approach has been published. Optimal investment allocation. Alsayed and McGroarty (2013) solve a port- folio optimization problem for an investor that allocates capital to a spread Xt, following equation (48), or a risk-free asset with rate r. The number of units held in the spread is denoted by ht. An investor with logarithmic utility solves the fol- lowing portfolio optimization problem:

max[EVln t ] ht with wealth Vt. The optimal strategy is derived in closed form as follows: --q tanh((cX m)) - rX hX(,V ) = c ttV (49) ttt s2 t

107 Sylvia Endres

An analog specification of the optimal portfolio holdings for the classic OU model is derived in Jurek and Yang (2007), see equation (12). Due to the nonlinearity of mean-reversion in the nonlinear model, the capital allocation to diverging spreads is reduced compared to the classic OU model – losing trades are unwound sooner and the optimal portfolio holdings (see equation 49) decrease. Alsayed and McGroarty (2013) present the first academic study that explains the mean-reverting spread in a nonlinear model. As such, they provide an interesting new direction in pairs trading, yielding insights into the behavior of investors, especially concerning the capital allocation when large mispricings occur. Principal component analysis. Based on the nonlinear OU process (48), Avalon et al. (2017) use a multi-factor statistical arbitrage model which incor- porates methods such as principal component analysis and k-means clustering. To calculate the optimal portfolio allocation to a pair, the authors use equation (49) for the nonlinear OU process and equation (12) derived by Jurek and Yang (2007) for the classic OU process as a benchmark. Compared to previous studies, the authors do not rely only on price data but also include time-series data on fundamentals in order to select better pairs for trading. As such, key contribu- tions in this promising framework are the consolidation of multiple approaches and the integration of additional fundamental factors beyond classic price data.

6.3. Skew mean-reverting model Avonleghi and Davison (2017) introduce another nonlinear generalization of the classic OU process. The spread Xt follows  X  dX =-qm t dt + sdW (50) t   t  2 + Xt  and is mean-reverting for −1 < µ < 1 with mean-reversion rate θ, volatility σ, standard Brownian motion {Wt}t≥0, and a long-run mean depending on all model parameters. According to Avonleghi and Davison (2017), the two main advantages of the model for pairs trading are the following. First, it is capable of capturing kurtosis and skewness in the transition density of the spread – both well-known stylized facts of financial data. Second, the strength of mean-reversion does not increase linearly as the process diverges away from its equilibrium level – in- stead, a slower increase of the mean-reversion strength compared to the classic OU process reflects the risk of diverging spreads. Despite the model’s flexibility compared to the classic OU process, it does not have an analytic transition density and numerical methods need to be used for parameter estimation. In the pairs trading literature, this flexible model has only been applied once by Avonleghi and Davison (2017) (see Tab. 6). The authors provide analytic results

108 Review of stochastic differential equations in statistical arbitrage pairs trading

concerning the solution of the process as well as its stationary distribution. How- ever, the development of a specific trading strategy is left for further research.

7. Conclusion

We have provided a comprehensive literature review of stochastic differential equations in statistical arbitrage pairs trading. The key findings and potential gaps regarding the five major spread models can be summarized as follows. Section 2 covers the OU model to explain the spread dynamics – the majority of surveyed studies uses this model. The advantages are tractable analytic solutions, e.g., explicit trading thresholds derived by Bertram (2010b), optimal portfolio weights derived by Jurek and Yang (2007) and Mudchanatongsuk et al. (2008), and the ability to model mean-reversion. However, stylized facts of financial return series, especially in high-frequency settings, require extended stochastic spread models beyond the classic OU model. Section 3 focuses on extended OU models. Correlations among multiple spreads are incorporated in multivariate OU frameworks. To the present day, there remains a lack of studies in this context, even though multivariate model- ing is highly relevant from a practical perspective. The existence of jumps and fat tails is taken into account by Lévy driven OU models. Many studies confirm the non-normality of spread series, especially on small time scales (see Jondeau et al. 2015, Liu et al. 2017, Stübinger and Endres 2018). As such, Lévy driven OU models constitute an important extension of classic OU models driven by Brownian motion. However, in pairs trading they only emerged over the past few years. Research should aim at achieving further closed-form results for these Lévy driven OU models. Regime switching OU models are attractive since the allow parameters to vary across regimes. Just like the Lévy driven OU models, they only arose recently in continuous-time pairs trading and not much re- search has been done in that area. For further research, the suitability of such models in the high-frequency context should be analyzed. Bogomolov (2013) doubts whether it is possible to recognize switches and new parameters of the spread process quickly enough to adapt the trading strategy to changing market conditions. Section 4 covers advanced mean-reverting diffusion models, i.e., CIR, IGBM, and SV models, which provide increased flexibility in the diffusion term of the model while their drift term is identical to the OU process. However, there remains a lack of literature exploiting this advancement and justifying the selection of these models for pairs trading. It should be investigated whether the CIR model including the property of non-negativity is applicable for stock price spreads from

109 Sylvia Endres

an empirical point of view. Then, analytic formulas, e.g., provided by Gregory et al. (2010), should be applied to large empirical data sets. Further, the advanced mean-reverting diffusion models could be extended from a model-perspective, e.g., by incorporating Lévy noise or regime-switching. Section 5 focuses on diffusion models with a non-stationary component. Ber- tram (2009) introduces non-stationary spread models for continuous-time pairs trading to explain effects besides the classic mean-reversion property. However for non-stationary time-series, it is more difficult to distinguish extreme events, which trigger entry and exit decisions in pairs trading, from non-stationary shifts in the data (see Bertram 2010a). Recently, Fouque et al. (2016) and Tie et al. (2017) drop the classic mean-reverting property of pairs trading and model two stocks separately by GBMs. This provides a new direction in pairs trading and expands the range of potential pairs with desirable properties beyond the classic mean- reverting pairs. For further research, the GBM model could be expanded for a regime- switching component as proposed by Tie et al. (2017) or for a jump component as presented in Göncü and Akyildirim (2014). The two-factor model gains attention recently (see, e.g., Farkas et al. 2017 and Hahn et al. 2018), pro- viding additional flexibility by allowing for two stochastic factors, e.g., short-term variations and long-term behavior. The two factors can be estimated from two different data bases (see Schwartz and Smith 2000). As such, the model is par- ticularly interesting when considering additional data besides classic stock prices, e.g., sentiment, fundamentals, or cryptocurrency data. The general Itô diffusion model is highly flexible and includes many special cases – however, it has only been considered once in literature by Bertram (2009). Section 6 covers other models with a limited set of supporting literature. In recent years, several studies have exploited new models beyond the classic ones, e.g., the doubly mean- reverting model by Liu et al. (2017), the nonlinear mean- reverting model by Alsayed and McGroarty (2013), and the skew mean-reverting model by Avonleghi and Davison (2017). These models seem promising since they eliminate drawbacks of the classic OU process, which has been thoroughly discussed for more than a decade now. Further work should focus on the spread’s fat tails, especially in high-frequency settings, combined with potential further factors that drive price dynamics. Overall, we have comprehensively and critically reviewed the existing literature on stochastic differential equations in statistical arbitrage pairs trading – clearly, there remain gaps for further research and future studies. While the class of classic OU models has been extensively investigated, other models have only gained at- tention in recent years. Given the five main categories of stochastic spread models identified in this paper, there still exist other models apart from the aforemen- tioned, e.g., two- and three-factor models with additional stochastic components.

110 Review of stochastic differential equations in statistical arbitrage pairs trading

References [1] Ackerer, D., Filipović, D. and Pulido, S. (2018) ‘The Jacobi stochastic volatility model’, Finance and Stochastics, vol. 22, pp. 667–700. [2] Alsayed, H. and McGroarty, F. (2013) ‘Optimal Portfolio Selection in Nonlinear Arbitrage Spreads’, The European Journal of Finance, vol. 19, pp. 206–227. [3] Altay, S., Colaneri, K. and Eksi, Z. (2017) ‘Pairs trading under drift uncertainty and risk penalization’, Working Paper, Vienna University of Technology. [4] Angoshtari, B. (2016) ‘On the market-neutrality of optimal pairs trading strategies’, Working paper, University of Washington. [5] Avalon, G., Becich, M., Cao, V., Jeon, I., Misra, S. and Puzon, L. (2017) ‘Multi- factor statistical arbitrage model’, Working paper, Stanford University. [6] Avellaneda, M. and Lee, J.H. ( 2010) ‘Statistical Arbitrage in the U.S. Equities Market’, Quantitative Finance, vol. 10, pp. 761–782. [7] Avonleghi, M.H. and Davison, M. (2017) ‘Modeling energy spreads with a generalized novel mean-reverting stochastic process’, Journal of Energy Markets, Forthcoming. [8] Bachelier, L. (1900) Théorie de la spéculation, Paris: Gauthier-Villars. [9] Bai, Y. and Wu, L. (2018) ‘Analytic value function for optimal regime-switching pairs trading rules’, Quantitative Finance, vol. 18, pp. 637–654. [10] Baronyan, S.R., Bodurğlu, İ.İ., Şener, E. (2010) ‘Investigation of Stochastic Pairs Trading Strategies Under Different Volatility Regimes’, The Manchester School, vol. 78, pp. 114–134. [11] Baviera, R. and Baldi, T.S. (2018) ‘Stop-loss and leverage in optimal statis- tical arbitrage with an application to energy market’, Energy Economics, Forthcoming. [12] Bertram, W.K. (2009) ‘Optimal trading strategies for Itô diffusion pro- cesses’, Physica A: Statistical Mechanics and its Applications, vol. 388, pp. 2865–2873. [13] Bertram, W.K. (2010a) ‘A non-stationary model for statistical arbitrage trad- ing’, Working Paper, Columbia University, doi:10.2139/ssrn.1632718. [14] Bertram, W.K. (2010b) ‘Analytic solutions for optimal statistical arbitrage trading’, Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 2234–2243, doi:10.1016/j.physa.2010.01.045. [15] Blázquez, M.C. and Román, C. (2018) ‘Pairs trading techniques: An empiri- cal contrast’, European Research on Management and Business Economics, vol. 24, pp. 160–167. [16] Bock, M., Mestel, R. (2009) ‘A regime-switching relative value arbitrage rule’, in Fleischmann, B., Borgwardt, K., Klein, R., Tuma, A. (eds.) Operations Research Proceedings 2008, : Springer, pp. 9–14. [17] Bodo, B.A., Thompson, M.E. and Unny, T.E. (1987) ‘A review on stochastic differential equations for applications in hydrology’, Stochastic Hydrology and Hydraulics, vol. 1, pp. 81–100, doi:10.1007/BF01543805.

111 Sylvia Endres

[18] Bogomolov, T. (2011) ‘Pairs trading in the land down under’, in Finance and Corporate Governance Conference, Melbourne. [19] Bogomolov, T. (2013) ‘Pairs trading based on statistical variability of the spread process’, Quantitative Finance, vol. 13, pp. 1411–1430, doi:10.108 0/14697688.2012.748934. [20] Boguslavsky, M. and Boguslavskaya, E. (2004) ‘Arbitrage under power’, Risk, vol. 17, pp. 69–73. [21] Bucca, A. and Cummins, M. (2011) ‘Synthetic floating crude oil storage and optimal statistical arbitrage: A model specification analysis’, Working Paper, Dublin City University Business School. [22] Burks, N., Li, B., Bowling, J. and Schmid, N. (2018) ‘Capitalizing Off Col- lapse: The Hidden Alpha During the Crisis Periods’, International Research Journal of Applied Finance, vol. 9, pp. 124–133. [23] Cartea, Á. and Jaimungal, S. (2016) ‘Algorithmic Trading of Co-Integrated Assets’, International Journal of Theoretical and Applied Finance, vol. 19, pp. 1–18. [24] Chanol, E., Collet, O., Kostyuchyk, N., Mesbah, T., Nguyen and Q.H.L. (2015) ‘Co-integration for Soft Commodities with Non Constant Volatility’, International Journal of Trade, Economics and Finance, vol. 6, pp. 32–36. [25] Charalambous, K., Sophocleous, C., O’Hara, J.G. and Leach, P.G. (2015) ‘A deductive approach to the solution of the problem of optimal pairs trading from the viewpoint of stochastic control with time-dependent parameters’, Mathematical Methods in the Applied Sciences, vol. 38, pp. 4448–4460. [26] Chiu, M.C. and Wong, H.Y. (2011) ‘Mean-variance portfolio selection of cointegrated assets’, Journal of Economic Dynamics and Control, vol. 35, pp. 1369–1385. [27] Chiu, M.C. and Wong, H.Y. (2015) ‘Dynamic cointegrated pairs trading: Mean–variance time-consistent strategies’, Journal of Computational and Applied Mathematics, vol. 290, pp. 516–534, doi:10.1016/j.cam.2015.06.004. [28] Chiu, M.C. and Wong, H.Y. (2018) ‘Robust dynamic pairs trading with coin- tegration’, Operations Research Letters, vol. 46, pp. 225–232. [29] Cont, R. (2001) ‘Empirical properties of asset returns: stylized facts and statistical issues’, Quantitative Finance, vol. 1, pp. 223–236. [30] Cummins, M. (2010) ‘Optimal statistical arbitrage: A model specification analysis on ISEQ equity data’, Irish Accounting Review, vol. 17, pp. 21–40. [31] Cummins, M. and Bucca, A. (2012) ‘Quantitative Spread Trading on Crude Oil and Refined Products Markets’,Quantitative Finance, vol. 12, pp. 1857–1875. [32] D’Aspremont, A. (2011) ‘Identifying Small Mean Reverting Portfolios’, Quan- titative Finance, vol. 11, pp. 351–364, doi:10.1080/14697688.2010.481634. [33] Diamond, R. (2014) ‘Learning and trusting cointegration in statistical arbi- trage’, Working Paper, CQF Institute, New York.

112 Review of stochastic differential equations in statistical arbitrage pairs trading

[34] Do, B., Faff, R. and Hamza, K. (2006) ‘A new approach to modeling and estimation for pairs trading’, in Proceedings of 2006 Financial Management Association European Conference, Stockholm, Sweden. [35] Dourban, A. and Yedidsion, L. (2015) ‘Threshold function for the optimal stopping of arithmetic Ornstein–Uhlenbeck process’, in Proceedings of the 2015 International Conference on Operations Excellence and Service Engi- neering, pp. 1–7. [36] Dunis, C.L., Giorgioni, G., Laws, J. and Rudy, J. (2010) ‘Statistical arbitrage and high-frequency data with an application to Eurostoxx 50 equities’, Work- ing Paper, Liverpool Business School. [37] Einstein, A. (1905) ‘Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen’, Annalen der Physik, vol. 322, pp. 549–560. [38] Ekström, E., Lindberg, C. and Tysk, J. (2011) ‘Optimal liquidation of a pairs trade’, in Di Nunno, G., Øksendal, B. (eds.), Advanced Mathematical Meth- ods for Finance, Springer, Berlin, Germany and Heidelberg, pp. 247–255. [39] Elliott, R.J., van der Hoek, J. and Malcolm, W.P. (2005) ‘Pairs trading’, Quan- titative Finance, vol. 5, pp. 271–276. [40] Endres, S. and Stübinger, J. (2019a) ‘A flexible regime switching model with pairs trading application to the S&P 500 high-frequency stock returns’, Quantitative Finance, vol. 19, pp. 1727–1740. [41] Endres, S. and Stübinger, J. (2019b) ‘Optimal trading strategies for Lévy-driven Ornstein–Uhlenbeck processes’, Applied Economics, vol. 51, pp. 3153–3169. [42] Fanelli, V. and Lesca, A. (2014) ‘Natural gas statistical arbitrage’, New Frontiers in Practical Risk Management, vol. 61, pp. 61–70. [43] Farkas, W., Gourier, E., Huitema, R. and Necula, C. (2017) ‘A two-factor cointegrated commodity price model with an application to spread option pricing’, Journal of Banking & Finance, vol. 77, pp. 249–268. [44] Figuerola-Ferretti, I., Paraskevopoulos, I. and Tang, T. (2015) ‘Market model for valuing and managing portfolio strategies’, Working Paper, University of Madrid. [45] Figuerola-Ferretti, I., Paraskevopoulos, I. and Tang, T. (2017) ‘A market ap- proach for convergence trades’, Working Paper, University of Madrid. [46] Focardi, S.M., Fabozzi, F.J. and Mitov, I.K. (2016) ‘A new approach to sta- tistical arbitrage: Strategies based on dynamic factor models of prices and their performance’, Journal of Banking & Finance, vol. 65, pp. 134–155, doi:10.1016/j.jbankfin.2015.10.005. [47] Fouque, J.P., Pun, C.S. and Wong, H.Y. (2016) ‘Portfolio Optimization with Ambiguous Correlation and Stochastic Volatilities’, SIAM Journal on Control and Optimization, vol. 54, pp. 2309–2338. [48] Gatev, E., Goetzmann, W.N. and Rouwenhorst, K.G. (1999) ‘Pairs trading: Performance of a relative value arbitrage rule’, Working Paper, Yale School of Management’s International Center for Finance.

113 Sylvia Endres

[49] Gatev, E., Goetzmann, W.N. and Rouwenhorst, K.G. (2006) ‘Pairs Trading: Performance of a Relative-Value Arbitrage Rule’, Review of Financial Stud- ies, vol. 19, pp. 797–827. [50] Göncü, A. (2015) ‘Statistical Arbitrage in the Black–Scholes Framework’, Quantitative Finance, vol. 15, pp. 1489–1499. [51] Göncü, A. and Akyildirim, E. (2014) ‘Statistical Arbitrage in jump-diffusion models with compound Poisson processes’, Working Paper, Liverpool Uni- versity, doi:10.2139/ssrn.2542912. [52] Göncü, A. and Akyildirim, E. (2016a) ‘A stochastic model for commodity pairs trading’, Quantitative Finance, vol. 16, pp. 1843–1857, doi:10.1080/ 14697688.2016.1211793. [53] Göncü, A. and Akyildirim, E. (2016b) ‘Statistical Arbitrage with Pairs Trading’, International Review of Finance, vol. 16, pp. 307–319. [54] Göncü, A. and Akyildirim, E. (2017) ‘Statistical arbitrage in the multi-asset Black–Scholes economy’, Annals of Financial Economics, vol. 12, pp. 1–19. [55] Gregory, I., Ewald, C.O. and Knox, P. (2010) ‘Analytical pairs trading under different assumptions on the spread and ratio dynamics’, in 23rd Austral- asian Finance and Banking Conference, Sydney, doi:10.2139/ssrn.1663703. [56] Gu, J. and Steffensen, M. (2015) ‘Optimal portfolio liquidation and dynamic mean-variance criterion’, Working Paper, The University of Hong Kong. [57] Hahn, W.J., DiLellio, J.A. and Dyer, J.S. (2018) ‘Risk premia in commodity price forecasts and their impact on valuation’, Energy Economics, vol. 72, pp. 393–403. [58] Heston, S.L. (1993) ‘A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options’, Review of Fi- nancial Studies, vol. 6, pp. 327–343. [59] Jondeau, E., Lahaye, J. and Rockinger, M. (2015) ‘Estimating the price impact of trades in a high-frequency microstructure model with jumps’, Journal of Banking & Finance, vol. 61, pp. 205–224. [60] Jurek, J.W. and Yang, H. (2007) Dynamic portfolio selection in arbitrage, Working Paper, Harvard University, doi:10.2139/ssrn.882536. [61] Kanamura, T., Rachev, S.T. and Fabozzi, F.J. (2010) ‘A profit model for spread trading with an application to energy futures’, The Journal of Trading, vol. 5, pp. 48–62. [62] Kang, J., Leung, T. (2017) ‘Asynchronous ADRs: overnight vs intraday re- turns and trading strategies’, Studies in Economics and Finance, vol. 34, pp. 580–596. [63] Kim, K. (2011) ‘Performance analysis of pairs trading strategy utilizing high frequency data with an application to KOSPI 100 equities’, Working Paper, Harvard University, doi:10.2139/ssrn.1913707. [64] Kim, S.J., Primbs, J. and Boyd, S. (2008) ‘Dynamic spread trading’, Working Paper, Stanford University.

114 Review of stochastic differential equations in statistical arbitrage pairs trading

[65] Kitapbayev, Y. and Leung, T. (2017) ‘Optimal mean-reverting spread trad- ing: nonlinear integral equation approach’, Annals of Finance, vol. 13, pp. 181–203. [66] Kitapbayev, Y. and Leung, T. (2018) ‘Mean Reversion Trading with Sequential Deadlines and Transaction Costs’, International Journal of Theoretical and Applied Finance, vol. 21, pp. 1–22. [67] Krauss, C. (2017) ‘Statistical arbitrage pairs trading strategies: Review and outlook’, Journal of Economic Surveys, vol. 31, pp. 513–545, doi:10.1111/ joes.12153. [68] Kuo, K., Luu, P., Nguyen, D., Perkerson, E., Thompson, K. and Zhang, Q. (2015) ‘Pairs trading: An optimal selling rule’, Mathematical Control and Related Fields, vol. 5, pp. 489–499, doi:10.3934/mcrf.2015.5.489. [69] Larsson, S., Lindberg, C., Warfheimer, M. (2013) ‘Optimal closing of a pair trade with a model containing jumps’, Applications of Mathematics, vol. 58, pp. 249–268. [70] Lei, Y. and Xu, J. (2015) ‘Costly arbitrage through pairs trading’, Jour- nal of Economic Dynamics and Control, vol. 56, pp. 1–19, doi:10.1016/ j.jedc.2015.04.006. [71] Leung, T. and Li, X. (2015) ‘Optimal Mean Reversion Trading with Transac- tion Costs and Stop-Loss Exit’, International Journal of Theoretical and Applied Finance, vol. 18, pp. 1–31. [72] Leung, T. and Li, X. (2016) ‘Optimal mean reversion trading: Mathe- matical analysis and practical applications’, World Scientific, Singapore, doi:10.1142/9839. [73] Li, T.N. and Tourin, A. (2016) ‘Optimal Pairs Trading with Time-Varying Volatility’, International Journal of Financial Engineering, vol. 3, pp. 1–45. [74] Li, X. (2015) Optimal multiple stopping approach to mean reversion trading, Ph.D. thesis, Columbia University, doi:10.7916/D88K781S. [75] Lindberg, C. (2014) ‘Pairs trading with opportunity cost’, Journal of Applied Probability, vol. 51, pp. 282–286. [76] Lintilhac, P.S. and Tourin, A. (2017) ‘Model-based pairs trading in the bitcoin markets’, Quantitative Finance, vol. 17, pp. 703–716. [77] Liu, B., Chang, L.B. and Geman, H. (2017) ‘Intraday pairs trading strategies on high frequency data: the case of oil companies’, Quantitative Finance, vol. 17, pp. 87–100, doi:10.1080/14697688.2016.1184304. [78] Liu, J. and Timmermann, A. (2013) ‘Optimal Convergence Trade Strategies’, Review of Financial Studies, vol. 26, pp. 1048–1086. [79] Meucci, A. (2009) ‘Review of Statistical Arbitrage, Cointegration, and Multi- variate Ornstein–Uhlenbeck’, Working Paper, Symmys, pp. 1–20. [80] Mota, P.P. and Esquivel, M.L. (2016) ‘Model selection for stock prices data’, Journal of Applied Statistics, vol. 43, pp. 2977–2987, doi:10.1080/026647 63.2016.1155205.

115 Sylvia Endres

[81] Mudchanatongsuk, S., Primbs, J.A. and Wong, W. (2008) ‘Optimal Pairs Trading: A Stochastic Control Approach’, in American Control Conference, 2008, pp. 1035–1039. [82] Ngo, M.M. and Pham, H. (2016) ‘Optimal switching for the pairs trading rule: a viscosity solutions approach’, Journal of Mathematical Analysis and Applications, vol. 441, pp. 403–425. [83] Nóbrega, J.P. and Oliveira, A.L. (2013) ‘Improving the Statistical Arbitrage Strategy in Intraday Trading by Combining Extreme Learning Machine and Support Vector Regression with Linear Regression Models’, in IEEE 25th International Conference on Tools with Artificial Intelligence, pp. 182–188. [84] Nobrega, J.P. and Oliveira, A.L. (2014) ‘A combination forecasting model using machine learning and kalman filter for statistical arbitrage’, in IEEE International Conference on Systems, Man and Cybernetics, pp. 1294–1299. [85] Osmekhin, S. and Déleze, F. (2015a) ‘Application of continuous-time random walk to statistical arbitrage’, Journal of Engineering Science and Technology Review, vol. 8, pp. 91–95. [86] Osmekhin, S. and Déleze, F. (2015b) ‘Waiting-time distribution and market efficiency: Evidence from statistical arbitrage’, Working Paper, Hanken School of Economics. [87] Pilipovic, D. (2007) Energy Risk: Valuing and Managing Energy Derivatives. New York: McGraw Hill Professional. [88] Psaradellis, I., Laws, J., Pantelous, A.A. and Sermpinis, G. (2018) ‘Pairs trad- ing, technical analysis and data snooping: Mean reversion vs. momentum’, Working Paper, University of St. Andrews, doi:10.2139/ssrn.3128788. [89] Rampertshammer, S. (2007) ‘An Ornstein–Uhlenbeck framework for pairs trading’, Working Paper, University of Melbourne. [90] Schwartz, E. (1997) ‘The Stochastic Behavior of Commodity Prices: Implications for Valuation and Hedging’, The Journal of Finance, vol. 52, pp. 923–973. [91] Schwartz, E. and Smith, J. (2000) ‘Short-Term Variations and Long-Term Dynamics in Commodity Prices’, Management Science, vol. 46, pp. 893–911. [92] Sharp, K.P. (1990) ‘Stochastic differential equations in finance’. Applied Mathematics and Computation, vol. 37, pp. 131–148. [93] Smoluchowski, M. von (1906) ‘Zur kinetischen Theorie der brownschen Molekularbewegung und der Suspensionen’, Annalen der Physik, vol. 326, pp. 756–780. [94] Song, Q. and Zhang, Q. (2013) ‘An optimal pairs-trading rule’, Automatica, vol. 49, pp. 3007–3014, doi:10.1016/j.automatica.2013.07.012. [95] Stehlıková, B., (2008) Mathematical analysis of term structure models, Ph.D. thesis, Comenius University Bratislava. [96] Stübinger, J. and Endres, S. (2018) ‘Pairs trading with a mean-reverting jump- diffusion model on high-frequency data’, Quantitative Finance, vol. 18, pp. 1735–1751.

116 Review of stochastic differential equations in statistical arbitrage pairs trading

[97] Suzuki, K. (2016) ‘Optimal switching strategy of a mean-reverting asset over multiple regimes’, Automatica, vol. 67, pp. 33–45. [98] Suzuki, K. (2018) ‘Optimal pair-trading strategy over long/short/square positions – empirical study’, Quantitative Finance, vol. 18, pp. 97–119. [99] Temnov, G. (2015) ‘Analysis of Ornstein–Uhlenbeck process stopped at maximum drawdown and application to trading strategies with trailing stops’, Working Paper, Charles University, , pp. 1–19. [100] Thiele, T.N. (1880) Om Anvendelse af mindste Kvadraters Methode i nogle Tilfælde, hvor en Komplikation af visse Slags uensartede tilfældige Fejlkilder giver Fejlene en ‘systematisk’ Karakter. Det Kongelige Danske Viden- skabernes Selskabs Skrifter-Naturvidenskabelig og Mathematisk Afdeling, pp. 381–408. [101] Tie, J., Zhang, H. and Zhang, Q. (2017) ‘An optimal strategy for pairs trad- ing under geometric Brownian motions’, Journal of Optimization Theory and Applications, pp. 1–22. [102] Tourin, A. and Yan, R. (2013) ‘Dynamic pairs trading using the stochastic control approach’, Journal of Economic Dynamics and Control, vol. 37, pp. 1972–1981. [103] Triantafyllopoulos, K. and Montana, G. (2011) ‘Dynamic modeling of mean- reverting spreads for statistical arbitrage’, Computational Management Science, vol. 8, pp. 23–49. [104] Wissner-Gross, A.D. and Freer, C.E. (2010) ‘Relativistic statistical arbitrage’, Physical Review E, vol. 82, pp. 1–7. [105] Wong, E. (1964) ‘The construction of a class of stationary Markoff pro- cesses’, Stochastic Processes in Mathematical Physics and Engineering, vol. 17, pp. 264–276. [106] Yamamoto, R. and Hibiki, N. (2017) ‘Optimal multiple pairs trading strategy using derivative free optimization under actual investment management conditions’, Journal of the Operations Research Society of Japan, vol. 60, pp. 244–261. [107] Yang, J.W., Tsai, S.Y., Shyu, S.D. and Chang, C.C. (2016) ‘Pairs trading: The performance of a stochastic spread model with regime switching-evidence from the S&P 500’, International Review of Economics & Finance, vol. 43, pp. 139–150, doi:10.1016/j.iref.2015.10.036. [108] Yang, Y., Göncü, A. and Pantelous, A. (2017) ‘Pairs Trading with Commod- ity Futures: Evidence from the Chinese Market’, China Finance Review International, vol. 7, pp. 274–294, doi:10.1108/CFRI-09-2016-0109. [109] Yeo, J. and Papanicolaou, G. (2017) ‘Risk control of mean-reversion time in statistical arbitrage’, Risk and Decision Analysis, vol. 6, pp. 263–290. [110] Yoshikawa, D. (2017) ‘An Entropic Approach for Pair Trading’, Entropy, vol. 19, pp. 320.

117 Sylvia Endres

[111] Zeng, Z. and Lee, C.G. (2014) ‘Pairs trading: optimal thresholds and profitability’, Quantitative Finance, vol. 14, pp. 1881–1893, doi:10.1080/14697688.2014. 917806. [112] Zhang, H. and Zhang, Q. (2008) ‘Trading a Mean-Reverting Asset: Buy Low and Sell High’, Automatica, vol. 44, pp. 1511–1518. [113] Zhang, J., Leung, T. and Aravkin, A.Y. (2018) Mean Reverting Portfolios via penalized OU-Likelihood Estimation, Working Paper, University of Wash- ington. [114] Zhao, B. (2009) ‘Mean First Passage Times of the Feller and the GARCH Diffusion Processes’, Working Paper, City University , doi:10.2139/ ssrn.1465334. [115] Zhengqin, Z. (2014) ‘Optimal Pairs Trading: Static and Dynamic Models’, Working Paper, University of Toronto. Managerial Economics 2019, vol. 20, No. 2, pp. 119–132 https://doi.org/10.7494/manage.2019.20.2.119

Jessica Hastenteufel*, Mareike Staub**

Current and future challenges of family businesses

1. Introductory overview

Family businesses are the engine and backbone of many leading economies (i.e. Germany). They make an essential contribution to fostering the competitive- ness and innovation of an economy and make a significant contribution to job creation, which makes them an important job engine (May 2013, 21). The entre- preneurial landscape is characterized by a large number of traditional family-owned companies that can look back on a long success story. Traditions that have been built up over generations and that combine aspects such as trust and reliability play an important role in family businesses. Despite this traditional orientation, family businesses are often characterized by features such as innovativeness, fore- sight, long-term focus and flexibility. However, both the family businesses and the entrepreneurial families themselves have some weaknesses that must be consid- ered. They also have to adapt to changed framework conditions and challenges – among other things due to digitization, internationalization and demographic change – and they have to develop appropriate solutions in order to be able to survive permanently and successfully in the market. Though the high relevance of the family to the company involves opportunities – for example, when, based on the family, a family strategy with clear values, roles and goals is defined. In addition, the development of so-called family governance is of great importance for the success of family businesses and keeping the company in family hands. The aim of this paper is to describe and analyse the current challenges of fam- ily businesses. Therefore, a normative and literature-based research is conducted to show the impact of the digitization on as well as the importance of innova- tion for family businesses. Moreover, we will highlight the topic of management

* IUBH International University of Applied Sciences and Saarland University, e-mail: j.hastenteufel@ iubh-fernstudium.de. ** SIKB, sales management department, e-mail: [email protected].

119 Jessica Hastenteufel, Mareike Staub

succession as a key challenge for family businesses and discuss the various factors that contribute to the success of family-owned businesses.

2. The impact of digitization on family businesses

Digitization is becoming increasingly important and is influencing different areas of the economy and everyday life (BMWi 2017 35/98). The digital transfor- mation is observed everywhere, and therefore family businesses must deal with it (Cravotta et al. 2018, 10). Digitization is leading to a radical revolution in all sectors of the economy (Weissman and Wegerer 2018, 11) and the success and future potential of companies highly depends on its positive implementation (Kreide 2016, 71). Digitization holds great potential for family businesses. This potential must be identified and exploited (Astor et al. 2016, 2). In order to successfully implement digitization in family businesses, this topic should be integrated directly into their corporate strategy or a concrete digitiza- tion strategy should be developed (Weissman and Wegerer 2018, 8). A specially formulated digitization strategy has many advantages, as companies with a digital strategic focus generally have better economic processes and greater connectivity, distinguishing them from companies without such a strategy (Icks et al. 2017, 39). In addition, a digitization strategy allows a more effective control of business activi- ties (Hille et al. 2016, 27). In numerous family businesses, activities to implement the digital transformation are located in the IT department (IfM Bonn 2017, 18). However, it is not advisable to place digitization topics and the development of digitization strategies exclusively in the IT department. Rather, they have to be placed with senior management so that the entire value chain and the potential changes in business models can be continuously managed and monitored (Weiss- man and Wegerer 2018, 7). Even though providing the necessary infrastructure for digital transformation is within the remit of the IT department, it is essential that the IT department considers the needs and requests of all players involved. With this in mind, it is very important for family businesses to understand that digitization is a development that will affect all sectors of their business (Hille et al. 2016, 52). To succeed in digital transformation, family businesses must therefore undergo a cultural change (Bartels et al. 2017, 4). When implementing their digitization activities, they always have to take into account their specific orientation, for which - loyal employees, - long-term planning, - extensive control processes, - a comprehensive culture of errors

120 Current and future challenges of family businesses

are characteristic (Windthorst 2018). It is also important that family businesses stick to their values and goals that are an essential part of their corporate culture. This means that the interests of the entrepreneurial family must be considered in all decisions (Kreide 2016, 70). Digitization poses major challenges for family businesses in particular because it must be brought into line with the specific characteristics of family businesses (Hille et al. 2016, 7). Family-owned businesses typically have a long-term organiza- tion characterized by trust and based on fixed values and goals. In comparison, digitization is characterized by features such as dynamism, innovation, speed, and real time (Bartels et al. 2017, 4). The task of family businesses is therefore to tailor their business models to the changing circumstances and demands of the digital age. Above all, the transition from one generation to the next offers great opportunities to advance digital transformation (Hille et al. 2016, 7), as the older generations make targeted use of the skills of the next generation. Thus, the experience of older entrepreneurs is reconciled with the innovative ideas of their successors and they will benefit from the creative ability and trustworthiness of the younger generation (Bartels et al. 2017, 12).

3. The importance of innovation in family businesses

With regard to the particularities of family businesses, it is important to in- vestigate to what extent these characteristics influence their innovation activity (Halder 2016, 65). A key factor involved and a significant differentiator to non- family businesses at the same time, is the unity of ownership and management (Werner et al. 2013, 1). This structure allows the resources needed for innovation to be made available by the family members, who are often senior managers at the same time. In this way, they can drive innovations in the company (Bauer 2013, 129). Above all, the uniqueness of resources in family businesses – which can be traced back to the interaction of the family, the company and the indi- viduals – is a significant advantage (Heider 2017, 58). Furthermore, due to the long-term orientation of family businesses, it is reasonable to assume that family entrepreneurs are willing to invest heavily in innovation (Kammerlander and Prügl 2016, 7). However, in practice, family entrepreneurs are often character- ized by a particular thriftiness, reflected in the fact that spending on resources is moderate, to ensure that companies are not exposed to excessive risk (Bauer 2013, 132). Even though investments in research and development are rather cautious, innovations are still playing an important role in family businesses (Kam- merlander and Prügl 2016, 8). According to a recent study, innovation spending by family businesses is low compared to non-family businesses, but at the same

121 Jessica Hastenteufel, Mareike Staub

time they have a higher propensity to innovate (Duran et al. 2015, 1224). This means that they achieve more innovative outcomes than non-family businesses. This shows in their share of sales resulting from innovations (Kammerlander and Prügl 2016, 8). In addition, the long-term orientation of family businesses favours the development of expertise and its permanent manifestation within a company, which in turn has a positive effect on the realization of innovative measures (Werner et al. 2013, 32). Current studies show that the long-term focus and the striving for success of family businesses have a positive effect on their entrepreneurial orientation as well as their ability to innovate (Heider 2017, 216). Furthermore, the corporate culture, which is fundamentally determined by the family, influences the ability of family businesses to innovate (Kammerlander and Prügl 2016, 26). Above all, the sense of togetherness between employees due to the cultural characteristics has a positive effect on the successful realization of innovation projects (Bauer 2013, 132). Moreover, one key factor influencing the innovative capacity of family businesses is company size (Binder 2014, 94). In general, the larger the family business, the smaller the family influence. Therefore, it can be assumed that with increasing business size, the innovation activities decline. The cause of this tendency lies within the shorter decision-making and the more informal organisational structure in smaller companies (Werner et al. 2013, 32). As a result, smaller family businesses tend to be more innovative than smaller non-family businesses and large family businesses are often less innova- tive than large non-family businesses (Heider 2017, 127). In addition, regional orientation is another factor influencing the innova- tive behaviour of family businesses (Heider 2017, 129). The intensive anchoring in the region plays an important role especially if the companies have already been based in the same location for a long time. Often, longstanding individual relationships with stakeholders have been built up and the resulting expertise and information are a helpful support in innovation development (Werner et al. 2013, 32). In general, family enterprises rather than non-family businesses tend to integrate existing clients and markets at an early stage in the innovation process (Binder 2014, 94). However, they should also look for other markets where they are currently not active, but which open up sales opportunities for their existing technologies and they should bring their already existing technological knowledge into these markets (Hastenteufel and Staub 2019, 136). A key challenge facing family-owned companies is that they can successfully survive in the field of tension between innovation and tradition (Felden and Marwede 2018, 60). Family businesses are characterized by a strong sense of tradi- tion and deeply rooted values that are preserved and shared across generations. However, there is always a risk that this traditional bias may inhibit innovation behaviour. Family businesses that want to be successful in the long term must

122 Current and future challenges of family businesses

be able to master the balancing act between traditional orientation and innova- tive development by maintaining the traditions necessary for their survival and responding flexibly to change (Hastenteufel and Staub 2019, 136).

4. Business succession as a major challenge for family businesses

Management succession poses major challenges for family businesses. There- fore, they have to start early with any succession plan in order to plan it com- prehensively (Kenyon-Rouvinez 2012, 102). One of the biggest challenges of the succession process is that, in many cases, the former owners do not prepare for their succession in time and cannot find an adequate successor for their business. Another challenge is the strong emotional attachment of many lenders to their business, which makes letting go of their lifework difficult. Among the possible successors, the biggest problems are - financing difficulties, - lacking professional qualifications and - the lack of suitable companies to take over (Evers 2017, 8). Furthermore, in the context of company succession, psychological aspects are also very important. A profound emotional connection between the individual actors involved in the succession process leads to obstacles in the succession (Schlippe 2012, 170). For the former owner, the transfer of his business to a certain extent also means the passing on of his life’s work, for which he has worked hard over decades (Wiesehahn 2015, 32). Many old owners worry about the future and the existence of the company. Therefore, the entrepreneurs often have problems to free themselves from what they have built up with much effort and passion. Par- ticularly in the case of family-internal successions, conflicts often occur within the family or between the different generations. These arise mostly when the interests of the individual family members are different, and several goals are pursued. Such differences often prevent a successful business succession (Hastenteufel and Staub 2019, 138). In addition to family disputes, many other reasons – e. g. inadequate preparation for succession, lack of capacity for potential successors, or lack of de- termination on the part of the owner to succeed – are responsible for the failure of a family succession (Steinbrecher and Hyde 2016, 32). Moreover, the personal relationship between the entrepreneur and the suc- cessor plays an important role in the successful creation of a management succes- sion. It is often the case that potential successors do not meet the requirements of the family business owner (Spelsberg 2011, 16). The successful realisation of a company succession therefore highly depends on the person of the successor.

123 Jessica Hastenteufel, Mareike Staub

Cabrera-Suárez et al. (2001), for example, consider that the quality of the re- lationship between senior and junior is crucial for the successor to be able to develop his or her abilities and potentials freely under the given circumstances. The former owner must trust in the skills and competences of his successor and he has to believe that he will succeed in continuing the business (Pirmanschegg 2016, 160). At the same time, however, it also requires the will and determination of the junior to face the challenges of company succession (Schlippe 2012, 170). In this context, it has to be borne in mind that the potential successor must have sufficient entrepreneurial qualifications in order to be able to succeed (Brach 2015, 223). In general, the successor has to be systematically familiarized with their upcoming role early in order to become familiar with their future field of activity and thus to be confident of his potential and abilities for the new position. During this transitional period, the successor can already gain the recognition of the existing workforce by, for example, working in different technical departments (Nagl 2015, 26). In addition, if there is a succession within the family, it must be checked whether a change in the partnership agreement and/or a restructuring of the company are required (Hastenteufel and Staub 2019, 138). Furthermore, the entrepreneurial families have to deal with financial and tax issues in this context and there is a need to clarify numerous legal matters. In order to avoid disputes in this regard, the use of a mediator can make sense (Kempert 2008, 61). In many cases, family entrepreneurs do postpone their succession planning until it is almost too late. In the case of unforeseeable events, such as sudden death, acute illness or an accident of the owner, the company succession is made considerably more difficult. In order to prevent the resulting consequences, the development of an emergency plan that is geared towards such situations is highly recommended (Papesch 2010, 71). A corresponding plan should contain provi- sions that describe measures in the event that the entrepreneur can no longer manage the company himself (Felden et al. 2018, 62). At the same time, it also serves to be prepared for the situation of failure of the planned successor. With the help of an emergency plan, the continuation as well as the ability to act of a company is guaranteed (Hastenteufel and Staub 2019, 138).

5. Family business governance as a possible success factor

Because of their specific characteristics, family businesses require different governance structures than non-family-owned companies (Felden and Marwede 2018, 62). This can be explained by the fact that family businesses generally have

124 Current and future challenges of family businesses

a management structure consisting of family members with a high proportional stake, whereas non-family companies are mostly managed by external managers who are not necessarily shareholders (Welge and Eulerich 2014, 196). Therefore, to ensure the long-term success of a family-owned enterprise, both the entrepre- neurial family’s and the company’s individual needs must be anchored within the company’s governance framework. This is a clear differentiator to the governance of non-family-owned companies (Felden and Hack 2014, 282). Above all, in or- der to do justice to the family goals and to create transparent structures, family businesses need a corresponding regulatory framework for the management and supervision of the company. In this context, some family businesses have implemented an extension of traditional corporate governance. It is referred to as family business governance (Lueger and Froschauer 2015, 67). Family business governance consists of business governance and family governance. It offers family businesses the opportunity to stabilize their future orientation across generations (Koeberle-Schmid et al. 2010, 161). Moreover, it is an instrument setting the specific framework for both the family and the company. In general, a family business governance system aims to achieve the best possible governance by seeking to improve its economic and emotional value (Koeberle- Schmid et al. 2012, 29). This requires the establishment of appropriate bodies and mechanisms. Its operational activities should ensure cohesion within the family of entrepreneurs and at the same time create rules for the management and the control of the family business (Koeberle-Schmid and Koners 2018, 22). Business governance in family businesses lies within the responsibility of the executive board and the supervisory board and is supported by typical business governance instruments such as - risk management, - internal audit, - compliance management and - an internal control system (Hastenteufel and Staub 2019, 186). At this point, the main task of business governance is to carry out manage- ment and monitoring measures (Süss-Reyes 2015, 14). Family governance as a further component of family business governance is based on the specific family circumstances and must be adapted and developed individually to the framework conditions of each individual family business (Breyer 2015, 152). The goal of family governance is the creation of fair, transparent, verifiable rules for the family and their access to the company (Koeberle-Schmid et al. 2012, 29). Therefore, a set of instruments provides guidelines for the man- agement of the family. In general, family governance promotes unity within the entrepreneurial family and the identification of family members with the com- pany (Hastenteufel and Staub 2019, 186). In order to prevent family conflicts,

125 Jessica Hastenteufel, Mareike Staub

preserving traditions, transferring them to the next generations and strengthening emotional values in the family, the development of family governance structures in the family business is a great instrument (Canessa et al. 2016, 99). The task of family governance is, among other things, that the family members explicitly commit to the company and that clear rules for communication within the owner family are formulated (Hastenteufel and Staub 2019, 186). In practice, family business governance can be implemented by means of various measures, and the design of which varies according to the individual characteristics of each family business. The extent to which family governance instruments are used in a company is fundamentally determined by the complexity of family structures and the relationship between the family and the company. In addition, environmental factors that affect the family business, the concept of man of the entrepreneurial family and the values they live are important (Hastenteufel and Staub 2019, 186). Key family governance tools in family businesses include: - developing a family constitution, - implementing family education, - establishing family activities such as family reunions, - setting up family offices (Welge and Eulerich 2014, 202).

6. Special conflict management in family businesses

Conflicts occurring in family businesses are very different (Neuvians 2011, 62). Problems usually arise, for example, from the fact that companies do not imple- ment solid solutions to solve potential disputes (Caspary 2018, 29). On the one hand, the potential conflicts of family businesses are (sometimes) very complex and risky due to their structural characteristics (integration of the systems “family”, “property” and “company”). On the other hand, from this interdependence do also derive many opportunities (Hastenteufel and Staub 2019, 189). As a peculiarity of family businesses, it can therefore be argued that they often require separate con- flict resolution strategies for both the company and the family (Schlippe 2009, 26). Above all, due to the close integration of the family and entrepreneurial sphere, family businesses are highly susceptible to conflict, as a result of which the necessity of integrating active and preventive conflict management into the general management of a company becomes clear (Henseler 2006, 180).

7. Financial stability of family businesses

Securing financial stability is also an important premise for family businesses to maintain their independence. A large number of studies show that family

126 Current and future challenges of family businesses

businesses do generally have high equity ratios, because entrepreneurial families usually seek to be as flexible and liquid as possible to be able to act (Schraml 2010, 268). In addition, it can be seen that the willingness of family businesses to react in times of economic difficulties was usually better than that of non-family businesses (Hastenteufel and Staub 2019, 189). In this context, it should be noted that family-owned companies have a high equity ratio compared to non-family- owned companies. This is explained in particular by the fact that the families regularly invest private capital in the company, as they want to be less dependent on external creditors (Ampenberger et al. 2011, 15). In general, family businesses use internal sources of finance as a primary financing instrument, which is illustrated by a partial reluctance to take on debt (Achleitner et al. 2011, 11). One explanation for this is an increased risk aversion of family businesses (Ampenberger et al. 2011, 15). Furthermore, these trends can also be traced back to economically difficult times in which there are restrictions on access to debt capital. In particular, companies that are successful in the mar- ket have a low number of bank loans with at the same time high shares of equity (May 2013, 30). It is striking that family businesses tend to limit themselves to a number of different types of financing, so that their financing situation is man- ageable and better controllable, so that they are more independent (Hastenteufel and Staub 2019, 190).

8. Concluding remarks

The paper shows that, especially in times of globalization, digitization and increased innovation requirements, it is essential for companies to take on the tasks involved in order to be competitive in the long term and to be successful in the market. The specific challenges faced by family businesses are to position their business models in a field of conflict between a generally strong sense of tradition on the one hand and growing innovative developments on the other hand, so that their values are preserved and at the same time remain future- oriented. Besides, the leadership of the company by family members who tend to act in appropriate positions over a longer period plays an important role and is a major factor in securing long-term success. However, family businesses are increasingly struggling with the requirements of careful business succession planning. Thus, it is highly relevant to implement appropriate preventative measures such as emergency manuals. In this context, it is evident that family-owned companies are, in principle, aware of the shortage of skilled workers and are paying more attention to follow-up issues, although they do often require a more intensive examination of this topic.

127 Jessica Hastenteufel, Mareike Staub

The success criteria of family businesses are based on the valuable resource “family”, their unique corporate culture as well as the firmly anchored values in the company and the family. In recent years, owner families have visibly under- stood that these traditions and values must be manifested across generations through the co-development of adequate tools. Therefore, family business gover- nance mechanisms, which bring a multitude of advantages and can be designed in many different ways, must be implemented. In particular, the establishment of a family constitution, which is the foundation of the entire family business, is very important for the owner family. Companies must also consider the need for continuous adaptation and development to changing environmental conditions otherwise their future is at risk. The implementation of a well-functioning conflict management is particularly important in the context of the increased risk of conflicts in family businesses. In doing so, they are resistant to any disputes and crises and can mitigate potential dangers. Furthermore, solid financing structures are also an important success factor of family businesses. Typically, they have a strong and exceptional awareness of how to maintain their financial stability, which is reflected in their preference for less risky financing strategies. This has proven to be very helpful, especially in economic difficult times.

References [1] Achleitner, A.K., Kaserer, C., Günther, N. and Volk, S. (2011) Kapitalmarkt- fähigkeit von Familienunternehmen, [Online], Available: https://www.pwc. de/de/mittelstand/ assets/sfu_studie_kapitalmarkt_e4_za.pdf [31 Oct 2019]. [2] Ampenberger, M., Schmid, T., Achleitner, A.K. and Kaserer, C. (2011) Capital Structure Decisions in Family Firms – Empirical Evidence from a Bank-Based Economy, [Online], Available: https://papers.ssrn.com/sol3/papers.cfm?ab- stract_id=1364153 [31 Oct 2019]. [3] Astor, M., Rammer, C., Klaus, C. and Klose, G. (2016) Innovativer Mittel- stand 2025 – Herausforderungen, Trends und Handlungsempfehlungen für Wirtschaft und Politik, [Online], Available: http://ftp.zew.de/pub/zew-docs/ gutachten/InnovativerMittelstand2025ZEWPrognos2016.pdf [31 Oct 2019]. [4] Bartels, P., Hochberg, and Müller, C. (2017) Strategien erfolgreicher Familien­ unternehmen: Konservative Innonatoren, [Online], Available: https://www. pwc.de/de/ mittelstand/assets/strategien-erfolgreicher-familienunternehmen. pdf [31 Oct 2019]. [5] Bauer, T. (2013) Innovationen in Familienunternehmen: Eine empirische Untersuchung, Wiesbaden: Springer Gabler. [6] Binder, A. (2014) Innovation in erfolgreichen Familienunternehmen: Unter- suchung der frühen Phase von Innovationen in der chemischen Industrie, Wiesbaden: Springer Gabler.

128 Current and future challenges of family businesses

[7] Brass, T. (2015) Organisation des Nachfolgeprozesses, in: Unternehmens- nachfolge: Praxishandbuch für Familienunternehmen, Wiesbaden: Springer Gabler. [8] Breyer, M. (2015) ‘Family Governance – Die Organisation der Familie’, Zeit- schrift für Familienunternehmen und Strategie, vol. 4, pp. 151–154. [9] BMWi (2017) Weissbuch – Digitale Plattformen, [Online], Available: https:// www. bmwi.de/Redaktion/DE/Publikationen/Digitale-Welt/weissbuch-digitale- plattformen.pdf ?__blob=publicationFile&v=22 [31 Oct 2019]. [10] Cabrera-Suárez, K., De Saá-Pérez P., and García-Almeida, D. (2001) ‘The Suc- cession Process from a Resource and Knowledge-Based View of The Family Firm’, Family Business Review, vol. 14(1), pp. 37–48. [11] Canessa, B. and Weber, C. (2016) ‘Vermögensverwalter und Unterstützer’, in Canessa, B. et al. (ed.) Das Family Office – Ein Praxisleitfaden, Wiesbaden: Springer Gabler, pp. 3–39. [12] Caspary, S. (2018) Das Familienunternehmen als Sozialisationskontext für Unternehmerkinder, Wiesbaden: Springer Gabler. [13] Cravotta, S., Grottke, M. and König, A. (2018) ‘Neue Geschäftsmodelle für Familienunternehmen: Identifikation, Evaluation und Optimierung mithilfe der Ansätze Lean Startup und Design Thinking’, Zeitschrift für Familien- unternehmen und Strategie, vol. 1, pp. 10–14. [14] Duran, P., Kammerlander, N., van Essen, M. and Zellweger, T. (2015) ‘Doing More with Less: Innovation Input and Output in Family Firms’, Academy of Management Journal, vol. 59(4), pp. 1224–1264. [15] Evers, M. (2017) Unternehmensnachfolge – die Herausforderung wächst, [Online], Available: http://www.vdb-info.de/media/file/18921.DIHK-Report_ Unternehmensnach folge_2017.pdf [31 Oct 2019]. [16] Felden, B., Graffius, M. and Soost, C. (2018) ‘Nachfolge in West- und Ost- deutschland’, Zeitschrift für Familienunternehmen und Strategie, vol. 2, pp. 58–62. [17] Felden, B. and Hack, A. (2014) Management von Familienunternehmen, Wiesbaden: Springer Gabler. [18] Felden, B. and Marwede, L. (2018) ‘Family Business Governance – Wie Fa- milienunternehmen langfristig verantwortungsvolles Handeln sicherstellen’, in: Altenburger, R. and Schmidpeter, R. (ed.), CSR und Familienunterneh- men: Gesellschaftliche Verantwortung im Spannungsfeld von Tradition und Innovation, Wiesbaden: Springer Gabler, pp. 59–68. [19] Halder, A. (2016) Innovationsfähigkeit und Entrepreneurial Orientation in Familienunternehmen: Der Familieneinfluss und die Rolle des Familien- unternehmers, Wiesbaden: Springer Gabler. [20] Hastenteufel, J. and Staub, M. (2019) ‘Aktuelle und zukünftige Herausfor- derungen von deutschen Familienunternehmen’, Der Steuerberater, vol. 5, pp. 134–139.

129 Jessica Hastenteufel, Mareike Staub

[21] Hastenteufel, J. and Staub, M. (2019) ‘Family Business Governance, Kon- fliktmanagement und finanzielle Stabilität als wesentliche Erfolgsfaktoren deutscher Familienunternehmen!?’, Der Steuerberater, vol. 6, pp. 185–190. [22] Heider, A. (2017) Unternehmenskultur und Innovationserfolg in Familien- unternehmen, Wiesbaden: Springer Gabler. [23] Henseler, N. (2006) Beiräte in Familienunternehmen: Eine kritische Betrach- tung der Ausgestaltung der Beiratsarbeit, [Online], Available: https://d-nb. info/981806953/34 [31 Oct 2019]. [24] Hille, M., Janata, S. and Michel, J. (2017) Familienunternehmen im digitalen Wandel – Handlungsfelder und Strategien zwischen Tradition & Disruption, [Online], Available: https://digitales-wirtschaftswunder.de/wp-content/up- loads/2016/07/Studie-Crisp-Resea rch_QSC_Familienunternehmen-im-digi- talen-Wandel.pdf [31 Oct 2019]. [25] Icks, A., Schröder, C., Brink, S., Dienes, C. and Schneck, S. (2017) Di- gitalisierungsprozesse von KMU im Verarbeitenden Gewerbe, [Online], Available: https://www.ifm-bonn.org//uploads/tx_ifmstudies/IfM-Materiali- en-255_2017_02.pdf [31 Oct 2019]. [26] IfM Bonn (2017) Die größten Familienunternehmen in Deutschland, [On- line], Available: https://www.ifm-bonn.org//uploads/tx_ifmstudies/Chart- book_2017.pdf [31 Oct 2019]. [27] Kammerlander, N. and Prügl, R. (2016) Innovation in Familienunternehmen: Eine Einführung für Akademiker und Praktiker, Wiesbaden: Springer Gabler. [28] Kempert, W. (2008) Praxishandbuch für die Nachfolge im Familienunter­ nehmen: Leitfaden für Unternehmer und Nachfolger. Mit Fallbeispielen und Checklisten, Wiesbaden: Springer Gabler. [29] Kenyon-Rouvinez, D. (2012) ‘Familienunternehmen richtig führen’, in Koeberle-Schmid, A. et al. (ed.), Family Business Governance: Erfolgreiche Führung in Familienunternehmen, 2nd ed., Berlin: ESV, pp. 91–107. [30] Koeberle-Schmid, A. and Koners, U. (2018) ‘Das Management der Familie’, Zeitschrift für Familienunternehmen und Strategie, vol. 1, pp. 21–25. [31] Koeberle-Schmid, A., Witt, P., and Fahrion, H. (2010) ‘Gestaltung der Go- vernance im Familienunternehmen: Gremien und Instrumente der Business und Family Governance’, Zeitschrift für Corporate Governance, vol. 4, pp. 161–169. [32] Koeberle-Schmid, A., Witt, P., and Fahrion, H. (2012) ‘Family Business Gover- nance als Erfolgsfaktor von Familienunternehmen’, in Koeberle-Schmid, A. et al. (ed.), Family Business Governance. Erfolgreiche Führung in Familien- unternehmen, 2nd ed., Berlin: ESV, pp. 26–44. [33] Kreide, R. (2016) ‘Digitalisierung: Erfolgsfaktor Familienunternehmen’, Unternehmeredition, vol. 3, pp. 70–71.

130 Current and future challenges of family businesses

[34] Lueger, M. and Froschauer, U. (2015) ‘Mediation und externe Beratung: Strategien der Konfliktbearbeitung in Familienunternehmen’, in Lueger, M. and Frank, H. (ed.), Zukunftssicherung für Familienunternehmen. Good Practice Fallanalysen zur Family Governance, Wien: Facultas, pp. 33–69. [35] May, P., (2013) ‘Erfolgsmodell Familienunternehmen – Vorbilder für Deutsch- land’, in May, P. and Förster, N. (Hg.), Vorbilder für Deutschland: Die erfolg- reiche Welt der Familienunternehmer, Bonn/Hamburg: Murmann, pp. 18–57. [36] Nagl, A. (2015) ‘Was muss ich bei der Regelung der Nachfolge bedenken?’, in Nagl, A. (ed.), Wie regele ich meine Nachfolge?, 2nd ed., Wiesbaden: Springer Gabler, pp. 19–35. [37] Neuvians, N. (2011) Mediation in Familienunternehmen: Chancen und Gren- zen des Verfahrens in der Konfliktdynamik, Wiesbaden: Springer Gabler. [38] Papesch, M. (2010) Corporate Governance in Familienunternehmen: Eine Analyse zur Sicherung der Unternehmensnachfolge, Wiesbaden: Springer Gabler. [39] Pirmanschegg, P., (2016) Die Nachfolge in Familienunternehmen, Wiesbaden: Springer Gabler. [40] Schlippe, Arist v. (2009) ‘Schwierige Schwingungen: Konflikte in Familien- unternehmen’, Unternehmermagazin, vol. 1, pp. 26–27. [41] Schlippe, Arist v. (2012) ‘Psychologische Aspekte der Unternehmensnachfol- ge’, Zeitschrift für Corporate Governance, vol. 5, pp. 170–175. [42] Schraml, S. (2010) Finanzierung von Familienunternehmen: Eine Analyse spezifischer Determinanten des Entscheidungsverhaltens, Wiesbaden: Sprin- ger Gabler. [43] Spelsberg, H. (2011) Die Erfolgsfaktoren familieninterner Unternehmens- nachfolgen: Eine empirische Untersuchung anhand deutscher Familienunter- nehmen, Wiesbaden: Springer Gabler. [44] Steinbrecher, H. and Hyde, S. (2016) Great expectations: The next generation of family business leaders, [Online], Available: https://www.pwc.com/gx/en/ family-business-services/publications/assets/next-gen-report.pdf [31 Oct 2019]. [45] Süss-Reyes, J. (2015) ‘Family Governance: Sicherung der Zukunftsfähigkeit von Familienunternehmen’, in Lueger, M. and Frank, H. (ed.), Zukunftssi- cherung für Familienunternehmen – Good Practice Fallanalysen zur Family Governance, Wien: Facultas, pp. 13–31. [46] Weissman, A. and Wegerer, S. (2018) Digitaler Wandel in Familienunter­ nehmen: Das Handbuch, Frankfurt am Main: Campus. [47] Welge, M. and Eulerich, M. (2014) Corporate-Governance-Management: Theorie und Praxis der guten Unternehmensführung, 2nd ed., Wiesbaden: Springer Gabler.

131 Jessica Hastenteufel, Mareike Staub

[48] Werner, A., Schröder, C. and Mohr, B. (2013) Innovationstätigkeit von Fa- milienunternehmen, [Online], Available: https://www.ifm-bonn.org/uploads/ tx_ifmstudies/IfM-Materialien-225_01.pdf [31 Oct 2019]. [49] Wiesehahn, A. (2015) ‘Empirische Studie: Erwartungen an die Unterneh- mensnachfolge’, in Wegmann, J. and Wiesehahn, A. (ed.), Unternehmens- nachfolge: Praxishandbuch für Familienunternehmen, Wiesbaden: Springer Gabler, pp.15–45. [50] Windthorst, K. (2018) Digitalisierung – eine besondere Herausforderung für Familienunternehmen, [Online], Available: http://www.familienunternehmen. de/de/die-unternehmer-meinung/prof-dr-kay-windthorst [31 Oct 2019]. Managerial Economics 2019, vol. 20, No. 2, pp. 133–150 https://doi.org/10.7494/manage.2019.20.2.133

Katarzyna Kowalska-Jarnot*

Selected methods of studying a college’s image

1. Introduction

The contemporary education market in Poland is shaped, among others, by two important phenomena. The first of them is demographic decline, resulting in fewer and fewer students. The other is Poland’s EU membership, which has provided Polish high school graduates with a possibility to study at European col- leges. As a result, most Polish colleges have more places available than candidates, and potential students can choose which college they want to study at. In such a situation, image and reputation are becoming increasingly significant, and next to price they appear to be the main factor influencing the decision. As shown in Dowling (2001) strong, positive image provides a number of tangible benefits to a college including appropriate positioning on buyers’ perception map, forming strong bonds with students and employees and building a relationship with the environment based on trust and credibility. It significantly reduces the risk for the service buyer, since a recognizable image is a kind of a guarantee of the right choice of studies at a particular college. Consequently, colleges are taking action to create a desired image and on this basis shaping their strategies for functioning and development. It is a multifaceted process because image creation affects the quality of an entire company. There are close ties between intangible assets. Image or reputation do not exist in isolation from the institution with which they are associated. In fact, they constitute the last link in the process, which begins with the definition of the organizational values and philosophy, a properly formulated mission, the strategy derived from it, leadership activity and appropriate actions in the area of human capital. Therefore, they result from managing a college in a way that is consistent with its goals and identity and are eventually reflected in the recipients’ perception. As Black (2005) says, identity can be defined as

* College of Economic and Computer Sciences, Cracow, e-mail: [email protected].

133 Katarzyna Kowalska-Jarnot

a composition of various factors derived from management style, business, his- tory, philosophy, strategy, employee behaviour, quality control and culture. In other words, the selection of the discriminants is not limited. The only criterion is their purposefulness and contribution to achieving the ultimate goals of the identity strategy, which are image, credibility and reputation. In this approach, the image is a projection of identity among its target groups or a mental representa- tion of the organization in the minds of recipients (Altkorn 2004). Reputation, in turn, is a qualified form of image. It has the characteristics of a certain durability, higher than in the case of an image. The buyer will expect an organization with a particular reputation to act in a predictable manner, in accordance with a clearly established imperative. Therefore, the condition for having a positive reputa- tion is reliability and the same unchanging standard of functioning. The effect of good reputation is the credibility of the organization among the recipients. One of the elements which distinguishes reputation from image is the time of creation. According to Zarębska (2006) reputation is thus “an assessment of the consistency in the communication of an organization’s qualities in the long term.” Due to this, the building blocks of a college’s identity and its image consist of its key distinguishing features such as: educational offer, quality of the educational service, organizational culture, quality of communication, international coopera- tion, relations with employees, visual identity or individual sets of competences such as student service, lecturers’ competences and others. Effective influence of a college on how it is perceived by its stakeholders requires systematic image research. Given the heterogeneous structure of image, this research should be comprehensive and involve the study of individual distinguishing features. The purpose of this paper is to propose selected research methods that may be ap- plicable in this field.

2. Objectives and areas of image research

Image research achieves some of principal objectives of a college. These include, among others: – identifying the current image, – degree of mastery of the college’s own competences in the assessment of various stakeholder groups, – comparing the college’s image with the images of other colleges, – determining the dynamics of the college’s own competitive position against the background of competitors’ position, – creation of premises for activities improving the company’s operations. Below selected methods and research techniques are proposed to achieve the declared objectives (Tab. 1)

134 Selected methods of studying a college’s image

Table 1 Selected research methods and techniques used to achieve the specific goals of college image research

Objective of the image research Selected research methods and techniques Qualitative research (FGI, IDI) Projection methods Chinese portrait Collage method Brandsights core Identification of the current PCD technique (perceptual circle diagram) image of the college Survey research Knowledge and kindness research Trommsdorf model Fishbein model Semantic profiling method Model I–D–U (importance–delivery–uniqueness) Qualitative research Servqual analysis The degree of own competence Mystery shopping in the assessment of various CIT – critical incident technique stakeholders Importance–Performance Analysis (i.a. IPA) Job satisfaction surveying methods (i.a. JCI) Comparing the college’s image Perception maps with the images of other colleges Semantic profiling method Determining the dynamics of Service position matrix compared to the the college’s own competitive competition position against the background Close and differential positioning of competitors’ position Creation of premises for activities improving the company’s Evaluation surveys (including post-training) operations Research on the effectiveness Media monitoring (clipping) of the college’s PR activities Online image research

Source: author’s own research

3. Qualitative research

At the image diagnosing stage, it is common to use the methods that collect information on attitudes towards the college, knowledge about it, associations related to it, recognition of its trademark and awareness of its services. These

135 Katarzyna Kowalska-Jarnot

tests are carried out using both qualitative and quantitative methods. Qualitative research, focus group interviews in particular, helps to capture those aspects of the image that relate to emotions and attitudes. The form of the interview supports the associations research, during which questions such as “What do you associate the name of the college with?” or “Is the name suitable?” are asked. A variation of it is the test known as the Chinese portrait. The respondents are asked to describe the college as a completely different entity (e.g. as an animal, plant, car, etc.). This method employs the psychological transfer technique. The collage method is based on a team creation of an artistic composition and arranging pictures cut out from newspapers into an image which is then glued onto a large sheet of paper. The obtained piece allows us to capture associations that tend to escape during too careful intellectual processing. Similarly, in the brandsights core survey the respondents choose the image (among many others) that best reflects the college. The PCD technique (perceptual circle diagram) allows one to identify the structure of the image. The survey begins with an open question: “What factors make up the company’s image?” The respondents receive a drawing of a circle divided into 12 parts and are asked to write down in the individual fields various factors which they are able to identify. Then, they rank individual factors and evaluate them by writing down marks outside the circle (e.g. ++, +, 0, -, -). This makes it possible to distinguish the identity characteristics which are the most visible to the respondents and the emotional assessment related to them (Budzyński 2002). It should be emphasized that qualitative techniques also pose some risks. The mere fact of the question being asked and the manner in which the ques- tions are formed, as well as the circumstances in which they are asked, are not indifferent to the current opinions of the respondents. Even small differences in the way questions are formulated result in significant differences. Two identical answers provided to the same question do not mean that the attitude / support for the organization is identical for both of the respondents. As shown in Wójcik (2013) one may be real, the other may be an expression of striving for conformity

4. Quantitative research

Relatively simple methods of studying the image of a college include assess- ing the level of knowledge and attitude (favourability / friendliness) towards it using a 5-point Likert scale (Tab. 2). If a respondent answers the first question (knowledge of the college) and ticks one of the first two answers (a and b), it means the task the organization

136 Selected methods of studying a college’s image

is facing is to build greater awareness. On the other hand, respondents who are familiar with the company (answers c, d, e) can then be asked using the favour- ability scale about their attitude towards the university. Answers a and b on the favourability scale indicate a significant problem of the negative image of the college and suggest the need for further in-depth research that would provide detailed information on the image components assessed negatively (Kotler 2005).

Table 2 Sample scale of awareness and favourability towards a college

An example of the awareness scale An example of the favourability scale 1. My knowledge of college X: 1. My attitude towards college x is: a) I’m completely unfamiliar with a) very unfavourable b) I’ve only heard of b) rather unfavourable c) I know a little about c) indifferent d) I’m in a sense familiar with d) rather favourable e) I’m well familiar with e) very favourable

Source: author’s own research

The Trommsdorf model, in turn, is a technique for collecting opinions based on direct questions (Tab. 3). It allows to form an opinion about the ex- amined college, it also makes it possible to assess to what extent this opinion is consistent with visions of an ideal college and to what extent it deviates from them (Altkorn 2003).

Table 3 Trommsdorf model (sample question)

College X has good reputation. fully agree 1 2 3 4 5 6 7 fully disagree An ideal college is one with good reputation. fully agree 1 2 3 4 5 6 7 fully disagree

Source: author’s own research

The Fishbein model is similar to the above. In this case, the measurement is based on a subjective assessment of the probability of revealing a given feature and its significance for the respondent. Respondents must answer a range of similarly formed questions. The sum of the partial assessments makes up the overall image of an organization in the opinion of the study group (Tab. 4.).

137 Katarzyna Kowalska-Jarnot

Table 4 Fishbein Model (sample question)

Please rate the probability of the following: When choosing a college, relaying on its reputation is for you: highly likely 1 2 3 4 5 6 7 highly unlikely very important 1 2 3 4 5 6 7 not important at all

Source: author’s own research based

In the area of organization image, diagnosing semantic profiling is a com- monly used method. The questionnaire consists of a number of bipolar (usually seven-grade) scales marked with opposite terms (antonyms) that could describe an organization. The respondent marks a value on the numerical scale, which ac- cording to them reflects the intensity of a given feature. Wójcik (2013) underlines that in creating a semantic profile, it is possible to distinguish the following stages: – identifying, as a result of previous tests, (usually qualitative) the criteria / characteristics that a given target group is guided by when assessing the im- age of an organization; – opposing answers are created for selected (most frequently given) criteria, e.g. “renowned / not renowned” or “traditional / modern” college; – the researcher may attempt to reduce the number of evaluation criteria fol- lowing the recommendation of the method creator C.E. Osgood, who named three basic criteria and their corresponding scales: rating (good-bad), inten- sity (strong-weak), activity (active-passive); however, as noted above, this is only applicable when necessary, which means it applies as long as it meets the objectives of the study and the specificity of the examined organization (Altkorn 2003); – the respondent is asked to evaluate the subject of the image, but the ques- tionnaire is arranged in such a way that the negative grades are not found in one group and the positive ones in the other one; – the average values from the responses related to individual criteria are con- nected by a line that is intended to illustrate the overall image of the orga- nization under study among the group of respondents; – at the end, the discrepancy between the groups of respondents is identified (in the case when they represented different target groups). Basically, according to Mazurek-Łopacińska (2008) the number of degrees on the scale should not exceed 7 so as not to make it difficult for respondents to as- sess the intensity of a given feature. As J. Altkorn states, for this reason multistage scales are rather rarely used. The exceptions involve the case in which fragmentary images are used (fragmentary assessment of a given function) or one in which

138 Selected methods of studying a college’s image

assessed features have a large range and defining the degree of their intensity is not difficult (e.g. financial resources, market share). The results obtained in the study allow to identify the gap between the level of assessment of a given feature expected by the organization and the rating is- sued by respondents. The further stage of image designing requires asking the following questions for each of the criteria: – what contribution to overall image improvement would filling the gap to the indicated level provide? – what strategy could be used to fill the gap? – what would be the cost of filling the gap? – how long would it take to fill the gap? The profile can also be used to capture the differences between the following images: real – mirror – optimal and desirable. Such research conducted inside and outside a college allows you to “photograph” the image in various target groups, and then compare it with the desired image that the college is heading to. The distance between these images (gap width) can be reduced gradually by designat- ing smaller sections to change, the so-called optimal images, which are possible to achieve within a given period of time. In the case of image strategy such a policy of “small jumps” is optimal. It facilitates more thorough control of the process and does not result in discomfort related to rapid changes among the recipients. Another use of the profile is to compare the ratings of several competing organizations in terms of the same image attributes. It makes it possible to as- sess your position in relation to the competitors on each of the compared fields. Also in this case, the decisions related to levelling of the gap will be based on the composition of attributes in each of the compared organizations. If the college gets worse results than the competitors in the aspects which are not crucial in its case, and better ones in the aspects on which it builds its identity and image, then the research effect will not constitute a premise for change (Fig. 1).

Figure 1. College image research. Semantic differentiation profile (fragment) Source: author’s own research based

139 Katarzyna Kowalska-Jarnot

5. Assessment of the individual components of a college’s image

Image research can also take advantage of the I–D–U model (Importance– Delivery–Uniqueness) developed by J.R. Rossiter and L. Percy (1996). It involves assessing selected features of a college in terms of three main criteria. These include the criterion of importance of a given feature, the criterion of the degree of its implementation (delivery) and its uniqueness. In a way, it makes it possible to evaluate a given attribute in a three-dimensional manner. In the result table presented below, a college’s extensive education program has been rated quite high on the importance scale and low on the uniqueness scale. Competent staff including acclaimed scientists, well-known figures and authorities was in turn highly rated in terms of both importance and uniqueness. This clearly indicates the distinctive value of this attribute. The performance marks estimate the degree to which a college fulfils the promise related to particular attributes, and they provide a vital indication for corrective action (Tab. 5).

Table 5 Example results obtained in the I–D–U study

Selected feature Importance (I) Delivery (D) Uniqueness (U) Extensive 6.4 6.1 3.8 education program Renown lecturers 8.8 7.1 9.0 (acclaimed, authorities)

Source: author’s own study based on research carried out for the author’s doctoral dissertation, 2012

Creating perception maps is another method of testing the image and com- petitive position of a college in the awareness of stakeholders. A perception map is a two-dimensional coordinate system whose axes correspond to the adopted assessment measures. These measures constitute defined attributes (selected features) of the college, such as attractiveness of the offer, quality of education, prestige etc. As shown in Daszkiewicz (2008) due to the approach used in creating maps, we distinguish: – methods based on previously identified assessment dimensions, which include factor analysis, discriminant analysis and correspondence analysis methods; – methods not based on pre-defined assessment criteria, i.e. scaling methods.

140 Selected methods of studying a college’s image

The methods mentioned above differ in the ways of constructing assessment dimensions. Using the first approach, respondents assess a college according to pre-defined measures, and the grades obtained are later aggregated into more general measures (for example the quality of the dean’s office’s work). On the other hand, the purpose of scaling methods is to determine the position of the image of a given company as compared to images of the competitors, especially close ones, in the stakeholder’s minds (these methods allow the so-called sum- mary techniques, where image is assessed without a list of previously defined factors and all attributes are assessed collectively). Out of the mentioned methods of creating perception maps, the most com- monly used one is factor analysis. When using the factor method, respondents usually use 2 or 3 aggregate measures. This analysis determines whether the image of a college in terms of the tested attributes is better or worse than the average image in a given group (it assesses the organization in comparison to the whole community). A perception map makes it possible for various groups of stakeholders to assess the degree of competence (Fig. 2).

4,6

4,4 SGH

4,2

4,0 UEKr 3,8 Ak.L.K. UEPUEW 3,6

3,4

quality of education UEKa 3,2 AK.F.WSFiZ UŁa. 3,0 SWSPiZ 2,8

2,6 2,4 2,6 2,8 3,0 3,2 3,4 3,6 3,8 4,0 4,2 4,4 4,6 prestige

Figure 2. A two-dimensional perception map of 10 selected economic colleges in Poland – an assessment of two factors: prestige and quality of education Source: own study based on research carried out for the author’s doctoral dissertation, 2012

141 Katarzyna Kowalska-Jarnot

6. Education quality research

An important component taken into account when assessing the image of a college is the study of the quality of the educational service as assessed by stu- dents (Helgesen and Nesset 2007). The SERVQUAL analysis is among the meth- ods which may be used in this case. The basis of this method assumes that the student (client) has specific expectations towards the service. They result from their previous experience, knowledge and information. When the service is con- sumed, these expectations are confronted with reality, i.e. with the perception of the actual reception of the service. As a result, a gap representing the difference between the degree to which the expectations were met, and the perception of the service is created. The gap width provides the information about the degree of customer dissatisfaction. High quality of customer satisfaction is thus a situation where no gaps are found. Therefore, it is necessary to strive to reach total quality (TQ). The authors of the method, Parasuraman, Zeithaml and Berry, developed a model of five gaps in service quality. According to the servqual method (2019) these are the following: – Gap 1 – the discrepancy between what the management and the customers consider to be important in reaching product satisfaction, – Gap 2 – the discrepancy between the product concept and its specific imple- mentation in the project, – Gap 3 – the discrepancy between a product design and its actual material form, – Gap 4 – the discrepancy between a product and the advertising promises related to it, – Gap 5 – the discrepancy between the customers’ expectations and their perception of the product. The size of all the gaps listed above has an impact on the quality of the service provided, but only in the last case is the actual customer reception of the service assessed. Servqual analysis is used to examine the last, fifth gap. The questionnaire consists of a list of features related to the quality of the service. The first column contains the rating scale of the desired level of features of an “ideal” service expected by customers. In the second column, the respondent assesses the implementation of these features in reality. As suggested by the authors of the methods, the features relevant to a given service are grouped according to several dimensions as follows: – specificity – physical parameters, equipment and staff appearance, – service reliability – ability to deliver the service carefully and accurately, – willingness to cooperate – readiness to help recipients and provide them with punctual service,

142 Selected methods of studying a college’s image

– confidence – knowledge and kindness of employees and their ability to in- spire trust, – empathy – considerate, personalized service that the organization provides to the customers. Due to its universal character and the possibility of using criteria typical to each industry, the SERVQUAL method has also found application in investigating of educational services quality. In the literature, it is possible to find publications on the effects of using the method, including cases of language schools (Nierz- wicki and Rudzik 2003) and colleges (Targaszewska 2013). Below is an example of a fragment of the questionnaire used in the author’s own research carried out at 18 colleges of economics in Poland in 2012 (Tab. 6) The questionnaire includes 22 statements related to each of the five dimensions of the educational service. The questionnaire consisted of two parts and the statements were repeated in each of them. In the first one, students rated their expectations on a scale of 1–7, in the other one – they rated the actual implementation of the statement.

Table 6 Examples of statements that may constitute criteria for assessing the educational services quality using the Servqual scale (a fragment of a sample questionnaire)

I rate my expected I rate my current The criterion degree of satisfaction as degree of satisfaction as Lecturers’ knowledge 1 2 3 4 5 6 7 1 2 3 4 5 6 7 Accessible way of transferring 1 2 3 4 5 6 7 1 2 3 4 5 6 7 knowledge The usefulness of the transferred 1 2 3 4 5 6 7 1 2 3 4 5 6 7 knowledge in future work Lecturers’ sense of duty 1 2 3 4 5 6 7 1 2 3 4 5 6 7 Kindness of lecturers towards 1 2 3 4 5 6 7 1 2 3 4 5 6 7 students Individual approach to a student 1 2 3 4 5 6 7 1 2 3 4 5 6 7 Share of practical classes 1 2 3 4 5 6 7 1 2 3 4 5 6 7 Modern didactic methods 1 2 3 4 5 6 7 1 2 3 4 5 6 7 (e.g. e-learning) Availability of the dean’s office 1 2 3 4 5 6 7 1 2 3 4 5 6 7 Attractive college facilities 1 2 3 4 5 6 7 1 2 3 4 5 6 7 (e.g. teaching rooms, library) Appearance of college 1 2 3 4 5 6 7 1 2 3 4 5 6 7 administrative staff

Source: author’s own research

143 Katarzyna Kowalska-Jarnot

According to Kieżel (2008) there are many critics of the Servqual analysis. They point out that individual event-statements in research may be understood in different ways by the subjects of the survey and that it is difficult to remain unambiguous. Therefore, the number of points on the measuring scale is not par- ticularly important, as both the expectations and the perception are not precisely defined, and each client can interpret the content differently and quite freely. In addition, a creator of the survey to some extent implies their own perception of the services. Some researchers are of the opinion that in research on the quality of services it is enough to measure only the segment related to product percep- tion, and it is superfluous to measure the expectations. Such an assumption gives basis to the Servpref scale. It is a scale that consists of 22 statements measuring only the perception of a product (Sagan 2019). Another, albeit quite controversial, method of testing the quality of service is Mystery Shopping, which involves observing a provided service by unbiased, inde- pendent “agents”. The observation is carried out in concealment (i.e. an employee of the organization does not know that they are the object of the observation). The agent uses the services as a regular customer, and then completes the survey assessing specific aspects of the service. As shown in Dziadkowiec (2006) despite the obvious ethical doubts related to the secret nature of the observations (often interpreted as a lack of trust towards employees), the literature emphasizes the benefits of this research method. The introduction of an independent agent is de- signed to eliminate the factors of “emotion and acting in the heat of the moment,” which a customer may be vulnerable to. The agent is an external, impartial person who observes subsequent elements of the service in accordance with a previously prepared questionnaire. On the other hand, the employee does not know that they are being watched so they do not try to behave differently than usual. After the examination, the MS agent completes the survey. The criteria included in the survey should be adapted to the needs of the organization under study. In the case of higher education, it is recommendable to carry out MS research at a recruitment point, where independent, impartial persons claiming to be po- tential clients check the approach of the recruiters (Kulig 2006). Considering the fact that recruitment departments more and more frequently function (should function) as efficient sales departments, it is possible to examine whether the approach to the client is appropriate and professional. The previously identified relevant quality factors for this service will constitute a research “guide” (Tab. 7). One of the research techniques used within Mystery Shopping is the so-called secret contact, which is based on an analysis of the communication system (Pluta- ‑Olearnik 2008). Researchers contact the recruitment department using the avail- able communication channels (telephone, e-mail, website, instant messengers, conventional mail) and examine both the patency of communication channels (e.g.

144 Selected methods of studying a college’s image

connection speed) and the behaviour of the college employees (e.g. quickness or quality of response). The method has many advantages, although its opponents raise allegations which are mainly related to its secret nature.

Table 7 Criteria for the evaluation of first contact employees

Aspect assessed Assessment criteria number of first contact staff members required appearance of the employees First impression waiting time for service ability to meet individual requirements ability to listen Customer service skills asking questions to identify customer needs showing understanding knowledge about the products sold knowledge of the sales policy Communicated knowledge technical specifications knowledge about the product ability to answer product related questions willingness to search for information confidence building Interpersonal skills ability to communicate in a clear way establishing and maintaining eye contact empathy tailoring the offered solutions to individual needs Product recommendation suitability of the proposed solutions ability to present product advantages enthusiasm in presenting offers ability to encourage purchase Sales strategy ability to effectively present various options general professionalism

Source: (Finn 2001)

Job Satisfaction Inventory. Satisfaction surveys have a wider application apart from research among customers. From the point of view of a college’s identity and image, its employees appear to be an extremely important target group. In this case, these are the employees of administrative departments and lecturers. Certain foreign colleges consciously create an image of an excellent employer, remembering that an employee is the most immediate “brand ambassador”. In Poland, the situation in this respect is quite complex. A large number of universi- ties, low salaries of teaching staff and the demographic decline do not encourage

145 Katarzyna Kowalska-Jarnot

investing in staff, especially in the case of smaller and non-public institutions where the decrease of demand will mean a struggle for survival. Despite this, it is impossible to build a lasting competitive advantage on the market without the support of employees (Mruk 2004). In turn, efficient management of staff requires feedback on their satisfaction with their work. Acquiring the information in this case may take the form of an internal communication audit conducted in an informal or formal way using group and individual interviews (in particular with regard to professors). It is also possible to use a questionnaire. The JCI scale presented below can be used in education, for example for the purpose of examining administrative employees (Tab. 8). The JCI scale consists of 4 dimen- sions of job satisfaction defined by 14 Likert scale statements: – information flow (items 1–4), – diversity and sense of freedom (items 5–10), – achieving and carrying out tasks (items 11–12), – salary and a sense of security (items 13–14). The satisfaction score results from summing up marks and calculating aver- age scores in employee surveys for each dimension separately.

Table 8 A sample JCI questionnaire for colleges

1. I am satisfied with the information I receive Fully agree Fully disagree from my supervisor about my tasks 7 6 5 4 3 2 1 2. I receive enough information Fully agree Fully disagree about my tasks at work 7 6 5 4 3 2 1 3. I receive sufficient feedback Fully agree Fully disagree about the quality of my work 7 6 5 4 3 2 1 4. The college offers sufficient self-assessment Fully agree Fully disagree possibilities 7 6 5 4 3 2 1 5. I am happy with the variety of activities Fully agree Fully disagree in my job 7 6 5 4 3 2 1 6. I am happy to have sufficient sense of Fully agree Fully disagree freedom in what I do 7 6 5 4 3 2 1 7. I am happy with the possibilities of Fully agree Fully disagree communicating with others 7 6 5 4 3 2 1 Fully agree Fully disagree 8. Variety of work is wide enough 7 6 5 4 3 2 1 Fully agree Fully disagree 9. I have a lot of freedom in what I do 7 6 5 4 3 2 1

146 Selected methods of studying a college’s image

Table 8 cont.

10. My work gives me room for independence in Fully agree Fully disagree thinking and acting 7 6 5 4 3 2 1 11. I have the possibility to carry out tasks from Fully agree Fully disagree the beginning to end 7 6 5 4 3 2 1 12. My work provides me with a full possibility Fully agree Fully disagree to bring the tasks that I start to the end 7 6 5 4 3 2 1 13. I am happy with the salary that Fully agree Fully disagree I receive for my work 7 6 5 4 3 2 1 Fully agree Fully disagree 14. I have a sense of security at work 7 6 5 4 3 2 1

Source: (Sagan 2019)

Evaluation surveys. Typical methods of improving competences used in higher education include audits and evaluation surveys carried out after the end of a series of lectures, seminars or trainings. They contain sets of questions related to students’ satisfaction with the quality of classes conducted by a par- ticular lecturer. Thanks to the cyclical repetition of questions, research makes it possible to observe any changes in the work of teachers. Sample questions are presented in Table 9.

Table 9 A fragment of a sample evaluation survey examining classes as assessed by students

Fully Rather Difficult Rather Fully Questions disagree disagree to say agree agree 1 2 3 4 5 The teacher has clearly specified the expected criteria for passing the subject Classes begin with defin- ing a clear goal (what will be discussed, what is the intended effect) The schedule is consistent with the syllabus The content is presented in an interesting way

147 Katarzyna Kowalska-Jarnot

Table 9 cont.

Fully Rather Difficult Rather Fully Questions disagree disagree to say agree agree 1 2 3 4 5 Classes are held on time (start and end) The form of classes is attractive (the teacher uses case studies, references to history, current events, relevant examples, discussions) The teacher allows questions during lectures and provides answers to them Classes are conducted in an orderly manner – its individual parts are related and purposeful The teacher presents the content in a communicative way (clear, accessible)

Source: author’s own data

7. Examination of the effectiveness of a college’s PR activities

Public relations involves “building good relationships with various groups that may have an impact on the functioning of the organization by obtaining favour- able publications and other information in the media, creating a positive company image and responding in an appropriate way to unfavourable information and events related to the organization”. Thus, it is about earning social acceptance for the college through conscious and long-term building of a positive image. The most important aspects of college PR activities include: – media relations, – relations with employees (internal PR), – e-PR (image-building activities on the web), – visual identity. Research on the effectiveness of PR activities partly overlaps with the meth- ods listed above (qualitative research, college recognition research, employee

148 Selected methods of studying a college’s image

satisfaction survey). However, it is worth supplementing them with research methods typical to this field. They include: – website audit, – audit of the college’s image in social media, – qualitative analysis of content in the Internet (publications, mentions, opin- ions about the college, range and interaction), – audit of image materials, – examination of the college logo’s recognition, – other. In the Internet age, it has become necessary to take into account a posi- tive online college image. The Internet is a place of unrestricted exchange of opinions, offering the possibility of commenting and rating, while discussion forums, opinion-forming portals, social media and many other online channels have become a place where people share everything, including their experiences with brands. This means that image crises can spread much faster and be more difficult to control.

8. Summary

Building a positive image and reputation is an important strategy for the permanent university distinction in the education market in Poland. Although difficult to manage and measure, they are an important opportunity togain a competitive advantage because they are unique, difficult to imitate and increase the university’s chances of attracting more students. Continuous and systematic monitoring of image activities is important for several reasons. First of all, without ongoing monitoring, the probability of achieving the goals is reduced. Secondly, a college operating on such a demanding market must handle rapidly occurring changes and monitor their impact on the implementation of tasks. It may happen that between the formulation of an objective and its achievement appear circum- stances which interfere with the implementation process or even undermine its legitimacy. Thirdly, the control allows you to optimize costs and eliminate unnec- essary expenses. Finally, it allows one to detect and eliminate errors that could accumulate resulting in serious problems. Certainly, the most important reason is the very nature of the image itself. It is a changeable, vulnerable to influence phenomenon, and without ongoing observation it quickly crosses the boundaries of the assumed framework. The process of image (reputation) control is based on obtaining information from key stakeholders. For this purpose, a wide range of tools is used, of which several are described above.

149 Katarzyna Kowalska-Jarnot

References [1] Altkorn, J. (2004) Wizerunek firmy, Dąbrowa Górnicza: Wyższa Szkoła Biznesu. [2] Black, S. (2005) Public relations, Warszawa: Oficyna Ekonomiczna, Grupa Wolters Kluwer. [3] Budzyński, W. (2002) Zarządzanie wizerunkiem firmy, Warszawa: SGH. [4] Dziadkowiec, J. (2006) Wybrane metody badania jakości usług, Zeszyty Na- ukowe, nr 717, Kraków: Akademia Ekonomiczna, pp. 23–35. [5] Dowling, G.R. (2001), Creating corporate reputation, Oxford: Oxford Uni- versity Press. [6] Finn, A. (2001) ‘Mystery Shopper Benchmarking of Durable-goods Chains and Stores’, International Journal of Service Research, No. 3. [7] Helgesen, O. and Nesset, E. (2007), ‘Images, Satisfaction and Antecedents: Drivers of Student Loyalty? A Case Study of Norwegian University College’, Corporate Reputation Review, vol. 10, No. 1, pp. 38–59. [8] Jedynak, P. (2010) ‘The customer satisfaction survey process as the source of the organization’s knowledge’, in Jedynak, P. (ed.) Wiedza współczesnych organizacji. Wybrane problemy zarządzania, Kraków: Wydawnictwo UJ. [9] Kotler, P. (2005) Marketing, 14th edition, Poznań: Dom Wydawniczy Rebis. [10] Kulig, A. (2006) ‘Badania i analizy marketingowe w szkole wyższej’, in Nowa- czyk, G., Lisiecki, P. (eds) Marketingowe zarządzanie szkołą wyższą, Poznań: Wydawnictwo Wyższej Szkoły Bankowej. [11] Mazurek-Łopacińska, K. (2008) Badania marketingowe: metody, nowe tech- nologie, obszary aplikacji, Warszawa: Polskie Wydawnictwo Ekonomiczne. [12] Nierzwicki, W. and Rudzik, A. (2003), Pomiar jakości usług edukacyjnych metodą Servqual. Edukacja Ustawiczna Dorosłych, Nr 2, pp. 72–77. [13] Sagan, A. (2019) Skale jako podstawowy instrument pomiaru w badaniach satysfakcji i lojalności, [Online], Available: http://media.statsoft.nazwa.pl/_ old_dnn/downloads/skale.pdf [14 Nov. 2019]. [15] Servqual Method, [Online], Available: https://www.biostat.com.pl/metoda- servqual.php [15 Nov. 2019] [16] Targaszewska, M. (2013), Metody pomiaru jakości kształcenia na uczelniach wyższych, Zeszyty Naukowe UEK, Nr 923, pp. 59–71. [17] Wójcik, K. (2013), Public relations, Warszawa: Wolters Kluwer. [18] Zarębska, A. (2006), ‘Reputacja firmy – efekt zarządzania tożsamością orga- nizacyjną’. Przegląd Organizacji, nr 4, pp. 19–22. Managerial Economics 2019, vol. 20, No. 2, pp. 151–180 https://doi.org/10.7494/manage.2019.20.2.151

Jonas Rende*

Pairs trading with the persistence-based decomposition model

1. Introduction

It is challenging to accurately model the relationship between a set of se- curities, yet it is the nucleus of every trading strategy involving more than one security. Prominent examples are trading strategies based on statistical arbitrage. An extensively studied form of statistical arbitrage is pairs trading, which has been introduced to the scientific community by the seminal paper of Gatev et al. (2006). The underlying idea is to identify two stocks with a historically strong co- movement in prices. Next, the price difference between the two stocks, referred to as the spread, is tracked. If the spread diverges, i.e., crosses a pre-defined threshold level, the winner is sold short and the loser is bought. If the historical relationship holds, convergence occurs and results in a profit. The key require- ment for the strategy to be successful is the presence of exploitable mean-reverting patterns in the spread. A variety of methods have been proposed to model the potential mean-reversion embedded in the spread, ranging from model-free dis- tance methods, to cointegration techniques to complex time series approaches1. A sample of important contributions are Vidyamurthy (2004), Do and Faff (2012), Caldeira and Moura (2013), Rad et al. (2016), Liu et al. (2017) and Clegg and Krauss (2018). Krauss (2017) provides an overview about the pairs trading streams

* University of Erlangen–Nürnberg, Department of Statistics and Econometrics, Lange Gasse 20, 90403 Nürnberg, Germany, e-mail: [email protected]. 1 The distance-approach utilizes simple distance metrics to assess the degree of co-movement of stock pairs. The cointegration approach assumes that all shocks to the spread are solely transient and model the spread as a pure mean-reverting process (Engle and Granger 1987). Most of the studies applying the time series approach uses stochastic differential equations underlying different mean-reverting processes to model the spread.

151 Jonas Rende

in the literature – mostly utilizing daily data. In the last years, intraday statistical arbitrage attracts scientific interest on an increasing level. Next, we briefly review selected high-frequency pairs trading applications. Using U.K. FTSE 100 data for 2007, sampled on 60 minutes, Bowen et al. (2010) report significant excess returns of up to 19.80 percent per annum applying a distance-based pairs selection rule. The authors point out that returns are very sensitive to transaction costs and a waiting rule of one period to account for bid- ask spread. Miao (2014) uses a distance metric and cointegration tests to detect pairs suitable for a pairs trading application on 5 minute bins for 177 selected U.S. stocks between 2012 and 2013. The author finds annual returns of up to 56.85 percent. Focusing on 5 minute bins for U.S. oil stocks (2008, 2013–2015), Liu et al. (2017) achieve annual returns of 187.80 percent applying stochastic differential equations to model the spread’s mean-reversion. Recently, Stübinger and Endres (2018) apply a mean- reverting jump diffusion model to S&P oil stocks using minute-by-minute data from 1998 to 2015, reporting annual returns of 60.61 percent. In addition, the authors find supporting evidence for the exis- tence of jumps in the high-frequency data. Using the same data set as Stübinger and Endres (2018), but without restricting themselves to oil companies, Stübin- ger and Bredthauer (2017) achieve returns of up to 50.50 percent per year with simple distance metrics. Mikkelsen and Kjærland (2018) apply the distance and cointegration approach to 100 stocks of the commodity dominated Oslo Stock Exhange from January 2012 to March 2016. For the cointegration approach the authors report annualized returns of 25.1 percent, wheareas for the distance ap- proach the best specification yields annualized excess returns of 17.8 percent. Other noticeable studies are Dunis et al. (2010), Kim (2011), Kishore (2012), Bowen and Hutchinson (2016) and Endres and Stübinger (2017). Recently, Rende et al. (2019) propose the persistence-based decomposition model (PBD), as one that adapts well to noisy high-frequency data. The PBD model decomposes the spread of two time series into three components, captur- ing different levels of shock persistence, namely infinite, finite and no persistence. Thereby, the infinite persistence component is a random walk, the finite persis- tence component is modeled as a stationary AR(1) process and the no persistence component is pure noise. To evaluate the model in terms of goodness of fit and predictive power, Rende et al. (2019) apply the model to S&P 500 minute-by- minute data, starting from January 1998 and ending in November 2016. Compared to suitable benchmark models the PBD model provides the best fit for the spread between two stocks in 35.38 percent according to the BIC (Schwarz 1978). The author find that the model is superior in light of directional accuracy and RMSE compared to the partial cointegration (PCI) model (Clegg and Krauss 2018), an AR(1) model and a naive approach. The predictive edge of the PBD model is

152 Pairs trading with the persistence-based decomposition model

based on the model’s ability to exploit mean-reverting patterns in the presence of noise – the same mechanism underlying a successful pairs trading strategy. Thus, from a trading perspective the finite persistent stationaryAR (1) component reflects potentially tradable mean-reversion, whereas the non-persistent component cap- tures non-tradable mean- reversion. With the exception of Bowen et al. (2010), Bowen and Hutchinson (2016) and Liu et al. (2017) the baseline in the current high-frequency pairs trading literature is to execute trades without an execution gap. Leaving out an execution gap leads to delay-zero alpha (Kakushadze 2016) returns which cannot be realized in practice: As high-frequency data are subject to microstructure noise. Aït-Sahalia and Yu (2009), backtesting without waiting rules is exposed to trading the bid-ask bounce. Second, due to signal processing it is not possible to generate the signal and execute the order simultaneously. In total, we make three contributions to the current high-frequency pairs trading literature: First, we put the PBD model to test in a large scale pairs trad- ing application on the same data set as in Rende et al. (2019), using a rigorous backtesting framework. As benchmarks, we choose the PCI model and a buy-and- ‑hold strategy. Second, to gain a better understanding of the model mechanics, we analyze the performance over time, the exposure to common risk factors and compare parameter and industry profiles for all pairs and the most profitable pairs. Third, we quantify the impact of execution limitations on high-frequency pairs trading returns by relaxing the limitations of our backtesting framework. Our findings are as follows: First, backtesting the PBD model on S&P 500 minute-by-minute data yields a statistically significant and economically meaning- ful average annual return after costs of 9.16 percent compared to -14.41 percent for the PCI model and 4.38 percent for a buy-and-hold strategy. Thereby, the backtesting engine relies on the following trading costs and execution limitations: (i) We assume trading costs of 0.20 percent per full turn per pair (Avellaneda and Lee 2010). (ii) We implement an execution gap of one period (entry and exit) to account for bid-ask spread (Gatev et al. 2006). (iii) An annual short-selling fee of 1 percent p.a. is charged (Do and Faff 2012). (iv) If for one of the stocks forming a pair, the volume at the time of execution is zero (open and closing trigger), we delay the execution of the order to the next period where volume is larger than zero for both stocks. The PBD model is superior in terms of distributional charac- teristics, risk measures as well as risk and return measures. Second, an evaluation of the strategy’s development over time shows that it performs exceptionally well in bear markets, such as the financial crisis, due to the good performance of the short-leg. To the best of our knowledge, this manuscript is the first pairs trading study documenting the latter fact for the U.S. market on minute-by-minute data. The strategy is mostly robust to common sources of systematic risk. For the most profitable pairs the share of industries typically associated with a similar business

153 Jonas Rende

model is high. Thus, industry affiliation is an important return driver. Furthermore, the most profitable pairs exhibit strong pronounced mean-reverting patterns and simultaneously less pronounced permanent effects. Third, relaxing execution limitations drastically increases returns and affects all risk and returns measures in a favorable way. If order execution is not restricted, we find annual returns after costs of 138.6 percent for the PBD model, indicating that delay-zero alpha (high-frequency) returns are upward biased with respect to practical feasibility. The paper is organized as follows. In section 2, we describe data and software. In section 3, we outline the PCI and the PBD model. In section 4, we provide an in-depth explanation of our backtesting procedure. In section 5, backtesting results and the impact of execution limitations on returns are presented. Finally, section 6 provides some concluding remarks.

2. Data and software

2.1. Data and pre-pocessing Following Krauss et al. (2017), backtesting is performed on S&P 500 index constituents. Due to its market efficiency, high liquidity2 and intensive academic, analyst and investor coverage the S&P 500 serves as the litmus test for every trading strategy. In order to eliminate survivor bias we proceed in line with Krauss et al. (2017) and download month end constituents lists for the S&P 500 from Thomson Reuters Datastream, ranging from January 1998 until November 2016. First, the monthly constituents list are transferred into a matrix, covering in a binary fashion if a stock is an index member or not. Second, for every stock having ever been a S&P 500 constituent, minute-by-minute close price informa- tion as well as trading volume data are downloaded from QuantQuote starting in 1998 and ending in November 2016 (QuantQuote 2016). Prices are adjusted for splits, dividends and special corporate events, e.g., symbol changes, mergers and acquisitions (QuantQuote 2012). To deal with missing prices, we forward fill close prices between the first and the last available price for every stock. Missing volume information is filled with zeros. In accordance with U.S. trading hours, we solely consider price and volume information between 9.30 AM and 16.00 PM. In addition, the industry classification according to the Global Industry Classification Standard (GICS) is downloaded for every constituent during the considered time frame. Every stock is matched with its industry.

2 The S&P 500 covers around 80 percent of the available U.S. market capitalization (S&P 500 Dow Jones Indices, 2015)

154 Pairs trading with the persistence-based decomposition model

2.2. Software Our high-frequency (HF) trading backtester is written in R (R Development Core Team 2018). Risk and return metrics, such as the Sharpe ratio (Sharpe 1994), are calculated using the implementations in the R package PerformanceAnalytics. Those metrics need proper time series objects as provided by the package xts. To store and access the data in HDF5 containers and access them we use the package rhdf5 (The HDF Group 2010). Data manipulation such as slicing and merging has been done in the data.table framework. We use R’s base implementation parallel for parallel processing. A full overview of the R packages, their authors and the purpose is provided in Table 1.

Table 1 R packages involved in the backtester

Name Author Purpose xts Ryan and Ulrich (2018) Converting time series objects Performance Peterson and Carl (2018) Risk and return metrics Analytics Reading and writing HDF5 rhdf5 Fischer and Pau (2017) container partialCI Clegg et al. (2018) Fitting a PCI model parallel R Development Core Team (2018) Parallel processing data.table Dowle and Srinivasan (2018) Data handing FKF Luethi et al. (2018) Fast Kalman filter estimation Runtime calculation and bench- tictoc Izrailev (2014) marking Pre-compile functions compiler R Development Core Team (2018) to improve run time

3. The models

Next, we will briefly describe the models our pairs trading strategies build up on: First, our baseline is the partial cointegration model of Clegg and Krauss (2018). Second, the persistence-based decomposition model contributed by Rende et al. (2019).

155 Jonas Rende

3.1. The partial cointegration model Compared to classic cointegration (Engle and Granger 1987) partial cointegra- tion accounts for transient and permanent effects. Especially in financial applica- tions, the assumption that the spread between two securities is solely driven by short-term deviations from a long-run equilibrium is too restrictive. To overcome this limitation, the model incorporates idiosyncratic shocks to the spread time series. Examples for such shocks can be a successful management, technological advantages as well as analyst upgrades (Clegg and Krauss 2018). In the partial cointegration framework two securities are connected by a partially autoregressive (PAR – see Poterba and Summers (1988) and Clegg (2015)) process consisting of the sum of a random walk and a stationary AR(1) process, i.e., the absolute value of the AR(1)-coefficientρ P CI has to be smaller than one. The model equations are given as,

PCI PCI YXt =+b ttZ ZMPCI =+PCI RPCI t t t (1) PCIPCI PCI PCI MMt =+ret-1,Mt PCI PCI PCI Rt =+Rt-1 eRt,

PCI PCI The error terms of the two components Rt and Mt are assumed to be mutually independent normally distributed zero-mean white-noise processes. Model parameters are the cointegrating coefficientβ PCI, the variances of the error PCI terms of the transient component (σM ) as well as the random walk component PCI PCI (σR ) and ρ . From a trader’s perspective, the mean-reversion, captured by the M component, reflects potential trading opportunities and is therefore of special interest. Due to orthogonality, the model allows to easily calculate the proportion 2 of variance attributable to mean-reversion (RM), allowing to assess the degree of mean-reversion embedded in a time series. As neither the spread nor the distinct components are observable, estimation is solely possible in state space. The authors derive the state space representation and show that Maximum likelihood estimates of the associated Kalman filter are consistent and that the model is identified. As an initial showcase, the PCI model is applied to S&P 500 daily data starting from 1990 and ending in 2015. The aim of the authors is to construct a relative- value arbitrage strategy relying on the mean-reverting component of the price spread. The distance approach, as proposed by Gatev et al. (2006), and the coin- tegration approach (Caldeira and Moura 2013) serve as benchmark methods. With monthly returns after costs of 1 percent the PCI model outperforms the benchmark returns of the distance approach (0.1 percent) and cointegration (0.1 percent and 0.15 percent depending on the specification) by far. Therefore, the PCI model serves as the baseline for our high-frequency pairs trading application.

156 Pairs trading with the persistence-based decomposition model

3.2. The persistence-based decomposition model

As pointed out by Rende et al. (2019), the relationship between securities, paradigmatically represented by the spread, might be subject to shocks exhibiting different levels of persistence, namely infinite-, finite-, and non-persistent shocks. Examples for finite shocks can be uninformed trading as well as the short-term price impact of trades, while a prominent examples for shocks with no persis- tence is microstructure noise due to micro-level market frictions (Rende et al. 2019). Within the PBD time series model the three orthogonal components are modeled by a random walk (infinite persistence), a stationary AR(1) process with coefficientρ PBD (finite persistence) and pure noise (no persistence). The structural model equations are as follows:

PBD PBD YXt =+b ttZ PBD PBD PBD PBD ZMt =+t RWt + t PBDPBD PBD PBD MMt =+ret-1,M,t (2) PBD PBD PBD RRt =+t-1 eRt, PBD PBD Wt = eWt, The full PBD model consists of five parameters. The spread coefficient βPBD, PBD the variance of the stationary AR(1) process (σM ), the variance of the random PCD PBD PBD walk (σR ), the variance of the pure noise (σW ) and ρ . The authors define multiple components to quantify a spread’s mean-reversion, each attributable to 2 a different source: First, RM, representing the proportion of variance attributable 2 to finite-persistent mean-reversion. Second, RW the mean-version attributable to 2 non-persistent mean-reversion. Third, RMW the total mean-reversion which is the 2 2 sum of RW and RM. Given that we cannot observe the components, the model has to be transferred into state space. Rende et al. (2019) derive the likelihood of the associated Kalman filter and elaborate on statistical properties such as identifica- tion and consistency of the (quasi) Maximum likelihood estimators. To gain a better understanding of the PBD model, Rende et al. (2019) inves- tigate if, compared to suitable alternatives, the PBD model fits well to minute- by-minute price spreads for S&P 500 constituents from 1998 to 2016. According to the Bayesian information criterion (Schwarz 1978) and a likelihood ratio test routine for the PBD model in 35.38 percent of the 2 333 202 analyzed pairs the PBD model helps to accurately model the spread. In addition, the out-of-sample prediction power is investigated. In terms of directional accuracy and root-mean- square error the PBD model is superior compared to the PCI model, a stationary AR(1) model and a naive approach. The latter findings are stable over time.

157 Jonas Rende

Based on the findings of Rende et al. (2019), we hypothesize that in a high- ‑frequency pairs trading application the PBD model adapts better to the noisy data than the PCI model. To evaluate our hypothesis, we develop a relative-value arbitrage strategy in the light of Clegg and Krauss (2018) building on the M com- ponent of the PBD model extracted from the spread.

4. The backtester

4.1. Building blocks, computational challenges, and execution limitations We implement a backtester to compare the trading results of PCI and PBD. First, we provide an overview about the different building blocks, com- putational challenges, and execution limitations. Then, we provide a detailed description of the different building blocks. Table 2 provides a summary of the main parameters and selection criteria.

Table 2 Summary of the restricted backtesting framework

PCI PBD Number of non-overlapping study periods 949 949 Formation period, in days 5 5 Trading period, in days 5 5 Eligibility criterion 1 Same industry Eligibility criterion 2 Top 5 correlated stocks Eligibility criterion 3 LRS in bottom 5% Eligibility criterion 4 0.5 < ρPCI < 1 0.5 < ρPBD < 1 2 2 Eligibility criterion 5 R MR > 0.5 R M > 0.5 Portfolio size, in pairs 10 Portfolio selection In-sample Sharpe ratio

Opening signal τo 2.0

Closing signal τc -2.0 Stop loss, in percent 10 Transaction costs, in percent 0.05 Execution gap, in periods (minutes) 1

Y X Volume constraint (open) (νo > 0 & νo > 0) Y X Volume constraint (close) (νc > 0 & νc > 0)

158 Pairs trading with the persistence-based decomposition model

Study periods: We split our data set into 949 non-overlapping study periods (Papadakis and Wysocki 2007). Each study period consists of a five days formation period (FP) and a five day trading period (TP)3. Thereby, five days correspond to one trading week. In the formation period the PCI and PBD models are calibrated and the most suitable according to a variety of eligibility criteria (see Table 2 eli- gibility criteria 1 to 5) are traded in the subsequent trading period. Return computation: We calculate our returns in line with Gatev et al. (2006). For all pairs, we scale the sum of the payoffs at the end of a trading day with the total amount of the invested capital at the end of the previous trading day. This approach results in a daily return time series. Regarding the invested capital we solely consider committed capital, i.e., 1 U.S. dollar (USD) is spend on every pair at the beginning of the trading period if a position is opened or not. Consequently, we calculate the return on committed capital. Execution limitations and transactions costs: Our rigorous backtesting framework relies on the following execution limitations and transactions costs: 1. Execution gap: In line with Gatev et al. (2006), we account for bid-ask spread and time for signal processing by shifting an incoming trading signal (entry and exit) at the end of minute t to the end of minute t + 1. 2. Volume constraint for opening / closing a position: We only open or close a position if at time t + 1 there is positive trading volume for both stocks Y X Y X Y and X, i.e., (νo > 0 & νo > 0) and (νc > 0 & νc > 0), respectively. Thereby, i i νo (νc) denotes the volume of stock i at opening (o) (closing (c)), where i = {Y, X}. If the volume conditions are violated we postpone the execution of the order to the next minute where the condition is fulfilled. If during the trading period of interest the condition is never met, the order is canceled. 3. Transaction costs: Following Avellaneda and Lee (2010) we assume transaction costs of 0.05 percent (5 basis points) per half turn per stock, i.e., 0.20 percent per full turn per pair. 4. Short-selling fees: The short leg of a pair is charged an additional fee for the days ∆d.o a pair is open. We follow Do and Faff (2012) and assume a conserva- tive fee of 1 percent per annum payable for the open days ∆d.o scaled by 360: ∆do.. 00. 1⋅ . If 0 < ∆d.o ≤ 1 we charge one day, i.e, we charge a minimum 360 1 fee of 00. 1⋅ . 360 Computational challenges: In our large scale HF application, the size of 65 GB for the raw data imposes a real challenge with respect to RAM handling in the

3 In a trading context, the term day refers to trading days.

159 Jonas Rende

context of parallel processing. To deal with RAM constraints, we cut the data into nineteen yearly time-slices, each consisting of approximately 50 batches, where each batch corresponds to a study period, and store them in a HDF5 container. For further analysis, those batches are distributed among five machines consisting of 80 cores (2.4 GHZ to 3.6 GHZ) as well as 206 GB RAM. The jobs are processed in parallel. In this setting, it takes 14 days to fit one specification of the PBD model and 8 days to fit one specification of the PCI model to all 2333202 pairs. Trading takes an additional two days for each model and each specification.

4.2. Formation Available stocks and pairs: Stocks are treated as available if (i) the stocks are a S&P 500 constituent at the last day of the formation period and (ii) no price information are missing within the formation period. A pair is available if the two stocks forming that pair are individually available. Overnight jumps: Abnormal changes occurring overnight are a well-known phenomenon in high-frequency stock prices (Stübinger and Endres 2018). We replace every price time series by a cumulated return time series, where overnight changes are set to zero (Rende et al. 2019). This approach guarantees that our maximum likelihood estimators are not harmed by overnight jumps (Liu et al. 2017). Consequently, the fitting procedures are carried out on the adjusted time series. Eligible pairs: Within the 949 study periods we fit a PCI and a PBD model to 2 333 202 pairs fulfilling the eligbility criteria 1 and 2 (Tab. 2), i.e., 2458.59 pairs on average per formation period. Those available pairs are divided into eligible and non-eligible pairs. A pair is considered as eligible if the subsequent five eligibility criteria hold simultaneously – see Table 2. First, the stocks have to be in the same industry according to the GICS industry classification. The latter is a common requirement in pairs trading applications (Gatev et al. 2006). Second, a higher correlation between stocks is associated with a higher chance of co- movement (Gatev et al. 2006; Chen et al. 2017). Therefore, we only consider for stock i the five industry partner stocks exhibiting the highest correlation (Rende et al. 2019). In addition, limiting the number of available pairs renders our high- frequency pairs trading study computationally feasible. Third, both models rely on maximum likelihood estimation, i.e., it is possible to calculate a likelihood ratio score (LRS) between reasonable nested alternatives and the full PBD or full PCI model, respectively. Turning to the PCI model a reasonable alternative is the PCI PCI pure random walk model (σM = ρ = 0), while for the PBD model the random PBD PBD walk with noise model (σM = ρ = 0) is a suitable alternative. Because lower LRS are associated with a higher chance of following the full model rather than

160 Pairs trading with the persistence-based decomposition model

the alternative we only consider pairs with a LRS located in the five percent quan- tile. Fourth, we demand 0.5 < ρm < 1, where m ∈ {PCI, PBD}. The rationale is that a larger ρm is associated with a higher half-life or mean-reversion and, thus higher trade duration potentially reducing bid-ask bounce effects. Fifth, potentially exploitable mean-reversion – as represented by the variance attributable to the stationary AR(1) process mean-reversion – should be pronounced in the spread 2 2 time series. Specifically, we aim for RM > 0.5 for the PBD model and RMR > 0.5 for the PCI model. The eligibility criteria three to five are inspired by Clegg and Krauss (2018). Top pairs: We let all eligible pairs participate in an in-sample trading and calculate the respective Sharpe ratio – see Bertram (2010), Dunis et al. (2010), Caldeira and Moura (2013), Clegg and Krauss (2018). Next, for each stock i its eligible partner stocks are ranked in descending order according to their in-sample Sharpe ratio. We refer to stock its partner with the highest in-sample Sharpe ratio as its trading partner (Clegg and Krauss 2018). We extract a list solely containing stocks and their trading partners. A portfolio consisting of the top 10 pairs (Miao 2014, Stübinger et al. 2018) in the sense of the largest in-sample Sharpe ratio is then traded in the the following trading period.

4.3. Trading As a relative-value arbitrage strategy relies on mean-reverting patterns, we monitor a transformation of both models M component (see – equation (3)) dur- ing the trading period to detect trading signals. We do not monitor the spread itself as typical in the distance approach (Gatev et al. 2006) or the cointegration approach (Caldeira and Moura 2013) because the spread is non-stationary by construction in our case. To be specific, we calculate the standard score for both models for every minute t in the five days trading period given as

mT, P mT, P Mt St = mF, P (3) s M where m = {PCI, PBD}. It is common to assume that the estimated in-sample spread relationship between two securities carry over to the out-of-sample trad- ing period. Hence, the parameters needed to calculate the standard scores in the trading period are estimated in the formation period using the model equations (1) and (2), respectively. Loosely speaking, a trading signal is a sufficiently large deviation of the M component from its mean value. In mathematical terms, sufficiently large

161 Jonas Rende

m,TP means |S | > τo , i.e., the standard score crosses the opening threshold τo. Fol- lowing Gatev et al. (2006), Huck (2015), and Rad et al. (2016) we opt for τo = 2. A position is opened by going long the undervalued security and shorting the overvalued stock. If |τo| is crossed while a position is already open we do not take an action. An investment of 1 USD in the long position is offset by selling short the price ratio of the two securities forming the pair weighted with the in- sample spread coefficient βm. In general, the latter will differ from 1 USD. The portfolio value might change minute-by-minute depending on whether we open a position as reaction to a trading signal. When a trade is entered the maximum amount of capital is invested. The lower bound of the investment is a stop-loss rule of 10 percent, i.e., if the portfolio value drops below 90 percent of its ini- tial value the pair is excluded from any further trading activities (Nath 2003, Caldeira and Moura 2013). An upper bound is not implemented. If no position is open the portfolio is held in cash, i.e., we assume that there are no interest m,TP earnings. Open positions are closed if |S | < τc, where τc denotes the closing threshold. We choose symmetric thresholds and set τc = −2 (Bertram 2010). In addition, a closing action is forced if a stock is not listed anymore or if the end of trading period is reached and a position is still open. The last available price is then used to calculate profits. The PBD model, as well as the PCI model, can notch up losses from converging trades, due to the behavior of the random walk component. By contrast, the distance and the cointegration approach always earn profits from converging trades.

5. Results

In this section, we evaluate the performance of the PBD, the PCI model and the general market (MKT) from a financial perspective. The latter isa simple S&P 500 buy-and-hold strategy. We proceed as follows: First, we analyze the per- formance in the light of our backtesting framework outlined in section 4. Second, we relaxing the execution limitations step-by-step to gain a better understanding of how the restrictions affect the trading results.

5.1. Backtesting results

5.1.1. Risk and return characteristics First, we pin down the performance of the different approaches in terms of a variety of daily and annualized risk as well as return measures. Table 3 sum- marizes the results.

162 Pairs trading with the persistence-based decomposition model

Panel A – daily return metrics: With a value of 0.04 percent per day, the PBD model yields the highest mean return after transaction costs compared to PCI (-0.06 percent) and the general market (0.02 percent). For the PBD and the PCI model the return values are highly significant. The large number of trades explains the fairly low standard deviations of 0.0076 for the PBD model, 0.0103 for the PCI model and 0.0124 for MKT, respectively. Moreover, the PBD model exhibits the highest hit rate (Share > 0) with a value of 54.56 percent compared to 47.28 percent for the PCI model and 53.01 percent for MKT. While the PCI approach and the MKT are skewed to the left, the daily return time series of the PBD model is skewed to the right. The latter indicates that the right tail is more pronounce which is beneficial from an investors perspective (Cont 2001). Kurtosis values of 10.46 for the PBD model, 7.54 for the PCI model and 7.75 for the MKT are associated with a notable risk in the tails of the distribution. This leptokurtic behavior is well in line with the stylized facts of financial time series (Cont 2001).

Table 3 Daily and annualized risk-return metrics – restricted backtesting. Panel A depicts daily return characteristics, panel B depicts risk and panel C annualized risk-return metrics for the the persistence-based decomposition model (PBD), the partial cointegration model (PCI) and the general market (MKT)

PBD PCI MKT A Mean return 0.0004 -0.0006 0.0002 Standard Error (Newey-West) 0.0001 0.0002 0.0002 t-Statistic 3.0886 -2.9522 1.5479 Minimum -0.0583 -0.0881 -0.0903 25% Quantile -0.0031 -0.0051 -0.0055 Median 0.0001 -0.0003 0.0005 75% Quantile 0.0035 0.0043 0.0061 Maximum 0.0859 0.0794 0.1158 Share > 0 0.5456 0.4728 0.5301 Standard dev. 0.0076 0.0103 0.0124 Skewness 0.6063 -0.3100 -0.0272 Kurtosis 10.4654 7.5358 7.7488

163 Jonas Rende

Table 3 cont.

PBD PCI MKT B Hist. VaR 1% -0.0313 -0.0447 -0.0515 Hist. CVaR 1% -0.0282 -0.0432 -0.0483 Hist. VaR 5% -0.0091 -0.0169 -0.0184 Hist. CVaR 5% -0.0162 -0.0260 -0.0293 Maximum drawdown 0.4637 0.9813 0.5678 C Return p.a. 0.0916 -0.1441 0.0438 Standard dev. p.a. 0.1200 0.1639 0.1974 Downside dev p.a. 0.0776 0.1229 0.1396 Sharpe ratio p.a. 0.7630 -0.8792 0.2217 Sortino ratio p.a. 1.1793 -1.1721 0.3135

Panel B – daily risk metrics: In terms of all risk metrics the PBD model is favorable to the benchmark approaches. The values for the 1 and 5 percent value at risk (VaR) and the corresponding conditional versions (CVaR) indicate that the PBD approach exhibits a lower tail risk. An an example compare the 1 percent VaR of -0.0313 for the PBD model to -0.0447 for the PCI model and -0.0515 for the MKT. In addition, the exposure to drawdown risk, as measured by the maximum drawdown is found to be lower for the PBD model. Panel C – annualized risk metrics: With an average annualized return of 9.16 percent after transaction costs the PBD model outperforms the general market (4.38 percent) and the PCI model (-14.41 percent) by far. The higher annualized returns are achieved with a significantly smaller standard deviation which translates to the highest annualized Sharpe ratio of 0.7630 for the PBD model. For the general market the Sharpe ratio is 0.2217 and the PCI model ex- hibits a negative Sharpe ratio of -0.8792. If we solely focus on downside risk as reflected by the downside deviation and the Sortino ratio, the advantage of the PBD model compared to the PCI model and the MKT is even more pronounced. This is in line with the positive skewness and a higher rate as described in panel A.

5.1.2. Trading statistics Next, we compare the PBD and the PCI approach in light of trading statistics – see Table 4). The average number of pairs traded per 5-days trading period for the

164 Pairs trading with the persistence-based decomposition model

PBD model is 9.824, i.e, nearly all of the 10 pairs are traded. Turning to the PCI model, only about half of the 10 pairs are traded on average. A deep-dive analysis on the latter point yields that the method is not able to identify a sufficiently large number of eligible pairs. Panel B shows descriptive statistics of the number of eligible pairs identified across all 949 study periods. Note that due to eligibility criterion 3 (Tab. 2) the maximum number of eligible pairs is 5 percent of the pairs fulfilling criteria 1 and 2, i.e., on average 2458.59 ⋅ 0.05 ≈ 123 pairs. Out of the approximately 123 pairs solely 7.55 pairs are eligible on average for the PCI approach compared to 30.26 for the PBD approach. This low mean value combined with a median value of 5 for the PCI method explains the difference in the average number of pairs traded per 5-days period. Interestingly, the trading frequency, quantified by the average number of round-trip trades per pair, is more than twice as high for the PCI model (14.78) compared to the PBD approach (6.52). Consequently, the average time pairs are open per round-trip is significantly smaller for PCI. On aver- age pairs are open 0.25 days for the PCI approach, while this value is 0.63 for the PBD model. The share of trades where the volume constraints are violated is about 6 percentage points larger for the PBD model and average waiting time is longer (4.9 minutes versus 3.8 minutes). The latter findings provide a first indication that volume constraints might affect returns. With a value of 94.82 percent the share of converging trades is very high for the PCI model. For the PBD model the share of converging trades is about 8 percentage points lower.

Table 4 Trading statistics and eligible pairs – restricted backtesting. Panel A depicts summary statistics of the number, the duration of trades, the volume constraints and the distribution of converging and diverging trades, and panel B shows descriptive statistics for the number of eligible pairs across the 949 study periods for the the persistence-based decomposition model (PBD) and the partial cointegration model (PCI)

PBD PCI A Average number of pairs traded per 5-days period 9.8240 5.0474 Average number of round-trip trades per pair 6.5168 14.7762 Standard deviation of number of round-trip trades per pair 6.1182 10.6481 Average time pairs are open per round-trip, in days 0.6310 0.2477 Standard deviation of time pairs are open per round-trip, 1.0037 0.8648 in days Share of trades with zero volume 0.2204 0.1460

165 Jonas Rende

Table 4 cont.

PBD PCI Average time a trade is delayed due to zero volume, in 4.9233 3.8097 minutes Share of converged trades 0.8659 0.9482 Share of trades where closing is forced, stop-loss 0.0044 0.0022 Share of trades where closing is forced, end of data 0.1297 0.0496 B Minimum 0.00 0.00 25% Quantile 20.00 1.00 Median 30.00 5.00 Mean 30.26 7.55 75% Quantile 40.00 11.00 Maximum 87.00 56.00 Share study periods with zero eligible pairs [%] 2.00 15.00

The analysis of trading statistics at least partly explains why PCI returns are not robust: First, the low number of eligible pairs is associated with a lower chance of detecting profitable pairs. Second, the duration of round-trip trades for the PCI model might be not long enough for an exploitable relationship to establish. Third, the high number of converging trades along with the low aver- age return could be an indication that the random walk component develops in an unfavorable way harming profits.

5.1.3. Performance over time Pairs trading profits are known to decline over time – see Do and Faff (2010), Bowen and Hutchinson (2016) and Stübinger and Endres (2018). Figure 1 tracks the cumulative profit (committed capital) of a 1 USD investment in the PBD ap- proach (black), the PCI approach (blue) and the general market (red) from 1998 to 2016. With exception of the early years the PBD approach is superior to the performance of PCI and the general market. For the PBD approach we observe a M-pattern, with peak periods in bear market phases. The first peak period starts 2001 and ends 2003. This period is associated with decimalization, the emergence of the dot-com bubble, the September 11 attacks and the beginning of the Iraq war. The second peak corresponds with the financial crisis. The good performance

166 Pairs trading with the persistence-based decomposition model

of pairs trading strategies in market phases exhibiting bearish conditions is well- documented for low- and high-frequency data – see Do and Faff (2010), Bowen and Hutchinson (2016), Liu et al. (2017), Clegg and Krauss (2018) and Stübinger and Endres (2018). The reason is that, by construction, the strategy is liquidity providing (Do and Faff 2010). In the period ranging from 2001 to 2003 the PCI model also achieves positive profits but at a significantly lower level than the PBD model. In addition, the PCI model cannot capitalize on the financial crisis. The general market shows a weak positive trend. After 2010, profits for the PBD and the PCI model start to decline, while the general market shows a positive trend. As pointed out by Clegg and Krauss (2018), declining pairs trading profitability is most probably due to an increase in market efficiency due to the wide-spread adoption of quantitative strategies, such as pairs trading. At the end of period covered in our data, the PBD model yields a cumulative profit of 4.05 USD, whereas the PCI model yields a cumulative profit of -0.91 USD. For the general market this value is 0.45 USD. The evaluation of Figure 1 shows that the PBD model performs very well in bear markets. We would expect that in bear markets especially the short leg contributes to the profitability of the strategy. For lower frequencies, supporting evidence is provided by Bowen and Hutchinson (2016). By contrast, for high- frequency data only little is known about the contribution of the long and short leg at different market phases. At the time of writing, there is no high-frequency study providing empirical evidence for the U.S. market4. To evaluate the hypoth- esis, we separately plot the financial performance (Fig. 2) of the long and short leg (black) as well as of the long (red) and the short leg (blue). Before 2001, with the exception of the 2000s, both legs roughly provide the same contribu- tion to the total profit. Starting from the end of 2001 to the beginning of 2016, the short leg yields higher profits than the long leg. During this time span, the USD development of the short leg has two peaks, which coincide with the two peaks of the USD development of both legs together, i.e., the two prominent bear market periods. The latter supports the hypothesis that within bear market phases the short leg contributes more to the overall profit than the long leg for high-frequency data. Note that from 2010 to November 2016, short-selling prof- its decrease, while profits due to the long position show a positive trend. The underlying rationale is that after the financial crisis the U.S. economy started to recover and the general market increased (see Fig. 1).

4 Using the distance and cointegration approach, Mikkelsen and Kjærland (2018) find for 100 Oslo Stock Exchange constituents from January 2012 to March 2016 that on average the long position contributes more to the overall profit than the short leg.

167 Jonas Rende

K P P 0

ai rr US ai

0

1998−01−011999−01−012000−01−012001−01−012002−01−012003−01−012004−01−012005−01−012006−01−012007−01−012008−01−012009−01−012010−01−012011−01−012012−01−012013−01−012014−01−012015−01−012016−01−01

Figure 1. Financial performance over time. We plot the development of the financial performance (committed capital) over time for the PBD model, the PCI model and the general market

L L a r Sr 0

ai rr US ai

0

1998−01−011999−01−012000−01−012001−01−012002−01−012003−01−012004−01−012005−01−012006−01−012007−01−012008−01−012009−01−012010−01−012011−01−012012−01−012013−01−012014−01−012015−01−012016−01−01 Figure 2. Financial performance over time of the long leg and the short leg

5.1.4. Exposure to common risk factors To investigate exposure to systematic risk factors, we run three Fama-French type regressions: First, a classic Fama-French 3-factor model (FF3) (Fama and French 1996). The model quantifies if a strategy is subject to general market risk (Market), small minus big capitalization stocks (SMB), and high minus low book- to-market stocks (HML). Second, a variant of the Fama-French 3 Factor model

168 Pairs trading with the persistence-based decomposition model

extended by two additional factors, namely a short-term reversal factor (Reversal) and a momentum factor (Momentum) (Gatev et al. 2006). In line with Clegg and Krauss (2018) we refer to this model as FF3+2. Third, the Fama-French 5-factor model as outlined by Fama and French (2015) (FF5). It augments FF3 by a factor accounting for robust minus weak profitability (RMW) and a second factor captur- ing the difference in investment behavior between conservative and aggressive stocks (CMA)5. Table 5 summarizes the regressions results. A “5” in the label of a factor indicates that this factor belongs to FF5. Standard errors are reported below the corresponding coefficient in parentheses. From our Fama-French analy- sis we draw the following conclusions: First, for all variants, we find statistically and economically significant excess returns of approximately 0.04 percent per day after transaction costs. Second, the market factor is very close to zero and insignificant for all variants. The latter does not come as a surprise, as the PBD pairs trading strategies follow a long-short design. In addition, the SMB, CMA, and RMW factors are insignificant. Fourth, The HML factor and Reversal factor exhibit a positive loading and are both significant, indicating that the PBD results are partially driven by reversal patterns and by selecting pairs of stocks exhibiting a high book-to-market ratio. The highly significant alphas provide evidence that our PBD based pairs trading results are different from a simple reversal strategy. As expected, the loading of the momentum factor is negative. Fifth, with a value of 0.0010 the FF3+2 has the largest adjusted R2. The majority of the increase seems to be driven by the Reversal factor.

5.1.5. Return drivers: parameter and industry profiles In this section, we deep-dive into the average parameter and industry profile difference between all pairs, the 10 percent most profitable and the 1 percent most profitable pairs, to identify influential factors. Table 6 summarizes the results. The most profitable pairs tend to share the following characteristics. First, they exhibit an average level of σM of 0.0030, compared to 0.0009 for average pairs. By contrast, the average level of σR is at 0.0015 for average pairs, while it is at 0.0004 for the most profitable pairs. Hence, we can cautiously infer that profitable pairs show stronger mean-reversion that can potentially be exploited. Thereby, the ra- s tio M is almost constant at a value of four. On the other hand, we observe that s W s the ratio M is increasing with the profitability of the pairs. Second, in terms of s R PBD 2 PBD profitability β , RM and ρ are of minor importance. Third, the industry is an

5 We download all relevant data from the website of Kenneth R. French: http://mba.tuck.dartmouth. edu/pages/faculty/ken.french/data_library.html. We are thankful to Kenneth R. French for providing the data.

169 Jonas Rende

important driver of profitability. A large proportion of the 1 percent most profit- able pairs belongs to industries associated with a similar business model such as utilities (35.05 percent) or financials (14.43 percent). With the exception of the oil & gas industry (1.03 percent), these finding coincide with the idea that the probability for the existence of a relationship between companies is higher when they run similar business models (Gatev et al. 2006). Using the same data set, Rende et al. (2019) report similar findings evaluating the industry composition for those pairs where the PBD model yields the best fit.

Table 5 Systematic sources of risk – restricted backtesting. Shows the exposure to common risk factors for the PBD strategy. *** := p < 0.001, ** := p < 0.01, * := p < 0.05

FF3 FF3+2 FF5 0.0004*** 0.0004*** 0.0004*** Intercept (0.0001) (0.0001) (0.0001) 0.0000 0.0000 0.0000 Market (0.0001) (0.0001) (0.0001) 0.0001 0.0001 – SMB (0.0002) (0.0002) – 0.0004* 0.0005** – HML (0.0002) (0.0002) – – –0.0001 – Momentum – (0.0001) – – 0.0003* – Reversal – (0.0001) – – – 0.0000 SMB5 – – (0.0002) – – 0.0002 HML5 – – (0.0002) – – –0.0002 RMW5 – – (0.0003) – – 0.0004 CMA5 – – (0.0003) R2 0.0010 0.0020 0.0018

170 Pairs trading with the persistence-based decomposition model

Table 5 cont.

Adjusted R2 0.0004 0.0010 0.0007 Observations 4745.00 4745.00 4745.00 RMSE 0.0076 0.0076 0.0076

Table 6 Parameter and industry profiles. Shows parameter and industry profiles across all pairs, the 10 percent most profitable pairs and the 1 percent most profitable pairs. Thereby, PBD PBD PBD σM , σW and σR denote the variances of the stationary AR(1) process, the noise compo- PBD M nent and the random walk, respectively. In addition, ρ is the AR(1) coefficient, 2R is the proportion of mean-reversion attributable to the stationary AR(1) process and βPBD is the spread coefficient

10% most 1% most All pairs protable protable βPBD 0.2371 0.2591 0.2725

PBD σM 0.0009 0.0016 0.0030

PBD σW 0.0002 0.0004 0.0007

PBD σR 0.0015 0.0008 0.0004 s M 4.5000 4.0000 4.2857 sW s M 0.6000 2.0000 7.5000 s R ρPBD 0.8917 0.8165 0.8050

M R2 0.7187 0.7005 0.7162 Basic Materials [%] 13.1813 9.5337 2.0619 Consumer Goods [%] 12.8497 10.3627 7.2165 Consumer Services [%] 12.5078 14.5078 8.2474 Financials [%] 14.2383 12.4352 14.4330 Health Care [%] 7.4611 6.0104 3.0928 Industrials [%] 16.5699 14.3005 8.2474 Oil & Gas [%] 5.7306 5.2850 1.0309 Technology [%] 9.4197 11.1917 18.5567 Telecommunications [%] 0.8290 0.9326 2.0619 Utilities [%] 7.2124 15.4404 35.0515

171 Jonas Rende

5.2. Beyond returns: evaluating backtest assumptions Within this section, we aim to quantify the impact of the execution limitations underlying our backtester, as immediate order execution is often the baseline in the high-frequency pairs trading literature – among others, see Miao (2014) and Stübinger and Bredthauer (2017). For this purpose, we relax our backtest- ing frictions step-by-step and evaluate the performance. In total, we consider six new strategy variants – three per approach. Table 7 summarizes which execution constraints are active in the different strategies variants. The other parameters remain the same as in the fully constraint backtesting framework – see Table 2. First, we forgo on short-selling fees (PBD 1, PCI 1), i.e., we keep the execution gap and volume constraints. Second, we only account for an execution gap (PBD 2, PCI 2). Third, we do not account for any execution limitations (PBD 3, PCI 3).

Table 7 Summary of constraints in the relaxed backtesting framework. Shows the restrictions imposed in the different strategy variants for the PBD (PBD 1, PBD 2 and PBD 3) and the PCI model (PCI 1, PCI 2 and PCI 3)

Constraint PBD 1 PBD 2 PBD 3 PCI 1 PCI 2 PCI 3 Execution gap, 1 1 0 1 1 0 in periods

Y Y Volume con- (νo > 0 (νo > 0 X None None X None None straint (open) & νo > 0) & νo > 0) Y Y Volume con- (νc > 0 (νc > 0 X None None X None None straint (close) & νc > 0) & νc > 0) Short-selling 0 fees [%, p.a.]

Next, we will outline how flexibilizing execution limitations affect risk and returns metrics. Table 8 provides an overview of the performance of the different strategy variants. Relaxing – short-selling fees: For the PBD model, not accounting for short- selling fees (PBD 1) has a minor effect. With a value of 0.04 percent per day after costs the mean return practically does not change. In general, risk metrics slightly decrease or remain on the same level and return metrics as well as risk-return metrics slightly increase. For example, the annualized Sharpe ratio increases from 0.76 to 0.84. The PCI model (PCI 1) benefits more from not charging the short-selling fee, although the absolute impact is small. As an example, con- sider the change in the mean return after transaction costs from –0.06 percent

172 Pairs trading with the persistence-based decomposition model

to –0.05 percent per day. The reason for the latter is that the average time pairs are open per round-trip trade is smaller than 1 for both strategies and therefore often solely the minimum fee for one day is charged. Thus, for the PCI model, the impact on trading results is larger because the average number of round-trip trades per pair is much larger. Note that both the average number of round-trip tra- des per pair, as well as the average time pairs are open per round-trip trade, almost do not change (not shown).

Table 8 Daily and annualized risk-return metrics – relaxed backtesting. Panel A depicts daily return characteristics, panel B depicts risk and panel C annualized risk-return metrics for different strategy variants of the PBD model (PBD 1, PBD 2 and PBD 3), the PCI model (PCI 1, PCI 2 and PCI 3) and the general market (MKT)

PBD 1 PBD 2 PBD 3 PCI 1 PCI 2 PCI 3 MKT A Mean return 0.0004 0.0016 0.0035 -0.0005 0.0011 0.0027 0.0002 Standard Error 0.0001 0.0002 0.0003 0.0002 0.0003 0.0002 0.0002 (N-W) t-Statistic 3.3471 8.9109 13.2533 -2.6337 4.3191 13.4990 1.5479 Minimum -0.0583 -0.0585 -0.0599 -0.0880 -0.0914 -0.0832 -0.0903 25% Quantile -0.0031 -0.0027 -0.0018 -0.0051 -0.0043 -0.0020 -0.0055 Median 0.0001 0.0005 0.0014 -0.0003 0.0003 0.0017 0.0005 75% Quantile 0.0035 0.0044 0.0059 0.0043 0.0055 0.0067 0.0061 Maximum 0.0860 0.0864 0.1303 0.0795 0.0909 0.1045 0.1158 Share > 0 0.5598 0.5756 0.6105 0.4755 0.5172 0.6015 0.5301 Standard dev. 0.0076 0.0088 0.0108 0.0103 0.0119 0.0110 0.0124 Skewness 0.6117 1.5260 2.4645 -0.3129 0.8838 0.8623 -0.0272 Kurtosis 10.4667 11.2909 14.7676 7.5936 9.0918 11.9495 7.7488 B Hist. VaR 1% -0.0312 -0.0245 -0.0146 -0.0449 -0.0408 -0.0434 -0.0515 Hist. CVaR 1% -0.0281 -0.0267 -0.0238 -0.0432 -0.0420 -0.0391 -0.0483 Hist. VaR 5% -0.0091 -0.0067 -0.0022 -0.0168 -0.0132 -0.0099 -0.0184 Hist. CVaR 5% -0.0161 -0.0148 -0.0130 -0.0260 -0.0242 -0.0212 -0.0293 Maximum 0.4588 0.4119 0.2366 0.9778 0.9345 0.3847 0.5678 drawdown

173 Jonas Rende

Table 8 cont.

PBD 1 PBD 2 PBD 3 PCI 1 PCI 2 PCI 3 MKT C Return p.a. 0.1004 0.4689 1.3860 -0.1309 0.2946 0.9397 0.0438 Standard dev. p.a. 0.1201 0.1391 0.1709 0.1640 0.1894 0.1739 0.1974 Downside dev p.a. 0.0774 0.0713 0.0611 0.1225 0.1139 0.0954 0.1396 Sharpe ratio p.a. 0.8362 3.3701 8.1089 -0.7984 1.5555 5.4027 0.2217 Sortino ratio p.a. 1.2973 6.5781 22.6789 -1.0686 2.5852 9.8540 0.3135

Relaxing – short-selling fees and volume constraints: Naturally, the question arises as to how to trade in the backtesting engine when there might be zero trad- ing volume available at a point in time. The problem originates from the fact that the econometric approaches applied in trading applications demand equidistant observations – a requirement which is often not fulfilled in high-frequency stock price data sets. To address the equidistance issue prices, are usually forward filled and missing volume information are replaced with zeros. Thus, to account for the latter in the restricted backtesting framework, we impose the constraint that the trading volume for both stocks have to be greater than zero. Solely accounting for an execution gap leads to an increase in daily mean returns after transaction costs of 300 percent resulting in an annualized return of 46.89 percent for the PBD model (PBD 2). A comparison of PCI 1 and PCI 2 yields that daily returns after costs is now positive and highly significant with a value of 0.11 percent. This translates to an annualized return of 29.36 percent. We observe that for both models standard deviation (daily and annualized) and kurtosis slightly increase, but besides that all distributional characteristics as well as risk and return metrics exhibit favorable changes compared to PBD 1 and PCI 1, respectively. Thereby, the PBD model is outperforming the PCI model in every metric. Relaxing – short-selling fees, volume constraints and execution gap: Finally, we do not consider any execution restrictions. As a consequence annual returns increase to 138.60 percent after costs for the PBD model (PBD 3) and to 93.97 percent for the PCI model (PBD 3). If we contrast PBD 3 with PBD 2and PCI 3 with PCI 2 we observe significant improvements in terms of return, risk and, risk and return metrics with the exception of standard deviations and kurtosis which is as expected. As in the other cases PBD is superior. From our analysis we can draw the following conclusions: First, the PBD model is superior to the benchmark model for every strategy variant. Second, short-selling fees have a minor impact on returns. Third, the timing of order execution in light of volume constraints and a general execution gap is crucial

174 Pairs trading with the persistence-based decomposition model

for the success of a strategy. Thereby, the volume constraints contribute more to the relative increase in returns compared to the waiting rule in our setting. The sharp increase of returns after not accounting for an execution gap and volume constraints indicates that the driving force behind large high-frequency delay-zero alpha returns is the bid-ask bounce. The latter finding for the effect of a waiting rule is not new: Among others, Gatev et al. (2006) report decreasing returns after accounting for an execution gap on daily data. Bowen et al. (2010) and Bowen and Hutchinson (2016) report similar patterns for high-frequency data. To the best of our knowledge, no high-frequency pairs trading study quantifies the effect of volume constraints. Thus, not accounting for an execution gap and volume constraints might result in significantly upward-biased returns. As in our case, annual returns increase from about 10 percent after costs to about 138 percent for the PBD model if we do not account for any execution gap.

6. Conclusion

With this manuscript, we have made three contributions to the high-frequency pairs trading literature. First, we apply the persistence-based decomposition model introduced by Rende et al. (2019) to S&P 500 minute-by-minute starting from 1998-01-02 and ending 2016-11-18. The PBD model decomposes a spread into the sum of a random walk (infinite persistence), a stationary AR(1) process (finite persistence) and noise (no persistence) to account for different levels of persistence. Thereby, our backtester accounts for (i) bid-ask bounce typically embedded in high-frequency data (Aït-Sahalia and Yu 2009); (ii) signal process- ing; (iii) entry and exit volume constraints, i.e., we solely trade if the available trading volume is larger than zero for both stocks, otherwise we postpone the trade until the condition is met; (iv) short-selling fees and; (v) conservative transaction costs of 0.20 percent per full turn per pair. The PBD model yields an annual return after costs of 9.16 percent compared to a value of –14.41 percent for the partial cointegration (PCI) (Clegg and Krauss 2018) and 4.38 percent for a simple buy-and-hold strategy. The PBD model is superior in terms of risk and return metrics. Second, we shed light onto the model mechanics and the development of return over time. We find that the volume constraint is violated in 22.04 percent of all trades, giving a first indication that volume constraints are not negligible. Evaluating the performance over time yields that the PBD model shows an outstanding performance in bear markets such as the financial crisis. A deep-dive analysis points out that during turbulent market phases the short-leg contributes significantly more to the profitability of the strategy than the long leg. While this finding is not new for pairs trading strategies applied

175 Jonas Rende

daily (Bowen and Hutchinson 2016), we are the first high-frequency pairs trad- ing study reporting this effect for the U.S. market. Fama-French type analyses provide evidence that our returns can only partially be explained by exposure to systematic risk. Concerning parameter and industry profiles of the most profit- able pairs we conclude that the most influential drivers are industry affiliation and a dominance of exploitable mean-reversion over the pronouncedness of the random walk. Specifically, we find that among the most profitable pairs, the share of industries with a similar business model is very high. The third con- tribution is to quantify the effect of not accounting for execution limitations. If no execution limitations are imposed during the backtesting process, the PBD model yields annual returns after costs of 138.6 percent. In addition, all risk and return metrics improve. These results indicate that intraday pairs trading returns are highly sensitive to execution limitations.

References

[1] Aït-t-Sahalia, Y., Yu, J. (2009) ‘High frequency market microstructure noise estimates and liquidity measures’, The Annals of Applied Statistics, vol. 3, pp. 422–457, doi: 10.1214/08-AOAS200. [2] Avellaneda, M. and Lee, J.H. (2010) ‘Statistical Arbitrage in the U.S. Equities Market’, Quantitative Finance, vol. 10, pp. 761–782. [3] Bertram, W.K. (2010) ‘Analytic solutions for optimal statistical arbitrage trading’, Physica A: Statistical Mechanics and its Applications, vol. 389, pp. 2234–2243, [Online], Available: https://www.jstor. org/stable/pdf/2953682. pdf. [4] Bowen, D., Hutchinson, M.C. and O’Sullivan, N. (2010) ‘High Frequency Equity Pairs Trading: Transaction Costs, Speed of Execution and Patterns in Returns’, Journal of Trading, vol. 5, pp. 31–38. [5] Bowen, D.A. and Hutchinson, M.C. (2016) ‘Pairs trading in the UK eq- uity market: risk and return’, The European Journal of Finance, vol. 22, pp. 1363–1387. [6] Caldeira, J.F. and Moura, G.V. (2013) ‘Selection of a Portfolio of Pairs Based on Cointegration: A Statistical Arbitrage Strategy’, Brazilian Review of Fi- nance, pp. 49–80. [7] Chen, H., Chen, S., Chen, Z. and Li, F. (2017) ‘Empirical Investigation of an Equity Pairs Trading Strategy’, Management Science, vol. 65. [8] Clegg, M. (2015), ‘Modeling Time Series with Both Permanent and Transient Components Using the Partially Autoregressive Model’, SSRN Electronic Journal, doi: 10.2139/ssrn.2556957. [9] Clegg, M. and Krauss, C. (2018) ‘Pairs trading with partial cointegration’, Quantitative Finance, vol. 18, pp. 121–138, doi:10.1080/14697688.2017.1 370122.

176 Pairs trading with the persistence-based decomposition model

[10] Clegg, M., Krauss, C. and Rende, J. (2018) ‘partialCI: Partial Cointegration’, [Online], Available: https://CRAN.R-project.org/package=partialCI. [11] Cont, R. (2001) ‘Empirical properties of asset returns: Stylized facts and statistical issues’, Quantitative Finance, vol. 1, 223–236. [12] Do, B. and Faff, R. (2010) ‘Does Simple Pairs Trading Still Work?’, Financial Analysts Journal, vol. 66, pp. 83–95. [13] Do, B. and Faff, R. (2012) ‘Are pairs trading profits robust to trading costs?’, Journal of Financial Research, vol. 35, pp. 261–287. [14] Dowle, M. and Srinivasan, A. (2018) ‘data.table: Extension of «data.frame»’, [Online], Available: https://CRAN.R-project.org/package=data.table. [15] Dunis, C.L., Giorgioni, G., Laws, J. and Rudy, J. (2010) ‘Statistical Arbitrage and High-Frequency Data with an Application to Eurostoxx 50 Equities’, Liverpool Business School, Working paper. [16] Endres, S. and Stübinger, J. (2017) ‘Optimal trading strategies for Lévy- driven Ornstein–Uhlenbeck processes’, FAU Discussion Papers in Economics, Friedrich-Alexander University Erlangen-Nuremberg, Institute for Economics. [17] Engle, R.F. and Granger, C.W.J. (1987) ‘Co-Integration and error correction: representation, estimation, and testing’, Econometrica, vol. 55, pp. 251–276, doi:10.2307/1913236. [18] Fama, E.F. and French, K.R. (1996) ‘Multifactor explanations of Asset Pricing Anomalies’, The Journal of Finance, vol. 51, pp. 55–84. [19] Fama, E.F. and French, K.R. (2015) ‘A five-factor asset pricing model’,Journal of Financial Economics, vol. 116, pp. 1–22. [20] Fischer, B. and Pau, G. (2017) ‘rhdf5: HDF5 interface to R’, [Online], Avail- able: https://www.bioconductor.org/packages/release/bioc/vignettes/rhdf5/ inst/doc/rhdf5.html. [21] Gatev, E., Goetzmann, W.N. and Rouwenhorst, K.G. (2006) ‘Pairs Trading: Performance of a Relative-Value Arbitrage Rule’, Review of Financial Stud- ies, vol. 19, pp. 797–827. [22] Huck, N. (2015) ‘Pairs trading: does volatility timing matter?’, Applied Eco- nomics, vol. 47, pp. 6239–6256. [23] Izrailev, S. (2014) ‘tictoc: Functions for timing R scripts, as well as imple- mentations of Stack and List structures’, [Online], Available: https://CRAN. R-project.org/package=tictoc. [24] Kakushadze, Z. (2016) ‘101 Formulaic Alphas’, Wilmott, vol. 2016, pp. 72–81, doi:10.1002/wilm.10525. [25] Kim, K. (2011) ‘Performance Analysis of Pairs Trading Strategy Utilizing High Frequency Data with an Application to KOSPI 100 Equities’, SSRN Electronic Journal, doi:10.2139/ ssrn.1913707. [26] Kishore, V. (2012) ‘Optimizing Pairs Trading of US Equities in a High Fre- quency Setting’, [Online], Available: https://quantlabs.net/blog/wp-content/ uploads/2015/04/Kishore-Optimize-Pair-Trading-HFT.pdf.

177 Jonas Rende

[27] Krauss, C. (2017) ‘Statistical arbitrage pairs trading strategies: Review and outlook’, Journal of Economic Surveys, vol. 31, pp. 513–545. [28] Krauss, C., Anh Do, X. and Huck, N. (2017) ‘Deep neural networks, gradient- boosted trees, random forests: statistical arbitrage on the S & P 500’, European Journal of Operational Research, vol. 259, pp. 689–702. [29] Liu, B., Chang, L.B. and Geman, H. (2017) ‘Intraday pairs trading strategies on high frequency data: the case of oil companies’, Quantitative Finance, vol. 17, pp. 87–100, doi: 10.1080/14697688. 2016.1184304. [30] Luethi, D., Erb, P. and Otziger, S. (2018) ‘FKF: Fast Kalman Filter’, [Online], Available: https://CRAN.R-project.org/package=FKF. [31] Miao, G.J. (2014) ‘High Frequency and Dynamic Pairs Trading Based on Statistical Arbitrage Using a Two-Stage Correlation and Cointegration Ap- proach’, International Journal of Economics and Finance, vol. 6, pp. 96–110. [32] Mikkelsen, A., Kjærland, F. (2018) ‘High-frequency Pairs Trading on a Small Stock Exchange’, International Journal of Economics and Financial Is- sues, vol. 8, pp. 78–88, [Online], Available: https://brage.bibsys.no/xmlui/ bitstream/11250/ 2574680/5/Mikkelsen.pdf. [33] Nath, P. (2003) ‘High frequency pairs trading with US treasury securities: Risks and rewards for hedge funds’, [Online], Available: https://papers.ssrn. com/sol3/papers.cfm?abstract_id=565441. [34] Papadakis, G. and Wysocki, P. (2007) ‘Pairs Trading and Accounting Informa- tion’. Boston University and MIT Working Paper. [35] Peterson, B.G., Carl, P. (2018) ‘Performance Analytics: econometric tools for performance and risk analysis’, [Online], Available: https://CRAN.R-project. org/package=PerformanceAnalytics. [36] Poterba, J.M. and Summers, L.H. (1988) ‘Mean reversion in stock prices’, Journal of Financial Economics, vol. 22, pp. 27–59, doi:10.1016/0304- 405X(88)90021-9. [37] QuantQuote (2012) ‘Minute Market Data’, [Online], Available: https://quant- quote.com/docs/QuantQuote_Minute.pdf. [38] QuantQuote (2016) ‘QuantQuote market data and software’, [Online], Avail- able: URL: https://www.quantquote.com/. [39] R Development Core Team (2018) R: A Language and Environment for Sta- tistical Computing. Vienna, Austria. [40] Rad, H., Low, R.K.Y. and Faff, R. (2016). The Profitability of Pairs Trading Strategies: Distance, Cointegration and Copula Methods, Quantitative Fi- nance, vol. 16, pp. 1541–1558. [41] Rende, J., Krauss, C. and Clegg, M. (2019) ‘The persistence-based decomposi- tion (PBD) time series model: theory and empirical application’, vol. 01/2019 of FAU Discussion Papers in Economics.

178 Pairs trading with the persistence-based decomposition model

[42] Ryan, J.A. and Ulrich, J.M. (2018) ‘xts: Extensible Time Series’, [Online], Available: https://CRAN.R-project.org/package=xts. [43] Schwarz, G. (1978) ‘Estimating the Dimension of a Model’, The Annals of Statistics, vol. 6, pp. 461–464, [Online], Available: https://projecteuclid.org/ download/pdf_1/euclid.aos/1176344136, doi:10. 1214/aos/1176344136. [44] Sharpe, W.F. (1994) ‘The Sharpe Ratio’, The Journal of Portfolio Manage- ment, vol. 21, pp. 49–58, doi:10.3905/jpm.1994.409501. [45] S&P 500 Dow Jones Indices (2015) ‘Equity S&P 500’, [Online], Available: http://us.spindices.com/ indices/equity/sp-500. [46] Stübinger, J. and Bredthauer, J. (2017) ‘Statistical Arbitrage Pairs Trading with High-frequency Data’, International Journal of Economics and Financial Issues, vol. 7, pp. 650–662, [Online], Available: https://econjournals.com/ index.php/ijefi/article/download/5127/pdf. [47] Stübinger, J. and Endres, S. (2018) ‘Pairs trading with a mean-reverting jump- diffusion model on high-frequency data’, Quantitative Finance, vol. 18, pp. 1735–1751. [48] Stübinger, J., Mangold, B. and Krauss, C. (2018) ‘Statistical arbitrage with vine copulas’, Quantitative Finance, vol. 18, pp. 1831–1849, doi:10.1080/ 14697688.2018.1438642. [49] The HDF Group (2010) ‘Hierarchical data format version 5’, [Online], Avail- able: http://www.hdfgroup.org/HDF5. [50] Vidyamurthy, G. (2004) Pairs Trading: Quantitative Methods and Analysis. Hoboken: Wiley Finance. John Wiley & Sons Inc.

Summaries

Sylvia Endres: Review of stochastic differential equations in statistical arbi- trage pairs trading . Managerial Economics 2019, vol. 20, No. 2 JEL classification: G11, G12, G14

Keywords: statistical arbitrage, pairs trading, stochastic models, mean-reversion, stochastic differential equations

The use of stochastic differential equations offers great advantages for statistical arbitrage pairs trading. In particular, it allows the selection of pairs with desirable properties, e.g., strong mean- reversion, and it renders traditional rules of thumb for trading unnecessary. This study provides an exhaustive survey dedicated to this field by systematically classifying the large body of litera- ture and revealing potential gaps in research. From a total of more than 80 relevant references, five main strands of stochastic spread models are identified, covering the ‘Ornstein–Uhlenbeck model’, ‘extended Ornstein–Uhlenbeck models’, ‘advanced mean-reverting diffusion models’, ‘diffusion models with a non-stationary component’, and ‘other models’. Along these five main categories of stochastic models, we shed light on the underlying mathematics, hereby reveal- ing advantages and limitations for pairs trading. Based on this, the works of each category are further surveyed along the employed statistical arbitrage frameworks, i.e., analytic and dynamic programming approaches. Finally, the main findings are summarized and promising directions fur future research are indicated.

Jessica Hastenteufel, Mareike Staub: Current and future challenges of family businesses . Managerial Economics 2019, vol. 20, No. 2

JEL classification: L21, L26, M12, M14

Keywords: family business, family business governance, business succession

Family businesses are an important part of every economy. They are characterized by long tradi- tions that combine aspects such as trust and reliability, as well as by features such as innovative- ness, foresight, long-term focus and flexibility. Both family businesses and the entrepreneurial families themselves do have some weaknesses and face current challenges like digitization, internationalization and demographic change. These issues must be kept in mind in order to constantly develop appropriate solutions that will help them survive and thrive in the market. Moreover, the high relevance of the family in a family business is associated with opportunities – for example, when a family strategy with clear values, roles and goals is defined, and a so- called family business governance is developed.

181 Summaries

Katarzyna Kowalska-Jarnot: Selected methods of studying a college’s image . Managerial Economics 2019, vol. 20, No. 2 JEL classification: D20, M31

Positive image and reputation building are important strategies in developing the permanent recognition of a college in the education market. Although it seems challenging to measure image and reputation, they are an important opportunity to gain a competitive advantage due to the fact that they are unique, hard to imitate and increase a college’s chances of attracting more students. Image management requires systematic marketing research. This article is the author’s proposition for a set of image research methods that can be used by colleges.

Keywords: organization identity, image research, reputation

Jonas Rende: Pairs trading with the persistence-based decomposition model . Managerial Economics 2019, vol. 20, No. 2 JEL classification: G11, G12, G14

Keywords: persistence, high-frequency, pairs trading, permanent components, transient com- ponents, noise

Recently, the persistence-based decomposition (PBD) model has been introduced to the scientific community by Rende et al. (2019). It decomposes a spread time series between two securities into three components capturing infinite, finite, and no shock persistence. The authors provide empirical evidence that the model adopts well to noisy high-frequency data in terms of model fitting and prediction. We put the PBD model to test on a large-scale high-frequency pairs trad- ing application, using S&P 500 minute-by-minute data from 1998 to 2016. After accounting for execution limitations (waiting rule, volume constraints, and short-selling fees) the PBD model yields statistically significant and economically meaningful annual returns after transaction costs of 9.16 percent. These returns can only partially be explained by the exposure to common risk. In addition, the model is superior in terms of risk-return metrics. The model performs very well in bear markets. We quantify the impact of execution limitations on risk and return measures by relaxing backtesting restrictions step-by-step. If no restrictions are imposed, we find annual returns after costs of 138.6 percent. Instruction for authors

Before submitting the paper we encourage the authors to use English lan- guage editing support.

Papers which are to be published in Managerial Economics should be pre- pared according to the following guidelines.

All illustrations, figures, and tables are placed within the text at the appropri- ate points, rather than at the end.

Title page should include a footnote, giving the author(s) affiliation(s) (in- cluding postal and e-mail addresses of all authors).

Figures must be prepared in a form suitable for direct reproduction. Digital artwork at least 300 dpi resolution is accepted. Photographs, on glossy paper (9 by 13 cm or larger), should display sharp contrast. Figures, tables and photographs should be numbered according to their reference in text.

Illustrations should be edited in CorelDraw (*.CDR), DrawPerfect (*.WPG) or in any other vector graphics form e.g. HPGL, Encapsulated PostScript (*.EPS), Computer Graphics Metafile *.CGM) or bitmaps (*.TIF, *.PCX).

Mathematical equations within the text should be written in separate lines, numbered consecutively (numbers within round brackets) on the right-hand side. Greek characters must be written out clearly.

Summary and 3–5 keywords should be submitted in separate file containing the name of the author, title of the paper with the heading “Summary”.

Authors using Word are requested to employ, as far as possible, text form of mathematical symbols leaving graphic form for the equations longer than single line.

183 Instruction for authors

Reference style

In general, the authors should use the Harvard style of referencing. Refer- ences to literature within the text should be given in the form: the name of the author(s) and the year of publication (in parentheses), e.g. “Smith (1990) under- lines...”, “As shown in Smith (1990)...”. In case of more than two authors of the cited publication the “et al.” shortcut should be used. Lists of references should be written in alphabetical-chronological order, numbered and follow the rules:

• JOURNAL ARTICLE

Muller, V. (1994) ‘Trapped in the body: Transsexualism, the law, sexual identity’, The Australian Feminist Law Journal, vol. 3, August, pp. 103–107.

• BOOKS

Book with one author

Adair, J. (1988) Effective time management: How to save time and spend it wisely, London: Pan Books.

Book with two authors

McCarthy, P. and Hatcher, C. (1996) Speaking persuasively: Making the most of your presentations, Sydney: Allen and Unwin.

Book with three or more authors

Fisher, R., Ury, W. and Patton, B. (1991) Getting to yes: Negotiating an agreement without giving in, 2nd edition, London: Century Business.

Book – second or later edition

Barnes, R. (1995) Successful study for degrees, 2nd edition, London: Routledge.

Book by same author in the same year

Napier, A. (1993a) Fatal storm, Sydney: Allen and Unwin. Napier, A. (1993b) Survival at sea, Sydney: Allen and Unwin.

184 Instruction for authors

Book with an editor

Danaher, P. (ed.) (1998) Beyond the ferris wheel, Rockhampton: CQU Press.

A chapter in a book

Byrne, J. (1995) ‘Disabilities in tertiary education’, in Rowan, L. and McNamee, J. (ed.) Voices of a Margin, Rockhampton: CQU Press.

• WORLD WIDE WEB PAGE

Young, C. (2001) English Heritage position statement on the Valletta Convention, [Online], Available: http://www.archaeol.freeuk.com/EHPostionStatement.htm [24 Aug 2001].

• CONFERENCE PAPERS

Hart, G., Albrecht, M., Bull, R. and Marshall, L. (1992) ‘Peer consultation: A pro- fessional development opportunity for nurses employed in rural settings’, Infront Outback – Conference Proceedings, Australian Rural Health Confer- ence, Toowoomba, pp. 143–148.

• NEWSPAPER ARTICLES

Cumming, F. (1999) ‘Tax-free savings push’, Sunday Mail, 4 April, p. 1.

All the items cited in the main text, and no other items, must be placed in the list of references.

Authors should include 2–3 JEL codes with manuscript during submission. For more details on the JEL classification system CLICK HERE.

Information about the journal and the deadlines for submitting articles for next issues are presented at

http://www.managerial.zarz.agh.edu.pl

IMPORTANT NOTE: Instead the traditional e-mail-based procedure, the authors are encouraged to use the new Online Submission System that has just been activated.

185

Double blind peer review procedure

1. In order to assess a quality of submitted publication the Editorial Board consults at least two outside referees (not affiliated to any of the authors’ institutions) which are recognized experts in the specific field.

2. At least one of the referees must represent a foreign institution (i.e. an insti- tution located in other country than the home institution of each author).

3. The journal uses double blind peer review policy, i.e. neither the author nor the referee knows the identity of the other. In addition, each referee signs a declaration of no conflict of interests, whereby the conflict of interest is defined as a direct personal relationship (kinship to the second degree, legal relationships, marriage) or a professional scientific cooperation between the referee and the author which took place at least once in the past two years preceding the year of accepting the invitation to review.

4. Only written referee reports are considered (journal does not accept face- to-face or phone-call- based reports). Each report should clearly express the referee’s final recommendation (i.e. whether the article should be accepted for publication, revised or rejected). The referees are kindly requested to fill the review form which can be found in “For reviewers” section. In general, the referees are asked to:

– assess: • the scientific importance of the submission’s topic, • the quality of research;

– verity whether: • the Abstract is concise and informative, • the facts and interpretations are satisfactorily separated in the text, • the interpretations and conclusions follow from the data, • the length and structure of the paper is appropriate, • the paper can be shortened without loss of quality,

187 Double blind peer review procedure

• all the tables and figures are necessary, • the diagrams and photographs are of good quality, • there are all essential figures that should be prepared, • all the references are exact, • the manuscript requires proof reading by native speaker, • there is sufficient attention given to previous research.

5. The names of the referees of particular articles are classified.

6. Once a year the journal publishes the complete list of referees.