arXiv:2106.11484v1 [q-fin.PM] 22 Jun 2021 * e-mail: 2 e-mail: e-mail: 1 orsodn author. Corresponding etrlprflootmzto yjdcosslcinof selection judicious by optimization portfolio Sectoral nttt fMngmn ehooy hzaa-001 India. Ghaziabad-201001, Technology, Management of Institute Roor Technology of Institute Indian Mathematics, of Department iiain eododrsohsi oiac,In-sample dominance, stochastic order Second timization, hr aisaecrflyslce n meddquantitat embedded and impo selected process. the carefully highlights are This ratios where periods. ratio m othe out-of-sample Rachev 500 all all ratio, BSE almost Sortino S&P outperforms VaR, for CVaR, PCA-SPO(B) months deviation, 24 strategy downside and sch proposed 12 window 9, the rolling 6, a that 3, using of 2020 line ass March time is to of-sample strategies 2014 rat April proposed financial from of without years and performance with Empirical model, SSD study. nominal PCA-SPO( step as one one and latter the and formed PCA-SPO(A) strategy as The termed portfolio. is optimal ratios an communalities cri build their SSD to com the of utilized integrating the M basis model from to the (SPO) one 2004 Optimization on ways, Portfolio April category two from each in years from sector 10 other each of from period ratios a V dominant over and wise rep Profitability, sector Solvency, sector Liquidity, plied each namely W to ratios, corresponding investment. of ratios optimal financial final known the well derive incorpor to model criteria r optimization dominance financial portfolio dominant the out firs use filter by thereafter, to it wise incorporate sector to analysis attempt we ponent paper, this In challenge. mt hra [email protected] Sharma: Amita [email protected] Gupta: K. [email protected] Shiv Dhingra: Vrinda Keywords h eut bandfo h rpsdsrtge r compa are strategies proposed the from obtained results The mode optimization portfolio in investment value Embedding rnaDhingra Vrinda rnia opnn nlss iaca ais Sectoral Ratios, Financial analysis, component Principal nnilrto i PCA via ratios financial 1 mt Sharma Amita , Abstract 1 2 hvK Gupta K. Shiv , e-467 India. kee-247667, tn eododrstochastic order second ating n u-fsml analysis out-of-sample and lain C ste ap- then is PCA aluation. tneo au investment value of rtance sn h omrextracted former the using com- principal employing t se vrtepro f6 of period the over essed vl nprfloselection portfolio in ively to rmec etrand sector each from atios ei ncntansi then is constraints in teria n TR aisover ratios STARR and , eetn orcategories four resenting h w tpSectoral step two The . osdrattlo 11 of total a consider e e ihteSOmodel SPO the with red m ihvrigout- varying with eme oesi em of terms in models r o o computational for ios shsawy ena been always has ls rh21 oextract to 2014 arch oetslto and solution ponent B). re.W observe We arket. otoi op- portfolio 1* 1 Introduction

Benjamin Graham (Graham and Dodd 1934; Graham 1949) and Warren Buffet are among the famous personalities who emphasized on the importance of using fundamental analysis into investment, also known as value investment. The aim of fundamental analysis is to pick fundamentally strong assets to produce robust performance of a portfolio over the investment period. Financial analysis (FA) which is a subset of fundamental analysis, focuses on financial ratios (FRs) to identify the strengths and weaknesses of a company in terms of its financial performance e.g. sales, earnings, debt, liquidity, and liabilities among others. Financial ratio (FR) is the ratio of two appropriately chosen numerical values of the financial entities of a company to figure out the relative performance. For example, current ratio collates current assets to current liabilities, gauging a firm’s ability to meet its short term obligations whereas net profit margin is the net profit earned by a company as a percentage of its total revenue, highlighting overall growth of the firm. The broad categories into which the FRs can be classified are: profitability, liquidity, solvency, and valuation. The importance and usefulness of FRs has been repeatedly demonstrated in various stud- ies (Beaver 1966; Chen and Shimerda 1981; Delen et al. 2013; Arkan 2016). Researchers, accountants, and financial analysts have been using financial ratios for the assessment of , forecasting failures, predicting future profits, and liquidity positions among others. The im- portance of FRs is not limited to forecasting and predictive analysis, it has also been adopted in portfolio selection. Several researchers (Edirisinghe and Zhang 2007; Samaras et al. 2008) have incorporated the FRs by employing MCDM techniques to filter out elite stocks and then construct an optimal portfolio. Yu et al. (2009) and Silva et al. (2014) used several fundamental indicators like ROE (return on equity), Debt-ratio, and P/E ratio jointly with technical indicators to generate optimal portfolios using evolutionary algorithms. On the other hand, few others (Sharma and Mehra 2017; Tarczy´nski and Tarczy´nska-Luniewska 2017) emphasized on the importance of sector-wise comparison of the assets when FRs is one of the selection criteria. Considering the movement in share price alone is a naive way to evaluate the performance of an asset. It is therefore advisable to invest in a company which is doing good in terms of its financial entries or FRs. However, when and how to include the FRs quantitatively in portfolio selection process impacts its performance significantly. For instance, it is well- known that the sectors behave differently owing to their varied business functionalities and operational structures and therefore, comparison between two companies belonging to differ- ent sectors 1 on the basis of same FRs is not wise. A very few studies focus on this important step. The authors (Sharma and Mehra 2017) attempted to resolve this issue by proposing a two-step investment scheme, termed as SPO in their paper, by first solving the second order stochastic dominance (SSD) portfolio optimization model for each sector separately on the basis of their FR values and then a final optimal portfolio is obtained by combining the weights derived from step 1 along with their mean return performance. They used the SSD criteria from the benchmark index in constraints to generate portfolios in both the steps. SPO, though having several advantages over the classical SSD model, incorporates value

1In financial markets, sectors are the groups of stocks classified according to their similar business func- tionality and operations

2 investment in the form of four fixed FRs common for all sectors. However, it is well known that ratios that are found to be critical for one sector may not be so for another sector because companies in different sectors have different market and capital structures. For instance, the banking and finance sector rely on credit calculations, therefore, liquidity and solvency ratios are more important to this sector whereas the construction industry being capital intensive relies hugely on equipment to build projects and thus profitability ratios are most valuable among construction based firms. Hence, the choice of ratio(s) in each sector is crucial in the decision making process and thus considering fixed set of FRs for all sectors unable to reflect the importance of value investment. To address this issue, we propose to apply Principal Component Analysis (PCA) sector wise to extract most dominant financial ratios for each sector and thereafter use the SPO two step scheme to generate final optimal portfolio. PCA is a non-parametric statistical tool that transforms a set of variables into few factors called principal components, retaining maximum information of the original variables. We offer to extract four dominant ratios explaining atleast 80% of the total variation in each sector from PCA in two ways, first by fetching the four ratios corresponding to each principal component and second by selecting from each of the four categories of ratios on the basis of their communalities. We then use the SPO model to derive final optimal portfolios constructed from both types of extractions. We named the proposed strategy PCA-SPO(A) in case of deriving the overall four dominant ratios and PCA-SPO(B) when each ratio becomes the dominant ratio in its category. To widen the comparative analysis, we also solve classical SSD (Dentcheva and Ruszczy´nski 2003) model with FRs (named as F-SSD) and without FRs (named as SSD) to reflect the importance of correct usage of value investment in portfolio selection. While the SSD model considers only mean return values in its objective function, the F-SSD model combines the mean return values with the FRs in the objective function. PCA is being utilized when all assets are taken together to select the four dominant ratios in case of the F-SSD model. Also, both the SSD and F-SSD models are optimized on all assets at once. Therefore, an im- provement of the proposed strategies over F-SSD or SSD reflects the importance of utilizing FRs sector wise in asset allocation. For the purpose of computational analysis, we consider historical data of the FRs and the closing prices for the stocks listed in the 6 sectors of the S&P BSE 500 market, namely Information Technology (IT), Consumer Durable (CD), Energy, Banking and Finance, Fast Moving Consumer Goods (FMCG), and Healthcare. We chose a set of total 11 FRs repre- senting four important categories, Profitability, Liquidity, Solvency and Valuation for each of these 6 sectors for a period of 10 years from April 2004 to March 2014 to filter out domi- nant ratios using PCA. We then solve all the models under the in-sample and out-of-sample framework over the period of 6 years from April 2014 to March 2020 using a rolling window scheme. To make the comparative study robust with respect to the investment period, the out-of- sample analysis is varied over different time periods ranging from 3 months to 24 months. We solve a total of five optimization models, SPO, SSD, F-SSD and two models corresponding to the proposed strategies PCA-SPO(A) and PCA-SPO(B). The performance of the portfolios generated from these models is measured and analyzed on many fronts, including mean return, several measures and performance ratios. We consider, Conditional (CVaR) at 95% and 97%, Downside Risk, Value at risk (VaR) at 95% and 97% as risk

3 measures and Sharpe, Sortino, STARR and Rachev as performance measures. We find that the portfolios obtained from the proposed strategy PCA-SPO(B) outper- forms all other models in terms of risk-reward profile by achieving higher values of Sortino, Sharpe, Rachev and STARR ratios and lower risk values in terms of DD, VaR and CVaR in almost all the out-of-sample periods whereas PCA-SPO(A) achieves the highest mean return in almost all the periods but fails to impress much on other parameters. It shows the benefit of considering ratios from each category in the process of portfolio construction. We also note down that the F-SSD model suffers with the highest risk and under performs in other measures in comparison to other models, indicating that mere consideration of finan- cial ratios in portfolio selection may not be wise and its correct implementation in form of sector-wise comparison is even more crucial. The rest of the paper is structured as follows. Section 2 presents a brief review of literature on SSD, FA and PCA in portfolio selection. Section 3 describes the SPO model. Section 4 elucidates the proposed strategy. Section 5 communicates the sample data, PCA results, the in-sample and out-of-sample performance analysis. Section 6 concludes the paper.

2 Literature Review

We now present a brief survey of literature on SSD, PCA and FA in portfolio selection process.

2.1 Second order Stochastic Dominance Stochastic Dominance (SD) stems from the theory of majorization (Marshall and Olkin 1979) that primarily ranks two real-valued random vectors. SD is popularly known for its relationship with the different classes of utility functions (Levy 1992). Of utmost impor- tance to economics are SD of order one (FSD; Quirk and Saposnik 1962), order two (SSD; Hadar and Russell 1969) and order three (TSD; Whitmore 1970), for they respectively res- onate the behaviour of the class of rational, rational risk-averse, and rational risk-averse and ruin-averse investors. Among the three orders, the SSD is widely used in portfolio opti- mization as it describes the preference of all rational, risk-averse investors, characterized by non-decreasing concave utility functions. The traditional mean-risk formulations are unable to extract much information from the return distribution, and thus some researchers recommended the incorporation of ei- ther more than one risk measures or moments ( Konno et al. 1993; Konno and Suzuki 1995; Sharma and Mehra 2013; Sharma and Mehra 2015; Zhao et al. 2015) or SD (Roman and Mitra 2009) in portfolio optimization framework. Apart from being theoretically sound, portfolio optimization models incorporating SSD criteria from the benchmark portfolio are compu- tational efficient (Fabian et al. 2011; Bruni et al. 2012; Roman et al. 2013; Sharma et al. 2017; Sharma and Mehra 2017; Goel and Sharma 2019; Sehgal and Mehra 2020). Dentcheva and Ruszczy´nski (2003; 2006) used the lower partial (LPM) character- ization of SSD in the constraints from the benchmark portfolio. Roman et al. (2006) in- vestigated the applications of SSD in enhanced indexing by implementing the cutting plane algorithm. Fabian et al. (2011) proposed a portfolio optimization model incorporating SSD

4 using tail risk measures. Roman et al. (2013) analyzed the applications of SSD in portfo- lio generation using the reference point method. Fidan Kececi et al. (2016) compared the SSD constrained model with minimum variance and mean-variance models and observed out-performance of the former in terms of high returns and . Sharma et al. (2017) relaxed the SSD condition by introducing under-achievement and over-achievement variables for the purpose of enhanced indexing in view of the almost SD introduced by Leshno and Levy (2002). Lv et al. (1992) proposed an incremental bundle algorithm using inexact oracle for solving SSD constrained PO model. Guran et al. (2019) applied the mean-variance PO on the energy sector stocks by first eliminating inefficient stocks with the help of SSD efficiency test. Goel and Sharma (2019) introduced deviation measure in SSD and explored its application to enhanced indexing. Sehgal and Mehra (2020) proposed a robust portfolio model with SSD constraints. Zhai et al. (2020) designed a multi-constraint SSD model and solved it using Whale optimization algorithm (WOA). Moreover, extensive empirical results highlighting the superiority of WOA over other swarm intelligence algorithms are validated in the paper. To widen the application of SSD, Liang et al. (2017) and Singh and Dharmaraja (2017) applied the SD selection criteria in disappointment theory and option pricing, respectively.

2.2 Financial Analysis Edirisinghe and Zhang (2007; 2008) employed 18 financial ratios to analyze a firm’s finan- cial strength by proposing a generalized Data Envelopment Analysis (DEA) model. They first selected stocks on the basis of RFSI (relative financial strength indicator) and then formed an optimal portfolio using the mean-variance model. The RFSI was calculated in two ways, one using the raw indicators (total assets, long term debt, accounts receivable, net income and total liabilities) from the financial statements and other using the financial ratios (profitability, liquidity, leverage, valuation and growth ratios). It was observed that RFSI calculated using the latter was shown to have higher correlation with stock returns and the portfolio thus generated performed better. This highlighted the importance of FA in portfolio selection. Some authors (Samaras et al. 2008; Xidonas et al. 2009) used FA to first filter stocks and then used Multi-Criteria Decision Making (MCDM) techniques to construct optimal portfolios. Huang et al. (2014) employed FA using three key indicators- gross profit margin, return on equity and cash flow growth rate to screen fundamentally strong stocks and then an optimal portfolio was generated with the help of integrated DEA-Multi-Objective Decision Making (DEA-MODM) model. Tarczy´nski and Tarczy´nska-Luniewska (2017) emphasized on identifying the leading sec- tors in portfolio analysis. They employed several FRs, namely current ratio, return on assets, return on equity, and debt-equity ratio along with other indicators like size of sector to con- struct a fundamental power index (FPI) of sectors. Sharma and Mehra (2017) designed a two-step SSD portfolio optimization model where the stocks were optimized sector-wise on the basis of four FRs, namely return on assets, current ratio, debt-asset ratio, and P/E ratio to investigate the importance of sector wise selection. Jothimani et al. (2017) formulated a PCA-DEA model for stock selection. They used several FRs as input and output parame- ters, which were first filtered using PCA and thereafter utilized to select efficient stocks by the DEA. Galankashi et al. (2020) employed a fuzzy-Analytic Network Process (fuzzy-ANP)

5 approach in portfolio selection process by incorporating multiple financial indicators, like P/B ratio, P/E ratio, ROA(return on asset), ROE(return on equity), and EPS (earnings per share). More recently, Liagkouras et al. (2020) incorporated value investment in the form of social and environmental factors into the portfolio optimization framework.

2.3 Principal Component Analysis Principal Component Analysis (PCA), a subset of Factor analysis is a statistical tool mainly used for dimensionality reduction in wide variety of domains like computer vision and im- age processing (Hernandez and M´endez 2018), drug discovery and biomedical data (Giuliani 2017), and materials science (Shenai et al. 2012). In Finance, PCA is used for time series forecasting (Chowdhury et al. 2018), financial risk ranking (Fang et al. 2018), forecasting stock prices (Wang and Wang 2015; Zhong and Enke 2017; Ghorbani and Chong 2020), se- lection of relevant financial ratios for predictive analysis (Olufemi et al. 2012) and stock selection for optimal portfolio generation (Yu et al. 2014; Joldes 2018). Factor analysis for financial ratios was first applied by Pinches et al. (1973) where they reduced their data set of bond ratings from 35 variables to 7 variables, retaining around 63% of the total variation. Tan et al. (1997) used factor analysis on 29 ratios for a period of 12 years and derived 8 underlying factors for companies listed in Singapore. Ocal et al. (2007) applied factor analysis on the Turkish construction industry and reduced 25 financial ratios to 5 significant factors. Yap et al. (2013) investigated the application of PCA on 28 financial ratios for two industry sectors, namely consumer and trading sector and found that a smaller set of 7 and 9 ratios, respectively were sufficient and industry specific. Fulga and Dedu (2012) designed a three-step model wherein they first employ PCA and clustering techniques to generate classes of similar stocks. Stocks with minimal Value-at- Risk (VaR) in each class are then filtered and an optimal portfolio is constructed using mean-VaR model on these filtered stocks. Yu et al. (2014) formulated a PCA-SVM stock selection model. They first implemented PCA to reduce the dimensionality of the training set that comprised of data of 20 financial ratios from 677 companies and then employed Support Vector Machine (SVM) algorithm for the selection of final stocks. Nadkarni and Neves (2018) combined PCA with a neural network algorithm to generate optimal trading signals. We observed that either the researchers have totally ignored the importance of FA in the field of portfolio optimization or even if they did, then they implemented FA using different FRs depending on their popular usage. However, which ratio or a set of ratios one should use to analyze the objective at hand is a recurring question. Ideally, the selection of financial ratios should not merely be based on the popularity of their usage but rather on some theoretical or empirical evidence. The SPO model (Sharma and Mehra 2017) effectively implements the use of FR sector-wise in the optimization model but uses fixed four ratios, common for all sectors. This paper is a very first attempt to improve the SPO model by implementing Principal Component Analysis (PCA) for the selection of dominant ratios for each sector.

6 3 The SPO Model

To make the paper self-sufficient, we begin by first describing the SPO model in brief. The following notations are used throughout the paper.

3.1 Notations

Γ time horizon of investment T total number of scenarios of the time horizon Γ N total number of assets s total number of sectors ′ z =(z1, z2,...,zN ) portfolio of N assets; zj denotes fraction of RN total budget invested in asset j ∈ rj random return from jth asset

rjt return realization from jth asset at scenario t

pt probability of scenario t

Rz random return of portfolio z N

Rzt = rjtzj return realization of portfolio z at scenario t j=1 X T N

E(Rz)= rjtzj pt expected value of Rz t=1 j=1 ! X X FRz distribution function of portfolio return Rz

3.2 Financial Ratios and sectors In this subsection, we describe the financial ratios and the economic sectors.

1. Financial Ratios: Benjamin Graham, regarded as the father of fundamental analysis, was the first to idealize the concept of financial ratio analysis (Graham and Dodd 1934; Graham 1949). Financial ratio analysis, a subset of fundamental analysis, analyzes the financial statements of a company to determine its overall financial health. Unlike technical analysis, which is purely based on the market’s past and current trends, fundamental analysis offers a more logical and intellectual decision framework. Financial Ratios can be broadly classified into four categories: liquidity, profitability, solvency/leverage, and valuation whose brief description is given below:

(a) Profitability Ratio (PR) measures and evaluates the ability of a firm to utilize its resources to generate profit. Profits generated by a firm are used to fund future developments and pay off dividends to the shareholders. Therefore, the efficiency of a firm to make profits is an important factor to look after. A higher value of

7 PR is favourable. Gross/net profit margin, return on assets (ROA), return on equity (ROE), earning per share (EPS) are some of the well known profitability ratios. (b) Liquidity Ratio (LR) indicates the ability of a firm to meet its short-term debt obligations. A higher value of LR is preferable. Current ratio, quick ratio, oper- ating cash flow to current liabilities, are few examples of liquidity ratios. (c) Solvency Ratio (SR) depicts the financial risk involved in terms of long-term borrowings. More borrowings imply higher risk and thus a lower solvency ratio is desirable. Debt-equity/assets ratio, interest coverage ratio, capitalization ratio, are some well known solvency ratios. (d) Valuation Ratio (VR) helps to determine the relationship between the current share price of a company with its intrinsic value. It helps in estimating whether a company is exorbitant or cheaper compared to its growth, earnings and other prospects. Price-to-earning/sales (P/E), price-to-book ratio (P/B), dividend yield, and enterprise value multiple, are examples of valuation ratios.

2. Sectors: A sector is a large fragment of an economy where companies that are en- gaged in a similar businesses are involved, like Auto, Finance, and Healthcare among others. Based on market movement, the sectors are broadly classified as cyclical, de- fensive (non-cyclical), and sensitive. Cyclical and sensitive sectors are the ones that follow market trend, while cyclical sectors are highly sensitive to the market peaks and troughs, the sensitive sectors are comparatively moderate in their correlation to such movements. The defensive sectors, on the other hand are those which are least affected with the market oscillations. The defensive sectors are more stable in terms of their earnings than the cyclic and sensitive sectors whose returns are comparatively volatile. Therefore, while comparing stocks on the basis of FA, a firm’s solvency in terms of debt-equity/assets ratio or valuation ratios like P/B and P/E ratios demonstrates bet- ter health of a cyclical or sensitive stock and a firm’s earnings figures like return on asset/equity are more relevant for the defensive sectors. Table 1 describes the major sectors of the Indian economy along with their classification2 and composition 3 in the S&P BSE 500 market. The Finance sector is a combination of private and PSU banks and hence the largest sector under the index. The miscellaneous sector includes those sectors which are either having less than 5 companies, like Tourism, or sectors having diversified businesses. We now describe the two-step SPO model which optimizes the portfolio on the basis of FR sector wise.

3.3 Sector Portfolio Optimization Model (SPO) The SPO model is a two-step PO model (Sharma and Mehra , 2017) that first optimizes the muliti-objective programs to select stocks sector-wise on the basis of following four fixed FRs: 2https://www.bseindia.com/sensex/IndicesWatch Sector.aspx?iname=BSE500&index Code=17 3https://www.valueresearchonline.com/stocks/selector/indices/46/bse-500-index/

8 Table 1: Sector composition in S&P BSE 500

S.No. Sectors Classification % weight

1 Finance Cyclic 16.2 2 Auto(Transport Equipments) Cyclic 6.0 3 Capital Goods (CG) Cyclic 9.0 4 Metals Cyclic 5.0 5 Housing Cyclic 7.2 6 Chemical & Petrochemical Cyclic 6.8 7 Information Technology (IT) Sensitive 3.8 8 Communication Sensitive 2.0 9 Energy (Oil & Gas) Sensitive 5.4 10 FMCG Defensive 8.8 11 HealthCare Defensive 10.0 12 Power Defensive 1.0 13 Consumer Durables (CD) Defensive 3.0 14 Agriculture Defensive 2.0 15 Textile Defensive 2.4 16 Media & Publishing Defensive 3.2 17 Transport services Defensive 1.2 18 Miscellaneous Mixed 7.0

1. Return on Assets (ROA) from the Profitability Ratio 2. Current Ratio (Liq) from the Liquidity Ratio 3. Debt-asset (Debt) Ratio from the Solvency Ratio 4. P/E Ratio (P/E) from the Valuation Ratio The selection of above ratios in each category is due to their popular usage in financial analysis. The optimal stocks found from the first step are then pooled together to generate a final optimal portfolio combining their performance in terms of the mean return and FRs. Step 1: Solving a multi-objective program for each sector In Step 1, s optimization models are solved, separately for s sectors, dominating the bench- mark portfolio in SSD rule in constraints while keeping FRs in the objective functions. Let r r nr be the total number of assets in the rth sector; r =1,...,s and z =(zi ; i =1,...,nr) be the corresponding weights vector, then the models are multi-objective programs, named as (SP 1)r; r =1, . . . , s, defined as follows:

nr nr r r r r (SP 1)r Max (ROA)i zi , Max (Liq)i zi , i=1 i=1 nr nr P r r P r r Min (Debt)i zi , Min (P/E)i zi i=1 i=1 subjectP to P r r + r r + r r r E(η Rz) E(η Y ) ; η = y1,...,yT − r≤ − 0 zi 0.3; i =1,...,nr ≤nr ≤ r zi =1, i=1 P 9 r r r where y1,...,yT are the T realizations of Y , which is a random return of the sectoral r r r r benchmark index and (ROA)i , (Liq)i , (Debt)i , and (P/E)i are respectively the profitability, liquidity, solvency, and valuation ratios of the ith asset from the rth sector. Constraints zr 0.3 impose a 30% upper bound on the investment in each asset in every sector. i ≤ The s multi-objective programs (SP 1)r; r =1,...,s are converted into respective single objective programs (WSP 1)r; r =1,...,s using the weighted-sum approach, given as:

nr nr r r r r r r (WSP 1)r Max w (ROA) z w (Debt) z + 1 i i − 2 i i i=1 i=1 X nr Xnr r r r r r r w3 (Liq)i zi w4 (P/E)i zi i=1 − i=1 X X subject to r + r r + r r r E(η Rzr ) E(η Y ) ; η = y1,...,yT −r ≤ − 0 z 0.3; i =1,...,nr ≤ i ≤ nr r zi =1. i=1 X r The weights wj (j = 1, 2, 3, 4) in the programs (WSP 1)r; r = 1,...,s are obtained by analysing the effect of the individual FR on the Sharpe ratio, higher the effect of particular FR on Sharpe ratio, more weight (or importance) is given to that FR. Step 2: Combining stocks together Step 2 of the SPO model combines optimal portfolio weights (zr)∗; r =1,...,s derived from s r ∗ programs (WSP 1)r; (r =1,...,s) to generate final optimal portfolio. Let mr = i (z ) > { | i 0, 1 i nr be the set comprising of filtered stocks from step 1, then the final optimal portfolio≤ ≤ is obtained} by solving the following optimization problem:

s T r ∗ r (SP 2) Max rjtpt +(zj ) zj r=1 j∈mr t=1   subjectP toP P + + E(η Rz) E(η Y ) ; η = y1,...,yT − r ≤ − 0 zj 0.3; j mr, r =1,...,s ≤ s ≤ ∈ r zj =1, r=1 j∈mr P P where Y is the random return of the benchmark index and y1,y2,...,yT are the T discrete point realizations of Y . Step 2 ensures that the stocks having higher weights from Step 1 get higher allocation in the final optimal portfolio while dominating the benchmark in SSD criteria when combined together. The SPO model incorporates the sector-wise use of FRs in the portfolio construction and has several advantages over the classical SSD model. However, it ignores the fact that the ratios that are found to be critical for one sector may not be so for another sector. This is what formed the basis and primarily led to the following proposed strategy.

10 4 Proposed Strategy: An extension to SPO

4.1 Finding the Relevant ratios for each sector using PCA We aim to improve the SPO model by carefully selecting the critical FRs in its Step 1 for each sector. To meet the objective, we propose to use Principal Component Analysis (PCA), a non-parametric statistical tool to select critical FRs separately for each sector. PCA is the buttress of new age data analysis, devised by Pearson (1901) as an analogue of principal axis theorem in mechanics and later developed and termed by Hotelling (1933). PCA has been profusely used in several disciplines, from biomedical sciences to computer vision. The central idea of PCA is to compress a large data set of inter-related variables into a smaller set of variables while retaining maximum variation as in the original variables. PCA results into a new set of factors, called principal components which are uncorrelated and the first few components cumulatively explain most of the variation (maximum information) as in the original variables. PCA on financial ratios was first applied by Pinches et al. (1973) with an objective to reduce the number of ratios used for predicting market crashes, bond ratings, and company failures. PCA orthogonally transforms a data set of k random variables, denoted as the random vector X = X1 X2 Xk into a smaller set of linearly uncorrelated variables, called ··· principal components. Intuitively, PCA can be visualized as fitting a k dimensional ellipsoid   to the data, axes of which form the principal components. The longest axis of the ellipsoid is indicative of the direction along which the data points have maximum variation (maximum information). If an axis of the ellipsoid is small, then the variation along it is equally small and thus on deleting the axis (and the corresponding principal component) one does not loose out on a lot of information. The aim is thus to determine the axes of the ellipsoid. In this paper, the data set of k variables corresponds to the k financial ratios. If the ith ratio is denoted by X with m independent observations, then X = X1, X2, Xk i ··· becomes a matrix of order m k, X = [Xij]. ×   Steps for determining the Principal Components:

1. Scaling of Data: The mean row of the data matrix X as x = x1 xk of order 1 m ··· 1 k, where xj = Xij j =1, . . . , k. The mean matrix is then computed as × m i=1 ∀ P X = 1T x,

where 1 denotes a row vector of ones of order m. Subtracting X from the data matrix X gives the scaled data matrix, Y = X X − 2. Calculating the Covariance Matrix: The covariance matrix is a k k matrix and 1 × is given by C = Y T Y . m 1 − 3. Computing the principal components: To determine the axes, we compute the eigenvalues and eigenvectors of the covariance matrix C. The eigenvectors of the covari- ance matrix represent the directions of the axes accumulating most of the variation.

11 Table 2: Financial Ratios used in the study

Ratio Label Ratio Name Description

A. Profitability Ratios (larger value is preferable)

NPM Net Profit Margin Net profit/Total revenue ROA Return on Assets Net Income/Total assets CPTI Cash Profit Ratio Cash Profit/Total income ROE Return on Equity Net income/Average Shareholders’ equity B. Liquidity Ratios (larger value is preferable)

QR Quick Ratio Liquid Assets/Current Liabilities CR Current Ratio Current Assets/Current Liabilities CCL Cash Ratio Cash & Cash equivalents/Current Liabilities C. Solvency Ratios (smaller value is preferable)

DER Debt-Equity Ratio Total debt/Shareholders’ equity TLTTA TOL/TNW Ratio Total liabilities/Total tangible assets D. Valuation Ratios (smaller value is preferable)

PER P/E Ratio Share price/Earnings per share (EPS) PBR P/B Ratio Market price per share/Book value per share

Arranging the eigenvectors in descending order of their corresponding eigenvalues gives the principal components (PCs). The eigenvalues depicts the variance accounted by the corresponding PC (eigenvector).

4. Deriving the loadings matrix: After computing the principal components as eigen- vectors, we compute the loadings matrix, comprising of component loadings (or simply loadings) defined as

Component Loadings= √eigenvalue eigenvector × The component loadings denote the covariances (or correlations) between the original variables (here, ratios) and the principal components.

Sum of the squared values of the loadings gives the communality of the variable (ratio) with respect to the component solution, which denotes the variance in the original variable accounted for by the component solution. The extraction of the dominant ratios from PCA can be done in two ways, one on the basis of component loadings and two, on the basis of their communalties as discussed in detail in following section.

4.2 Proposed strategies For empirical purpose in the present study, PCA is applied to a set of 11 financial ratios separately for each sector over a period of 10 years from April 2004 to March 2014. The

12 details of these 11 ratios is given in Table 2. We propose to fetch four components that explain atleast 80% of the total ratio variances. These components together are referred to as the component solution. Though there is no thumb rule for extracting the significant original variables from the PCA solution, but several authors (for example, see Yap et al. 2013) have considered the loads4 from the loadings matrix as a measure of extraction. We adopt the following two methods for extracting the dominant ratios from PCs, ex- plained as follows:

(A) On the basis of component loadings From each principal component, the financial ratio with the highest value (absolute) of component loading is extracted. This gives us the dominant ratios for each sector. These ratios may come from any of the category of the financial ratios, PR, LR, SR, and VR. However, we impose a bound on the extraction of maximum of two ratios from each category.

(B) On the basis of communalities In this extraction method, we fetch the financial ratio having the maximum commu- nality from each category, viz., PR, LR, SR and VR. In this way, we have four critical ratios that represent all four categories for each sector.

After the selection of ratios for each sector, the two-step SPO model is employed to obtain final optimal portfolios. The portfolio selection strategy using extraction method (A) and extraction method (B), respectively are labelled as PCA-SPO(A) and PCA-SPO(B). The purpose of adopting the two extraction methods is to understand the importance and impact of inclusion of critical ratios representing all category of ratios on portfolio performance. To examine the relevance of applying PCA sector-wise, we also apply PCA on the data when all assets are taken together to extract four dominant ratios. Four components that explain atleast 80% of the total ratio variances are extracted. We then apply the nominal SSD model maximizing the weighted sum of the return function and the functions corresponding to the extracted FRs to obtain the optimal portfolio. We named this model as F-SSD. Let us denote the extracted ratios from PCA for the F-SSD model as R1, R2, R3 and R4, then for 0 α 1, the F-SSD model is given as follows: ≤ ≤

(F-SSD) Max α(F R)z + (1 α)E(Rz) subject to − + + E(ξ Rz) E(ξ Y ) ξ R − ≤ − ∀ ∈ z Z, ∈ where Y denotes a random return of the benchmark index, and

N N N N

(F R)z = w1 (R1)jzj w2 (R2)jzj w3 (R3)jzj w4 (R4)jzj. ± j=1 ± j=1 ± j=1 ± j=1 X X X X 4Loads gives the information of the amount of variance of the original variable(ratio), the principal component accounts for

13 The weights w1,w2,w3, and w4 are taken as the proportion of variance the components (PCs) account for, where the sign (“+” for maximization and “ ” for minimization) indicates the effect of that ratio on the return. − For the current framework, we set α =0.5 giving equal importance to returns and FRs. The F-SSD model incorporates FRs in the SSD framework when all stocks are considered together and does not focus on sector-wise bifurcation. Along with the F-SSD model, we also solve nominal SSD model which can be easily obtained from the F-SSD model for α = 0. The nominal SSD model maximizes only the return function of the portfolio and thus the assets are not chosen on the basis of their fundamental performance.

5 Empirical Analysis

5.1 Historical Data and sample period We consider the Indian stock market S&P BSE 500 as our sample data wherein we study following six sectors in the present empirical analysis.

(A) Cyclic/Sensitive Sectors

1 Energy: India is the 3rd largest consumer of energy globally. With the prolif- erating demand, the energy sector is growing robustly and thereby making this sector conducive for investment. 2 Finance: With present growth rate at 8.5% p.a., the finance sector is intrinsically strong and operationally diversified. Mutual Funds Investment and insurance industries are crucial components of this sector, contributing to its expanding growth. 3 Information Technology (IT): The IT industry has played an important role in placing India on the international map. With an export of $70 billion, it is one of the top industry sectors in the country.

(B) Non-Cyclic Sectors

1 FMCG: The FMCG sector is the fourth largest sector in India, with household and personal care contributing half of its sales. Changing lifestyles, easy accessi- bility and increased awareness are prime factors driving its growth, especially in rural India. The sector has seen a revenue growth of 2.1% in the last decade. 2 HealthCare: India’s healthcare sector is the 3rd largest in the world in terms of volume and 14th largest in terms of value. It is highly fragmented and competitive, holding approximately 6-7% of the market share with the leading player. 3 Consumer Durables (CD): The CD sector with a size of $10.9 billion in 2019 is one of the key sectors for India to achieve its self sufficiency or the Make- in-India objective. With changing consumer demands towards convenience and automation, the sector is flourishing and inviting investment opportunities.

14 These sectors together represent approximately 64% of the whole market of S&P BSE 500. Bombay Stock Exchange (BSE) is the oldest stock exchange in Asia and the 10th largest in the world by market capitalization as on March, 2020. On February 19, 2013, BSE partnered with S&P DOW JONES INDICES and got renamed as S&P BSE “*” (* indicates various defined sets like SENSEX, 100, etc). The S&P BSE 500 comprises of 500 companies covering all major industry sectors of the Indian economy and depicts nearly 93% of BSE’s total market capitalization. We collect the following data for stocks listed under the above mentioned 6 sectors of S&P BSE 500 from CMIE Prowess IQ Database (https://prowessiq.cmie.com/):

1. The weekly adjusted closing price from April 2014-March 2020.

2. Quarterly data of all the ratios listed in Table 2 from April 2004-March 2019, giving a total of 40 data points for each asset and each ratio. Thus, for sector r with number of assets nr, the sample data is a matrix with (nr 40) rows and 11 columns (for ratios) for proposed strategies and a matrix with 115× 40 rows and 11 columns in case of F-SSD. × c − c pjt pjt−1 The weekly return for j-th stock in t-th week is calculated as rjt = c ; pjt−1 c c j = 1,...,N, t = 1,...,T, where pjt and pjt−1 are respectively, the closing prices in the t-th and (t 1)-th week. We use R software with SYMPHONY solver on Windows 64 bits Intel(R) Core(TM)− i3-6006U CPU @2.0 GHz processor to solve all optimization models. Stocks whose price data is missing for more than 2 quarters and/or whose ratios data is missing for more than 4 quarters are deleted from the study. The total number of assets henceforth in the present study are left to 115 whose sector-wise composition is as follows:

Energy-14, Finance-32, Information Technology-12, • FMCG-23, Healthcare-27, Consumer Durables-7. • To analyze the financial benefits of the proposed strategies, we follow a rolling window scheme of 4 weeks over the time horizon of total 5 years from April 2014 to March 2020 with in-sample period consisting of 52 weeks. To make the study robust with respect to the investment period, we vary the out-of-sample period over 3 months (OSP3), 6 months (OSP6), 9 months (OSP9), 12 months (OSP12), and 24 months (OSP24) while keeping in-sample period fixed as 52 weeks. By sliding the in-sample period by 4 weeks, we get a total of 51 out-of-sample windows each for OSP3, OSP6 and OSP9, and 49 and 37 windows respectively, for OSP12 and OSP24. The performance of the optimal portfolios obtained from the proposed strategies, PCA- SPO(A), PCA-SPO(B), and the models SPO, SSD, F-SSD along with the benchmark index S&P 500 BSE are compared on the basis of mean return, three risk measures: downside deviation (DD), VaR and CVaR values for 95% and 97% levels of confidence, and the four performance measures: the Sharpe, the Sortino, the Rachev, and the STARR ratios. We report the values of the Rachev and the STARR ratios for 95% and 97% levels of confidence.

15 Table 3: Component loadings matrix for CD Sector

Ratios PC1 PC2 PC3 PC4 Communality Liquidity Ratios QR 0.8885 -0.2441 -0.1309 -0.1369 0.8849 CR 0.9775 -0.0172 -0.1393 -0.0261 0.9759 CCL 0.5454 0.3073 0.3860 -0.6576 0.9734 Profitability Ratios NPM 0.4540 0.8279 -0.0078 0.2826 0.9716 ROA 0.2508 0.9159 0.0049 0.2801 0.9802 CPTI -0.2493 0.8533 0.2938 -0.0169 0.8769 ROE -0.5865 0.7787 0.1791 -0.0373 0.9839 Solvency Ratios DER -0.7159 -0.5552 0.2339 0.0966 0.8848 TLTTA -0.9343 -0.2002 0.2488 -0.0675 0.9794 Valuation Ratios PER -0.6339 0.1966 -0.6654 -0.1718 0.9127 PBR -0.5358 0.6158 -0.3746 -0.3327 0.9173 Cumm. Variance 0.4356 0.7792 0.8705 0.9401

5.2 In sample outcomes Recall that four components that explain atleast 80% of the total ratio variances are extracted for the proposed strategies and as well as for the F-SSD model. Moreover, ratios having the highest component load are selected from each principal component in case of PCA-SPO(A) and F-SSD whereas ratios having maximum communality from each category are selected for PCA-SPO(B). Tables 3 and 4 respectively, illustrate the values of component loading for CD and En- ergy sectors. The four extracted components (PC1-PC4) are referred to as the component solution. We can note that four principal components that are extracted for the sectors CD and Energy together explain respectively, about 94.01% and 90.23% of the total ratio vari- ances. The last column denotes the communalities of the ratios accounted by the component solution. The sign of the loading is indicative of the direction of relationship and is thus ignored while fetching the ratios from the components. From Table 3, we observe that the ratios CR, ROA, PER and CCL are found to be the critical ratios (highlighted in bold) having respective loading as 0.9775, 0.9158, -0.6654, and -0.6576 for PCA-SPO(A) strategy. And for the strategy PCA-SPO(B), the critical ratios become CR, ROE, TLTTA and PBR (bold & italicised) having respective communalities as 0.9759, 0.9839, 0.9794, and 0.9173. Similar readings are for the Energy sector from Table 4. Tables 5 and 6 respectively, explain the ratios selected for both the proposed strategies PCA-SPO(A) and PCA-SPO(B) corresponding to all sectors. Last columns of the Tables describe the cumulative variance accounted by the component solution for each sector. Table 7 displays the values of component loadings when PCA is applied on all assets at once. The dominant ratios are obtained as CR, DER, PBR and CPTI having respectively,

16 Table 4: Component loadings matrix for Energy Sector

Ratios PC1 PC2 PC3 PC4 Communality Liquidity Ratios QR -0.9267 0.0443 -0.3293 -0.0159 0.9695 CR -0.7734 0.3095 -0.4831 -0.0592 0.9309 CCL -0.9271 0.1509 -0.3199 0.0512 0.9873 Profitability Ratios NPM -0.7117 0.3077 0.1476 -0.4258 0.8043 ROA -0.8114 0.2349 0.4572 -0.1343 0.9406 CPTI -0.8958 -0.0007 0.0191 0.2529 0.8669 ROE -0.1725 0.4629 0.7953 -0.0688 0.8813 Solvency Ratios DER -0.2436 0.6389 0.2140 0.6076 0.8826 TLTTA 0.5648 0.6838 -0.1598 0.1275 0.8284 Valuation Ratios PER -0.3858 -0.8310 0.0843 0.2609 0.9147 PBR -0.4787 -0.7479 0.3557 0.0705 0.9200 Cumm. Variance 0.4609 0.6982 0.8357 0.9023

Table 5: Critical Ratios for each sector for the strategy PCA-SPO(A)

Sectors Ratio1 Ratio2 Ratio3 Ratio4 Cummulative Variance of component solution Energy CCL PER ROE DER 90.23% Finance ROE CCL PER TLTTA 93.99% FMCG TLTTA NPM PER DER 94.69% CD CR ROA PER CCL 94.01% IT CR PBR ROE ROA 90.81% HealthCare CCL DER PBR PER 90.04%

Table 6: Critical Ratios for each sector for the strategy PCA-SPO(B)

Sectors LR PR SR VR Cummulative Variance of component solution Energy CCL ROA DER PBR 90.23% Finance CCL ROE DER PER 93.99% FMCG CCL ROE DER PBR 94.69% CD CR ROE TLTTA PBR 94.01% IT CR ROA DER PBR 90.81% HealthCare QR NPM DER PER 90.04%

17 Table 7: Factor loadings matrix for ALL assets

Ratios PC1 PC2 PC3 PC4 Liquidity Ratios QR 0.85450 0.4003 -0.1891 0.1594 CR 0.8561 0.3577 -0.1747 0.2099 CCL 0.8149 0.3868 -0.2745 0.0765 Profitability Ratios NPM 0.8301 -0.4167 -0.0330 -0.1462 ROA 0.6691 -0.6311 0.1598 0.0912 CPTI 0.5896 0.2340 -0.4255 -0.5011 ROE 0.6559 -0.5815 0.1344 -0.0923 Solvency Ratios DER -0.3616 0.7535 -0.1806 -0.2608 TLTTA -0.4178 -0.2654 -0.6889 0.1017 Valuation Ratios PER -0.2069 -0.5821 -0.3829 -0.4210 PBR -0.2629 -0.3390 -0.7060 0.3833 Proportion of Variance 0.4054 0.2268 0.1381 0.0697

0.8561, 0.7535, -0.7060 and -0.5011 loads. The last row represents the proportion of variance each principal component accounts for and are taken as the weights corresponding to each ratio in the F-SSD model. We now analyze the financial benefits of the proposed strategies PCA-SPO(A) and PCA- SPO(B) in comparison to the three models, SPO, F-SSD and the nominal SSD along with the benchmark index S&P BSE 500.

5.3 Out-of-sample analysis Table 8 records the performance of the portfolios derived from PCA-SPO(A), PCA-SPO(B), SPO, SSD, F-SSD, along with the benchmark index for the out-of-sample period of length 3 months. Following observations are drawn from Table 8:

1. The portfolio from PCA-SPO(A) outperforms all portfolios in terms of mean return, Sharpe ratio, and the STARR ratio.

2. The portfolio from PCA-SPO(B) outperforms all other portfolios in terms of all risk measures, the and the Rachev ratios whereas it takes the second best position in terms of the Sharpe and STARR ratios after PCA-SPO(A).

3. The F-SSD model turned out to be the most risky as it suffers with the highest values of DD, CVaR0.95, CVaR0.97, VaR0.95, and VaR0.97 in comparison to other portfolios, which hints that analysing all stocks together on the basis of FRs may mislead an investor. Moreover, not using FRs at all in the selection process can also account for high risk as SSD model stands the second most risky model.

18 Table 8: [OSP3] Out-of-sample performance of portfolios for 3 months

Portfolios BM SSD SPO PCA- PCA- F-SSD SPO(A) SPO(B) Mean Return 0.00289 0.00384 0.00364 0.00471 0.00375 0.00232 Sharpe Ratio 0.33423 0.29883 0.42849 0.51429 0.47407 0.13494 Sortino Ratio 0.79622 0.66021 0.96805 1.59761 1.60163 0.23782 DD 0.00204 0.00389 0.00245 0.00215 0.00155 0.00442 VaR0.95 0.00511 0.01015 0.00639 0.00535 0.00373 0.01087 VaR0.97 0.00538 0.01113 0.00648 0.00598 0.00542 0.01211 CVaR0.95 0.00649 0.01104 0.00807 0.00633 0.00497 0.01250 CVaR0.97 0.00718 0.01149 0.00891 0.00681 0.00558 0.01380 Rachev0.95 1.74422 1.86926 1.80834 3.02685 3.24614 1.03467 Rachev0.97 1.64136 1.91427 1.72559 2.92070 3.22673 0.95072 STARR0.95 0.25027 0.23263 0.29389 0.54263 0.49950 0.08409 STARR0.97 0.22622 0.22352 0.26619 0.50438 0.44489 0.07617

4. None of the proposed strategies, PCA-SPO(A) and PCA-SPO(B) generated worst results in any of the performance measures considered in the study.

Table 9 represents the performance for the out-of-sample period of length 6 months, and based on it, the observations are summarized as follows:

1. The portfolio from PCA-SPO(B) generated the highest value for Sharpe ratio while second highest values for all other ratios maintaining its consistent performance from OSP3 (Table 8) to OSP6 (Table 9).

2. Unlike in the case of the OSP3 (Table 8), the SPO model in this case of OSP6 performed well by achieving best values in CVaR, Sortino, Rachev, and STARR ratios.

3. The SSD model and PCA-SPO(A) respectively achieved the highest and the second highest mean return values whereas portfolios from F-SSD again suffer with the highest risk in terms of DD, VaR0.97, VaR0.97, and CVaR0.95. 4. None of the proposed strategies, PCA-SPO(A) and PCA-SPO(B) generated worst results in any of the performance measures in this case as well. However, the portfolios from PCA-SPO(B) performed better than PCA-SPO(A) in all performance measures reflecting the advantage of including ratios from each category.

Tables, 10, 11, and 12 exhibit the performance for the out-of-sample periods of length 9, 12, and 24 months, respectively. The observations drawn from Tables 10, 11, and 12 are as follows:

1. The portfolios from PCA-SPO(B) performed the best in all performance measures except in mean return where it achieved the 4th, 3rd, and 3rd best positions for OSP9, OSP12, and OSP24, respectively. Though the portfolios from PCA-SPO(A) achieved

19 Table 9: [OSP6] Out-of-sample performance of portfolios for 6 months

Portfolios BM SSD SPO PCA- PCA- F-SSD SPO(A) SPO(B) Mean Return 0.00275 0.00418 0.00398 0.00409 0.00399 0.00297 Sharpe Ratio 0.43705 0.62313 0.75335 0.64677 0.78711 0.34781 Sortino Ratio 1.21629 2.05672 4.53686 2.06079 3.36718 0.95052 DD 0.00122 0.00142 0.00060 0.00137 0.00081 0.00180 VaR0.95 0.00384 0.00354 0.00193 0.00291 0.00110 0.00528 VaR0.97 0.00399 0.00511 0.00212 0.00541 0.00216 0.00554 CVaR0.95 0.00424 0.00475 0.00216 0.00494 0.00281 0.00548 CVaR0.97 0.00444 0.00536 0.00228 0.00596 0.00367 0.00558 Rachev0.95 1.96383 2.79789 4.95216 2.20647 3.98339 2.34976 Rachev0.97 1.91441 2.60821 4.76096 1.83976 3.08719 2.37903 STARR0.95 0.34997 0.61485 1.26024 0.57152 0.97061 0.31222 STARR0.97 0.33421 0.54488 1.19391 0.47371 0.74316 0.30662

best results in terms of mean return over all time periods, they fail to impress much on other performance measures in comparison to other portfolios. These findings highlight the importance of including ratios from each category in portfolio selection process. However, note that PCA-SPO(A) performs better than the benchmark index in terms of all performance ratios over all out-of-sample periods. 2. The SPO model takes the second best position in the performance analysis, which again reflects the importance of considering ratios representing all categories. However, PCA-SPO(B) performed better than SPO indicating the importance of PCA to pick the critical ratio from each category than simply taking fixed ratios, common for all sectors. 3. The F-SSD model suffered with the highest risk again indicating the penalty of not using the FRs correctly in the portfolio selection process.

To summarize, the proposed strategy PCA-SPO(B) and the SPO model demonstrates the overall best and second best results, respectively. Although the strategy PCA-SPO(A) achieves the highest mean return for all out-sample periods, yet it fails to perform well on other parameters, specifically over the 9, 12 and 24 months out-sample periods. However, PCA-SPO(A) never produces worse outcomes in comparison to other models and always generated better results than the benchmark index, especially in terms of mean return and the performance ratios. We also observe that the model F-SSD suffers with the highest risk all the time. Further, the overall analysis can be concluded as follows:

1. The outperformance of PCA-SPO(B) over PCA-SPO(A) indicates the importance of incorporating the FRs from all category of ratios. 2. The outperformance of PCA-SPO(B) over SPO indicates the importance of selecting the critical ratio from each category in the portfolio selection process.

20 Table 10: [OSP9] Out-of-sample performance of portfolios for 9 months

Portfolios BM SSD SPO PCA- PCA- F-SSD SPO(A) SPO(B) Mean Return 0.00257 0.00393 0.00389 0.00397 0.00386 0.00287 Sharpe Ratio 0.49534 0.63932 1.09308 0.76819 1.10309 0.43468 Sortino Ratio 1.95363 3.49980 8.40785 3.03075 23.0615 1.40397 DD 0.00068 0.00077 0.00031 0.00089 0.00011 0.00150 VaR0.95 0.00130 0.00161 0.00042 0.00229 0.00001 0.00333 VaR0.97 0.00175 0.00324 0.00076 0.00313 0.00020 0.00371 CVaR0.95 0.00219 0.00275 0.00108 0.00303 0.00033 0.00394 CVaR0.97 0.00263 0.00332 0.00142 0.00342 0.00049 0.00426 Rachev0.95 2.90259 4.47394 7.74383 3.48515 25.1111 2.66328 Rachev0.97 2.42776 3.79669 6.19718 3.28529 17.2959 2.57277 STARR0.95 0.60393 0.97485 2.44450 0.89622 7.89681 0.40979 STARR0.97 0.50289 0.80748 1.85920 0.79869 5.31826 0.37901

Table 11: [OSP12] Out-of-sample performance of portfolios for 12 months

Portfolios BM SSD SPO PCA- PCA- F-SSD SPO(A) SPO(B) Mean Return 0.00269 0.00393 0.00371 0.00428 0.00384 0.00325 Sharpe Ratio 0.59128 0.73011 1.21828 1.03479 1.32548 0.66745 Sortino Ratio 3.02070 8.10413 15.7811 5.55522 60.3137 3.61616 DD 0.00048 0.00033 0.00016 0.00055 0.00004 0.00055 VaR0.95 0.00081 0.00093 0.00090 0.00119 0.00000 0.00138 VaR0.97 0.00206 0.00097 0.00036 0.00132 0.00080 0.00174 CVaR0.95 0.00222 0.00135 0.00037 0.00229 0.00011 0.00238 CVaR0.97 0.00237 0.00172 0.00109 0.00325 0.00030 0.00302 Rachev0.95 2.92644 7.92592 20.4954 4.13973 70.4545 3.99859 Rachev0.97 2.80590 6.31389 7.24311 3.04769 25.8667 3.16390 STARR0.95 0.65040 1.98701 6.65360 1.32618 23.5777 0.84326 STARR0.97 0.60923 1.55957 2.25855 0.93444 8.64495 0.66455

21 Table 12: [OSP24] Out-of-sample performance of portfolios for 24 months

Portfolios BM SSD SPO PCA- PCA- F-SSD SPO(A) SPO(B) Mean Return 0.00298 0.00352 0.00309 0.00448 0.00350 0.00296 Sharpe Ratio 1.01400 0.992231 1.22427 1.35835 1.44604 1.11777 Sortino Ratio 29.1254 11.5211 37.1722 25.6200 113.687 14.5333 DD 0.00006 0.00020 0.00005 0.00013 0.00002 0.00012 VaR0.95 0.00020 0.00053 0.0000 0.00001 0.00000 0.00039 VaR0.97 0.00027 0.00093 0.00033 0.00077 0.00018 0.00061 CVaR0.95 0.00024 0.00073 0.00017 0.00039 0.00009 0.00050 CVaR0.97 0.00027 0.00093 0.00033 0.00077 0.00018 0.00061 Rachev0.95 22.1489 11.0411 38.5455 22.7692 75.2778 10.6300 Rachev0.97 19.2963 8.72040 19.7273 12.0519 39.2778 9.03280 STARR0.95 7.43629 3.14064 11.2643 8.34294 25.2638 3.45884 STARR0.97 6.47232 2.46523 5.63215 4.22564 12.6319 2.83512

3. The outperformance of PCA-SPO(A), PCA-SPO(B) and SPO over the F-SSD model highlights the importance of sector wise selection.

4. Last but not least, the outperformance of PCA-SPO(A), PCA-SPO(B) and SPO from the SSD model underlines the importance of including FRs in portfolio selection pro- cess.

The above points clearly indicate the importance of incorporating the critical ratio from each category when the selection is done sector wise.

6 Conclusion

The importance of FA in portfolio framework can not be overlooked. However, the right choice of ratios and their correct usage is even more crucial, owing to the fact that companies belonging to different industry sectors have different operational structures and business functionalities. The SPO model aptly considers the sector-wise implementation but chooses ratios com- mon for all sectors. This paper is an attempt forward on improvising the SPO model by implementing PCA for the selection of critical ratios for each sector, extracted in two ways, one from the components (strategy named as PCA-SPO(A)) and other from each category of ratios on the basis of communalities (strategy named as PCA-SPO(B)). We conduct the comparative analysis on the proposed strategies, PCA-SPO(A) and PCA-SPO(B), with the models F-SSD, SSD, and SPO along with the benchmark index. The rolling window anal- ysis which is carried over varying periods of out-of-sample lengths showed that the optimal portfolios obtained from PCA-SPO(B) resulted in lower values of risk measures and gener- ated better risk-reward profile in terms of Sortino, Sharpe, Rachev, and STARR ratios in comparison to other models on almost all out-of-sample periods.

22 The outperformance of portfolios from PCA-SPO(B) over SPO suggests the importance of including the dominant ratios than using a fixed set of ratios, common for all sectors. The strategy PCA-SPO(A) generated higher mean return over almost all out-of-sample periods but was unable to attract much on the other performance measures. The outperformance of PCA-SPO(B) and SPO over PCA-SPO(A) highlights the importance of consideration of each category of ratios. The F-SSD model results in the highest risk after SSD, underlining that mere inclusion of fundamental analysis is not sufficient and its correct application sector-wise is even more crucial. To sum up, the proposed strategy PCA-SPO(B) is an apt decision making process that focuses primarily on the extraction of critical ratios by employing PCA sector wise and then generate final portfolio incorporating the two step SPO model. The financial benefits of PCA- SPO(B) over SPO and other models confirms the practical usage of the proposed strategy. Further extensions of the present work are possible by including other important factors like technical indicators, sustainability, socio-economic variables, and transaction costs.

Acknowledgments The authors acknowledge the support of the Institute of Management Technology (IMT), Ghaziabad for the access to Prowess IQ Database for the collection of data. The first author would also like to thank the Ministry of Human Resource and Development (MHRD), New Delhi, India for financial support.

References

Arkan T (2016) The importance of financial ratios in predicting stock price trends: A case study in emerging markets. Zeszyty Naukowe Uniwersytetu Szczeci´nskiego Finanse Rynki Finansowe Ubezpieczenia, 1 (1 ):13– 26.

Beaver WH (1966) Financial ratios as predictors of failure. Journal of Accounting Research 4 :71–111.

Bruni R, Cesarone F, Scozzari A, Tardella F (2012) A new stochastic dominance approach to enhanced index tracking problems. Economic Bulletin 32 (4 ):3460–3470.

Chen KH, Shimerda TA (1981) An empirical analysis of useful financial ratios. Financial Management, 10 (1 ):51–60.

Chowdhury U, Chakravarty S, Hossain M (2018) Short-term financial time series forecasting integrating principal component analysis and independent component analysis with support vector regression. Journal of Computer and Communications, 6 :51–67.

Delen D, Kuzey C, Uyar A (2013) Measuring firm performance using financial ratios: A decision tree approach. Expert Systems with Applications, 40 :3970–3983.

Dentcheva D, Ruszczy´nski A (2003) Optimization with stochastic dominance constraints. SIAM Journal on Optimization 14 (2 ):548–566.

Dentcheva D, Ruszczy´nski A (2006) Portfolio optimization with stochastic dominance constraints. Journal of Banking and Finance 30 (2 ):433–451.

23 Edirisinghe NCP, Zhang X (2007) Generalized dea model of fundamental analysis and its aplication to portfolio optimization. Journal of Banking and Finance, 31 (11 ):3311–3335.

Edirisinghe NCP, Zhang X (2008) Portfolio selection under dea-based relative financial strength indicators: case of us industries. Journal of the Operational Research Society, 59 (6 ):842–856.

Fabian C, Mitra G, Roman D (2011) Processing second order stochastic dominance models using cutting- plane representations. Mathematical Programming 130 :33–57

Fang L, Xiao B, Yu H, You Q (2018) A stable ranking in china’s banking sector: Based on principal component analysis. Physica A: Statistical Mechanics and its Applications, 492:1997–2009.

Fidan Kececi N, Kuzmenko V, Uryasev S (2016) Portfolios dominating indices: Optimization with second- order stochastic dominance constraints vs. minimum and mean variance portfolios. Journal of Risk and Financial Management, 9 (11 ), DOI https://doi.org/10.3390/jrfm9040011.

Fulga C, Dedu S (2012) Mean-risk portfolio optimization with prior pca-based stock selection. In: The International Workshop Stochastic Programming for Implementation and Advanced Applications, vol Stochastic Programming for Implementation and Advanced Applications, DOI textit 10.5200/stoprog. 2012.07. \ { } Galankashi M, Rafiei F, Ghezelbash M (2020) Portfolio selection: a fuzzy-anp approach. Financial Innovation 6 (1 ):1–34.

Ghorbani M, Chong EK (2020) Stock price prediction using principle components. PLOS ONE 15 (3 ):1–20

Giuliani A (2017) The application of principal component analysis to drug discovery and biomedical data. Drug Discovery Today, 22 (7 ):1069–1076.

Goel A, Sharma A (2019) Deviation measure in second-order stochastic dominance with an application to enhanced indexing. International Transactions in Operational Research, pp 1–30.

Graham B (1949) The intelligent investor. New York:Harper & Bothers.

Graham B, Dodd D (1934) Security Analysis. New York: Whittlesey House McGraw-Hill Book Co.

Guran C, Ugurlu U, Ta¸sO (2019) Mean-variance portfolio optimization of energy stocks supported with second order stochastic dominance efficiency. Finance a Uver, 69 :366–383.

Hadar J, Russell W (1969) Rules for ordering uncertain prospects. The American Economic Review, 59 (1 ):25–34.

Hernandez W, M´endez A (2018) Statistics: Growing Data Sets and Growing Demand for Statistics, United Kingdom: IntechOpen., chap Application of Principal Component Analysis to Image Compression.

Hotelling H (1933) Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24 (7 ):498–520.

Huang CY, Chiou CC, Wu TH, Yang SC (2014) An integrated dea-modm methodology for portfolio opti- mization. Operational Research, 15 :115–136.

Joldes C (2018) The selection of stocks using principal component analysis. In: International Conference on Economics, Business and Economic Thought (EBET), DOI 10.31410/itema.2018.454

Jothimani D, Shankar R, Yadav S (2017) A pca-dea framework for stock selection in indian stock market. Journal of Modelling in Management, 12 :386–403.

24 Konno H, Suzuki Ki (1995) A mean-variance-skewness portfolio optimization model. Journal of Operational Research Society of Japan, 38 :173–187.

Konno H, Shirakawa H, Yamazaki H (1993) A mean-absolute deviation-skewness portfolio optimisation model. Annals of Operations Research, 45 :205–220.

Leshno M, Levy H (2002) Preferred by all and preferred by most decision makers: almost stochastic domi- nance. Management Science 48 (8 ):1074–1085.

Levy H (1992) Stochastic dominance and expected utility: Survey and analysis. Management Science, 38 :555–593.

Liagkouras K, Metaxiotis K, Tsihrintzis G (2020) Incorporating environmental and social considerations into the portfolio optimization process. Annals of Operations Research 12., DOI 10.1007/s10479-020-03554-3.

Liang X, Jiang Y, Liu P (2017) Stochastic multiple-criteria decision making with 2-tuple aspirations: a method based on disappointment stochastic dominance. International Transactions in Operational Re- search, 25 (3 ):913–940.

Lv J, Xiao ZH, Pang LP (1992) An incremental bundle method for portfolio selection problem under second- order stochastic dominance. Numerical Algorithms, 85 :653–681.

Marshall AW, Olkin I (1979) Inequalities: Theory of majorization and its applications. San Diego: Academic Press.

Nadkarni J, Neves RF (2018) Combining neuroevolution and principal component analysis to trade in the financial markets. Expert Systems with Applications, 103 :184–195.

Ocal M, Oral E, Erdis E, Vera G (2007) Industry financial ratios-application of factor analysis in turkish construction industry. Building and Environment, 42 :385–392.

Olufemi A, Fajembola O, Olopete M (2012) Predicting bank failure in nigeria using principal component analysis and d-score model. Research Journal of Finance and Accounting, 3 :159–170.

Pearson K (1901) On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2 :559–572.

Pinches G, Mingo K, Caruthers J (1973) The stability of financial patterns in industrial organizations. Journal of Finance, 28 (3 ):389–396.

Quirk J, Saposnik R (1962) Admissibility and measurable utility functions. Review of Economic Studies, 29 :140–146.

Roman D, Mitra G (2009) Portfolio selection models:a review and new directions. Wilmott Journal, 1 (2 ):69– 85.

Roman D, Darby DK, Mitra G (2006) Portfolio construction based on stochastic dominance and target return distributions. Mathematical Programming, 108 :541–569.

Roman D, Mitra G, Zverovich V (2013) Enhanced indexation based on second order stochastic dominance. European Journal of Operational Research, 228 :273–281.

Samaras G, Matsatsnis N, Zopounidis C (2008) A multi criterion dss for stock evaluation using fundamental analysis. European Journal of Operational Research, 187 (3 ):1380–1401.

Sehgal R, Mehra A (2020) Robust portfolio optimization with second order stochastic dominance constraint. Computers & Industrial Engineering, 144 :106396.

25 Sharma A, Mehra A (2013) Portfolio selection with a minimax measure in safety constraint. Optimization, 62 (11 ):1473–1500.

Sharma A, Mehra A (2015) Extended omega ratio optimization for risk-averse investors. International Trans- actions in Operational Research, 24 :485–506.

Sharma A, Mehra A (2017) Financial analysis based sectoral portfolio optimization under second order stochastic dominance. Annals of Operations Research, 256 :171–197.

Sharma A, Agrawal S, Mehra A (2017) Enhanced indexing for risk averse investors using relaxed second order stochastic dominance. Optimization and Engineering, 18 :407–442.

Shenai P, Xu Z, Zhao Y (2012) Principal Component Analysis - Engineering Applications,, United Kingdom: IntechOpen., chap Applications of Principal Component Analysis in Materials Science.

Silva A, Neves R, Horta N (2014) A hybrid approach to portfolio composition based on fundamental and technical indicators. Expert Systems with Applications, 42 (4 ):2036–2048.

Singh A, Dharmaraja S (2017) Optimal portfolio trading subject to stochastic dominance constraints un- der second order auto regressive price dynamics. International Transactions in Operational Research, 27 (3 ):1771–1803.

Tan P, Koh H, Low L (1997) Stability of financial ratios: A study listed companies in singapore. Asian Review of Accounting, 5 (1 ):9–39.

Tarczy´nski W, Tarczy´nska-Luniewska M (2017) Application of the leading sector identification method in the portfolio analysis. Acta Universitatis Lodziensis, Folia Oeconomica, 2 :185–200.

Wang J, Wang J (2015) Forecasting stock market indexes using principle component analysis and stochastic time effective neural networks. Neurocomputing, 156 :68–78.

Whitmore G (1970) Third-degree stochastic dominance. The American Economic Review, 60 :457–459.

Xidonas P, Mavrotas G, Psarrasa J (2009) A multi-criteria methodology for equity selection using financial analysis. Computers and Operations Research, 26 :3187–3203.

Yap B, Mohamad Z, Chong K (2013) The application of principal component analysis in the selection of industry specific financial ratios. British Journal of Economics, Management & Trade, 3 (3 ):242–252.

Yu H, Chen R, Zhang G (2014) A svm stock selection model within pca. Procedia Computer Science, 31 :406– 412.

Yu L, Wang S, Lai K (2009) Multi-attribute portfolio selection with genetic optimization algorithms. INFOR: Information Systems and Operational Research, 47 :23–30.

Zhai Q, Ye T, Huang M, Feng S, Li H (2020) Whale optimization algorithm for multiconstraint second-order stochastic dominance portfolio optimization. Computational Intelligence and Neuroscience, pp 1–19.

Zhao S, Lu Q, Han L, Liu Y, Hu F (2015) A mean-cvar-skewness portfolio optimization model based on asymmetric laplace distribution. Annals of Operations Research, 226 (1 ):727–739.

Zhong X, Enke D (2017) Forecasting daily stock market return using dimensionality reduction. Expert Sys- tems with Applications, 67 :126–139.

26