arXiv:1402.2494v2 [q-fin.GN] 28 Aug 2014

Stock portfolio structure of individual investors infers future trading behavior

Ludvig Bohlin∗ and Martin Rosvall
Integrated Science Lab, Umeå University, Sweden
(Dated: July 28, 2014)

Although the understanding of and motivation behind individual trading behavior is an important puzzle in finance, little is known about the connection between an investor's portfolio structure and her trading behavior in practice. In this paper, we investigate the relation between what stocks investors hold, and what stocks they buy, and show that investors with similar portfolio structures to a great extent trade in a similar way. With data from the central register of shareholdings in Sweden, we model the market in a similarity network, by considering investors as nodes, connected with links representing portfolio similarity. From the network, we find investor groups that not only identify different investment strategies, but also represent individual investors trading in a similar way. These findings suggest that the stock portfolios of investors hold meaningful information, which could be used to earn a better understanding of stock market dynamics.

∗ Corresponding author

Introduction

Stock market trading provides opportunities at the cost of risk. For investors, the ultimate goal of trading is to make as much money as possible by acting in such a way that the highest possible profit is realized at minimal risk. Yet how to optimally trade is far from obvious, and many factors influence trading behavior. The traditional approach to represent market trading, from a model perspective, has been assuming that investors act as rational and identical agents (1, 2). However, this assumption has been challenged by empirical evidence, which suggests that other elements are also present in financial markets (3). For example, researchers have suggested models in which the economic decisions of investors also consider the effects of social, cognitive, and emotional factors. These factors and their influence on trading are often studied with data external to the actual trading process and include, for example, proximity (4), social media interactions (5) and web search engine queries (6, 7). In this work, we instead focus on data more directly relevant to the behavioral biases, relating trading patterns and portfolio structure. With data on the real financial trading by individual stock investors from the Swedish stock market, we study the connection between what stocks investors hold, and what stocks they buy.

Trading by individual investors and related behavioral biases have been studied in terms of, for example, local bias (8), overconfidence (9), sensation seeking (10) and the disposition effect (11). Studies show that investors base trading not only on rational information, but are also affected by factors connected to both personal characteristics and associated biases. Various external conditions also give rise to a trading heterogeneity among individual investors, or among groups of investors. The presence of such investor groups with similar trading behavior can be explained by homophily, i.e., the tendency of individuals to behave and bond with others who are similar. Investors potentially trade more similarly if they share certain properties, including, for example, age (12), gender (13) and familiarity (14). However, classifying individual investors into such distinct categories is not straightforward. The classification is in some cases motivated by theoretical considerations, and in other cases by observed patterns in the data. Categories include, for example, fundamentalists and chartists (15), and informed and uninformed traders (16). Other examples of investor categorizations in financial data are derived from trading correlations (17), direct stock trading data (18), and network approaches (19, 20).

Although the understanding of and motivation behind individual trading behavior is an important puzzle in finance, little is known about the connection between investors' portfolio structure and their trading behavior in practice. Studies have found that many individual investors tend to hold poorly diversified portfolios and instead concentrate investments in only a few stocks (21). The difficulty of searching through all available stocks also makes it more likely for individual investors to invest in stocks that attract their attention (22), and these attention-grabbing stocks are typically the ones that investors already hold. This bias indicates that portfolio structure and trading decisions are naturally connected.

In this paper, we explore the connection between portfolio structure and trading behavior in individual investors. With detailed data on stock portfolios from the central register of shareholdings in Sweden, we study the relation between portfolio similarity and trading similarity. Unlike most previous research, we do not analyze the direct trading, but instead focus on long-term changes in share portfolios over time. We aim to examine two main questions: (1) How do investors in the stock market structure their portfolios? And (2) Can we learn about their trading behavior by looking at the investors' portfolio structure?

To answer the questions of individual trading behavior we take three steps: First, we investigate how individual investors hold stocks, and how they trade. Second, we divide investors into groups based on portfolio similarity. This division is done with a network approach, where individual investors are considered to be nodes, and links between investors are constructed according to stock portfolio similarities. To group similar investors, we analyze the network with the community detection algorithm Infomap (23). Third, we analyze the derived groups to investigate the relationship between portfolio structure and trading behavior. This analysis is done by comparing investor trading within groups to investor trading outside the group. In the following section we present the methods and the results, and, in short, we find that the portfolio structure of individual investors holds information on trading behavior, and that investors with similar portfolios, to a great extent, trade in a similar way.

Methods

Data from the central register of Swedish shareholdings

We examined more than 100,000 individual investors who were actively trading in the Swedish stock market from 2009 to 2011. The investors and their stock portfolios were extracted from a dataset with around two million investors. The dataset stems from the central register of shareholdings in Sweden, and covers basically all investors and their holdings in every publicly traded company in Sweden. The dataset was provided by Euroclear Sweden AB, and permission to use the data was given under a special agreement. Data are presented in half-year share register reports between June 30, 2009, and December 30, 2011, with detailed ownership information of investors in each registered company. The reports also included the companies' total share amount and their corresponding stock ISIN (International Securities Identification Number). Additional data, obtained from the Swedish Central Statistics Office (SCB), provided share prices for companies listed on the Stockholm stock exchange. Those data specify share prices at stock exchange closing time, i.e., the price of the latest sold share on the last trading day. If price data are lacking, bid price and then ask price were used instead. In total, the data contain share prices for around 500 listed stocks. The full dataset makes it possible to extract the detailed portfolio of an investor in the Swedish stock market. We have made anonymized and reduced data available online as detailed in Ref. (24). Below, we explain the dataset in more detail, and the restrictions we set on the data in the analysis.

Investors are reported either as legal persons, e.g., corporations and funds, or natural persons, i.e., individual investors. Since we are interested in the trading behavior of individual investors, we considered investors that actually changed their portfolio, and focused on the holdings that investors can manage themselves, namely, direct holdings. Direct holdings are registered in the investor's name, as opposed to nominee holdings, which are registered and managed by an equity manager on behalf of the investor. The direct holdings of all investors are presented in each half-year report, with detailed information about registration type, share amount, and the equity ISIN in which the shares are held. This information makes it possible to find share changes in the portfolios between reports, provided that investors have a traceable identification number. Investors who lack a Swedish identification cannot be reliably tracked in the data over time, and we therefore excluded these investors from the analysis.

To reduce the effects of noise in the data, some conditions for the included stocks were established. First, we only considered stocks of companies that existed for the entire time period. We therefore excluded stocks that were introduced or removed from the market during the time period for any reason. This exclusion was done to enable comparisons between two share reports without changes in the company domain. Furthermore, we also required that the total share amount of a stock must not have changed more than five percent during the time period. This condition was set because larger changes make it hard to distinguish actual active trades of investors from more passive changes in the portfolios directly related to a share amount change, as, for example, in the case of stock splitting. As a consequence of the share amount change criterion, we excluded, for example, the companies H&M and Swedish Match from the analysis. Finally, only listed stocks were considered in the analysis, since these stocks are publicly traded and it is possible to find an explicit price for them. It is also worth noticing the distinction between stocks designated A and B on the Swedish market. A company can be associated with more than one stock, because A and B stocks, and other potential stock classes in a company, must have different ISIN codes. We considered these different stock classes as separate stocks, because classes with less voting power usually are more liquid, and therefore give rise to different trading than the ones with superior voting rights.

In summary, we examined investors on the Swedish stock market who are natural persons, traceable over time, active in trading and primarily registered as shareholders. This selection means that we, for example, excluded investors registered as legal persons and secondary ownership through funds. We required company stocks to be listed and stable, in the sense that they must exist for the entire time period and have a share amount that does not change too drastically over time. With this noise reduction and data cleaning, we were left with 100,161 investors holding capital in 209 different stocks.

Portfolio vectors and trading vectors

We represent investor holdings in normalized portfolio vectors and consider the stock portfolio of an investor as a vector p, where p_i represents the investor's proportion of capital invested in stock i. As an example, we can look at an investor with portfolio vector p in a market with four stocks. If the investor holds shares of total value 20 in stock 1, and shares of total value 80 in stock 3, the portfolio vector can be expressed as p = (0.2, 0, 0.8, 0). Note that the value at a specific index represents the relative amount invested in the corresponding stock, rounded to the nearest hundredth in the analysis for simplicity.

The portfolio vector representation is used to unify investor trading, even though the shareholding data reports do not provide direct trading information. However, the reports do specify detailed half-year snapshots of the investors' portfolios, and these snapshots make it possible to track changes in the portfolios over time. Analogously to portfolio vectors, we therefore construct trading vectors for investors based on the sum of all changes between two dates. In these trading vectors, we considered stocks in which investors bought shares during the time period. We only examined purchases, because correlations between portfolio structure and sold stocks follow trivially since investors can buy any stock but only sell stocks they already hold. To compute the trading vector, p_T, we therefore extracted the positive changes in the portfolio, p_T = p(t_2) − p(t_1), between reports from June 30, 2009, and December 30, 2011, with stock prices from the first date. In this way, all elements in the trading vectors are non-negative. To examine the connection between portfolios and trading, we computed a similarity value between investors' portfolio and trading vectors, based on cosine similarity (25). Accordingly, the similarity of vectors x and y is given by the normalized dot product

$$\mathrm{sim}(x, y) = \frac{\langle x, y \rangle}{\|x\| \cdot \|y\|}.$$

We use this similarity measure for the portfolio and trading vectors because it is simple and well-suited for analyzing investment structure. The similarity value is bounded between 0 and 1, since all portfolio and trading vector elements are non-negative.

Identifying groups of investors with similar portfolio structure

Since single investors hold sparse portfolios and trade to a small extent, we need to categorize similar investors in groups, and examine the overall trading behavior of each group. However, dividing investors into groups with similar portfolios is not a straightforward task, since the number of investors is large and it is difficult to distinguish groups without making assumptions and subjective divisions. One possibility would be to simply group investors with the most similar portfolios, but that approach raises the problem of where to separate the groups, and we run the risk of losing important structural information. So we require a method of dividing investors into groups that accounts for both portfolio similarity and the structural information of the system. These premises can be fulfilled with clustering tools from network theory, and we therefore take a network approach to analyze the data.

Network theory has received increasing interest in finance research in recent years, thanks to its ability to model the organization and structure of large complex systems (26). Network approaches aimed to find structures in finance data include, for example, bank-liability networks (27), stock correlation networks (28) and trading networks (20). In our network approach, we model the data as a network with investors as nodes connected by links according to portfolio similarity. Ideally, we would create links between nodes according to causal connections between investors, but such relationships are difficult to obtain, and it is not even clear what they would be. Instead, we use portfolio similarity as a representation of relationships, and connect investor nodes with weighted and undirected links, according to the similarity value of the investors' portfolio vectors. This representation creates a network of investors with links based on portfolio similarity, and we refer to this network as a similarity network. To account for the similarity values with the most information and also make the analysis more efficient, we only consider links with values greater than or equal to 0.9 in the similarity network.

In the similarity network, investors with at least one common holding will have a similarity value and accordingly be connected by a link. This means that the total number of links in the network can be large for a single investor. To reduce this complexity and make the analysis procedure feasible and more effective, we capitalize on the fact that many investors have identical portfolio vectors and use this to create a reduced network. Investors with equal portfolio structures will share the same links to other investors, which results in a large amount of redundant information. To remove this redundancy and reduce network size, we therefore represent every portfolio structure as a node, instead of having one node for each single investor. This approach reduces both the number of nodes and the number of links in the network. The resulting reduced network becomes an aggregated version of the original network, where links between investors with identical portfolio structure are represented by a self-link. An example of the reduction procedure can be seen in Figure 1.

The goal of constructing the similarity network is to group investors with similar portfolio structure, albeit not necessarily exactly the same portfolio. To identify candidates for such groups, we could perform a random walk between investor nodes in the network, and in each step visit a neighboring node proportional to the link weights. In this approach, a group would be a number of investors where the random walker stays for a relatively long time before moving to other investors. However, we cannot identify unambiguous groups simply by performing such dynamics on the network, and therefore we need an extended method. Fortunately, exactly those dynamics are implemented in an existing community detection algorithm, namely the map equation (23, 29). For network analysis, this algorithm is referred to as Infomap. This algorithm has proven to be one of the most efficient community-detection methods in comparative studies (30, 31). In the case of similarity networks, the algorithm identifies groups of investors with strong similarities in portfolio structure, which is precisely what we are looking for.

Figure 1: Network reduction technique. Example of similarity network reduction from single investor level to portfolio level. (A) Network at individual level with nine investors and three different portfolio structures a, b and c. Investors are connected with links of value 1, s_ab and s_bc. (B) Reduced network at portfolio level, with the three portfolio structures as nodes. Self-links of portfolio nodes represent links between investors with equal portfolio structure. The weight w_i for the self-link of portfolio i, with n_i investors, is calculated according to w_i = n_i(n_i − 1)/2. The link weight w_km between portfolios k and m, with n_k and n_m investors, respectively, is calculated according to w_km = w_mk = n_k n_m s_km, where s_km is the link weight between two investors in each portfolio structure.

To find a group representation that accounts for both portfolio similarity and market structure, we use the Infomap algorithm with the hierarchical clustering option. This option provides a division of nodes, i.e., portfolio structures, into top-level clusters, consecutively divided into smaller subclusters. We are, however, interested in the division of individual investors, and this division can be obtained simply by mapping each portfolio to its corresponding investors. In this way, we get a categorization of investors into groups that represent similar, but not necessarily identical, portfolio structures. The categorization provides an overview of the stock market, and describes how investors in the Swedish stock market structure their portfolios.

Trading similarity of investors with similar portfolios

We want to examine if investors with similar portfolio structures tend to trade more similarly than other investors. To determine if two investors are similar, we consider the categorization of investors into groups and study if investors from the same group, i.e., investors with similar portfolio structures, trade in a more similar way than investors outside the group. Trading comparisons for investors of a specific group are made in a bootstrap procedure in the following way: First, we randomly choose a set of investors from the group and compute their aggregated trading vector. Next, we randomly choose two other sets of investors, one set from the same group and one set created from investors that do not belong to the group. We compute the aggregated trading vectors of the two sets and calculate their trading similarity in relation to the first trading vector. In this way, the similarity values make it possible to compare trades within the group to trades outside the group. The detailed procedure is as follows:

For each group G, repeat N times:
1. Randomly choose a set of investors i1 with set size n from G.
2. Randomly choose another set of investors i2 with set size n from G.
3. Randomly choose a set of investors i3 with set size n outside G.
4. Compute trading vector similarity s_inside = sim(i1, i2) and s_outside = sim(i1, i3).
5. Collect data difference, δ = s_inside − s_outside, and indicator I, where I = 1 if δ > 0 and I = 0 otherwise.

In each iteration, we examine whether trades within a group are more similar than trades outside groups, and we repeat this procedure 1000 times. For each group we search for the investor set size that makes the within-group trades significantly more similar than outside trades in the comparison. If portfolio structure and trading were totally dependent, we would have set size 1 for all groups, since this would imply that we can learn about trading of investors in the same group by looking at the trading of only one single investor in the same group. However, the trading data are not very extensive for single investors, and therefore we need to compare the aggregated trading of a set of investors to obtain useful information. To measure the trading similarity of a group, we search for the set size that is needed for significance, i.e., the number of investors that are needed so that 95% of trading comparisons are larger within the group than outside.

Results and Discussion

Stock portfolio similarity and trading similarity

Stock portfolios of investors differ by orders of magnitude, both when considering the number of shareholdings and the total value. To unify the portfolio structure, we therefore represent investor holdings in normalized portfolio vectors. This representation considers investment distribution and not the magnitude of investments, which means that two portfolio vectors can be similar even if the total value of the portfolios differs. Analogously, the vector representation is also used to unify the investor trading in trading vectors. When we construct portfolio vectors for the 100,161 investors in the data, we find 52,115 different vectors. Interestingly, only 2,652 portfolio vectors are needed to cover 50% of all investors, which shows that a large proportion of investors distribute their capital in a similar way. Many investors have capital invested in only one stock, and consequently hold portfolios that are not diversified at all. When we construct the trading vectors, we find that 32,970 vectors are needed to cover all existing trading strategies during the period. The number of trading vectors is smaller than for portfolio vectors because many investors only invest in one or a few stocks.

Individual investors are more likely to invest in stocks that attract their attention, due to the difficulty of searching among all available stocks. Attention therefore greatly influences individual investor trading decisions (22), and the attention-grabbing stocks are naturally the stocks that investors already hold. The combination of the attention bias and the tendency of people to act similarly to their peers, as in, for example, local bias (4, 8), gives rise to an interesting question. If individual investors hold portfolios concentrated in only a few stocks, have a preference for investing in stocks they already hold, and also tend to act in accordance with similar investors, does this imply that there is a connection between portfolio structure and trading similarity? The portfolio and trading vectors make it possible to evaluate the question and compare investors, and in Figure 2, we show the relationship between portfolio vector similarity and trading vector similarity. The variation in the data is large, but a trend can be seen in the case when all investments are considered; the more similar the portfolio structure, the more similar the trading. The relationship is evident for portfolio similarity values greater than 0.9, which suggests that these similarity values hold important information.

Figure 2: Relationship between portfolio similarity and trading similarity. Trading similarity relative to mean value for trading until December 30, 2011, versus portfolio similarity at start June 30, 2009. The figure shows all pairwise comparisons of investors, with portfolio similarity values on the x-axis and trading similarity values on the y-axis. The opaque lines show trading similarity relative to mean value of all trading similarities, as a function of portfolio similarity, and the shaded transparent areas show the non-parametric 95% confidence intervals. The blue line with circles shows the relationship when all investments of investors are considered, and the red line with x-marks shows the relationship when only investments in new stocks are considered, i.e., investments in stocks that the investors do not already hold. The large variations are results of many zero values for trading similarities, since individual investors do not trade to such a large extent.

The observed relationship between portfolio structure and trading could be explained with homophily, i.e., the tendency of individuals to engage in similar activities to their peers. This tendency can sometimes make it hard to determine from observational data whether a similarity in behavior exists because two individuals are similar, or because one individual's behavior has influenced the other. From the shareholding data alone, it is difficult to determine causal reasons for the observed similarities, but since we are primarily interested in the connection between portfolios and trading similarity, this is not an issue.

Comparisons of single investors result in a large proportion of similarity values that become zero, both in the comparisons of portfolio and trading vectors. This means that many investors neither hold nor trade similar stocks, and therefore the evaluation of single investor comparisons becomes problematic. To overcome this problem and be able to compare investors, as a first approach, we created groups of randomly chosen investors, and compared the group's aggregated portfolio and trading vectors to other groups of equal size. The aggregated portfolio and trading vectors are constructed as the mean investment distributions of all investors in the group. The distributions for portfolio and trading similarities, with group sizes 1, 10 and 100, are shown in Figure 3. First of all, the figure illustrates why we want to group investors, since larger groups decrease the number of similarity comparisons that become zero. The figure also shows why we do not want to form these groups randomly, as larger groups cause similarity values to end up in a narrower interval. This shift is a result of the random group formation and demonstrates that the information in such groups is limited, since portfolio dependencies disappear when investors are chosen randomly. Consequently, both group size and how we aggregate groups are important factors when we examine the relationship between portfolio structure and trading.

Groups of similar investors from similarity network analysis

To find groups and analyze the aggregated trading behavior of investors, we model the shareholding data as a network with investors as nodes connected by links according to portfolio similarity. The network approach creates a similarity network, and we analyze this network with the community-detection algorithm Infomap (23) to identify groups of similar investors. The groups describe how investors in the Swedish stock market structure their portfolios, and the basic properties for the ten largest top-level groups can be seen in Table 1. It is worth noticing that more than two-thirds of all investors are included in the ten largest groups. Despite the almost endless number of ways for individual investors to structure their portfolios, the analysis shows that a few related investment strategies are favored.
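The construction of the reduced similarity network described in the Methods can be illustrated with a short sketch. It is not the authors' implementation: the function names (`portfolio_vector`, `cosine_similarity`, `reduced_network`) and the toy holdings are hypothetical, and the sketch assumes that holdings are given as total value per stock, that identical rounded vectors define one portfolio structure, and that only links with similarity of at least 0.9 are kept. The toy data mirror the nine-investor layout of Figure 1, with two, four and three investors in structures a, b and c.

```python
import math
from collections import Counter
from itertools import combinations


def portfolio_vector(values):
    """Normalize holdings (total value per stock) to proportions of capital,
    rounded to the nearest hundredth as in the text."""
    total = sum(values)
    if total <= 0:
        raise ValueError("portfolio must hold a positive total value")
    return tuple(round(v / total, 2) for v in values)


def cosine_similarity(x, y):
    """Normalized dot product; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm > 0 else 0.0


def reduced_network(investor_vectors, threshold=0.9):
    """Aggregate an investor-level similarity network to portfolio level.

    Returns (counts, self_links, links):
      counts[p]      number of investors n_p with portfolio structure p
      self_links[p]  n_p * (n_p - 1) / 2, links between identical portfolios
      links[(k, m)]  n_k * n_m * s_km, kept only if s_km >= threshold
    """
    counts = Counter(investor_vectors)
    self_links = {p: n * (n - 1) // 2 for p, n in counts.items()}
    links = {}
    for k, m in combinations(counts, 2):
        s_km = cosine_similarity(k, m)
        if s_km >= threshold:
            links[(k, m)] = counts[k] * counts[m] * s_km
    return counts, self_links, links


# Toy data (hypothetical): nine investors, three portfolio structures.
a = portfolio_vector([100, 0, 0])
b = portfolio_vector([40, 10, 0])   # same structure as holdings (80, 20, 0)
c = portfolio_vector([60, 40, 0])
investors = [a] * 2 + [b] * 4 + [c] * 3

counts, self_links, links = reduced_network(investors)
print(self_links[a], self_links[b], self_links[c])  # 1 6 3
for (k, m), weight in links.items():
    print(k, m, round(weight, 3))
```

In this toy case the pair (a, c) has similarity below 0.9 and is dropped, so the reduced network has three nodes, three self-links and two weighted links. The resulting weighted link list is the kind of input a community-detection tool would then partition; the exact input format expected by Infomap is not shown here.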

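The bootstrap comparison of within-group and outside-group trading can be sketched in the same spirit. This is an illustration under stated assumptions rather than the procedure used to produce the reported results: the names `within_group_fraction` and `required_set_size` are hypothetical, trading vectors are assumed to be given as lists of non-negative numbers, the aggregated vector is taken as the element-wise mean, and the two within-group sets are drawn independently, so they may overlap (the text does not say whether they must be disjoint).

```python
import math
import random


def cosine_similarity(x, y):
    """Normalized dot product; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm > 0 else 0.0


def mean_vector(vectors):
    """Aggregated trading vector: element-wise mean over a set of investors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def within_group_fraction(group, outside, set_size, n_iter=1000, rng=None):
    """Fraction of iterations in which s_inside > s_outside (indicator I = 1)."""
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    wins = 0
    for _ in range(n_iter):
        i1 = mean_vector(rng.sample(group, set_size))    # step 1
        i2 = mean_vector(rng.sample(group, set_size))    # step 2 (may overlap i1)
        i3 = mean_vector(rng.sample(outside, set_size))  # step 3
        delta = cosine_similarity(i1, i2) - cosine_similarity(i1, i3)  # steps 4-5
        wins += delta > 0
    return wins / n_iter


def required_set_size(group, outside, level=0.95, n_iter=1000, rng=None):
    """Smallest set size n for which at least `level` of the comparisons
    favor the group; None if no feasible n reaches that level."""
    rng = rng or random.Random(0)
    for n in range(1, min(len(group), len(outside)) + 1):
        if within_group_fraction(group, outside, n, n_iter, rng) >= level:
            return n
    return None


# Toy data (hypothetical): the group only buys stock 1, outsiders only stock 2.
group = [[1.0, 0.0]] * 5
outside = [[0.0, 1.0]] * 5
print(required_set_size(group, outside, n_iter=100))  # 1
```

With real data the fraction would be estimated per group and per set size, and a linear scan over n as above is only one simple way to search for the smallest significant set size.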
Figure 3: Portfolio similarity and trading similarity distributions. Distributions of investor portfolio similarity and trading similarity for groups consisting of 1, 10 and 100 randomly chosen investors. Portfolio similarities are computed for June 30, 2009, and trading similarities are computed from the positive changes in the portfolios until December 30, 2011. The figure shows that a large proportion of investor comparisons are zero in the case of 1-investor comparisons, both in the case of portfolio similarity and trading similarity. For larger groups, the similarity values increase and concentrate to a narrower interval.

The investor groups represent related portfolio structures, and in each group we find some stocks that a large proportion of the investors hold. These top stocks constitute the main connectors between investors in the group. The Ericsson B-stock, which is the stock held by most investors, represents the top stock in the first and largest group. Almost three quarters of the around 25,000 investors in the first group hold shares in Ericsson B. General recommendations on how to invest in the stock market state that diversified portfolios are preferred, but investors still seem to make the choice to hold underdiversified portfolios (32). As a result of this bias, the mean number of stocks held by individual investors is relatively small, which, in turn, makes it possible to identify groups of investors with some specific stock structures in common. When considering portfolio diversity it is worth noticing the possibility that investors also can have capital invested in, for example, diverse funds, but such secondary ownership is not included in the analysis. Individual investors tend to hold only a few different stocks, and this limitation can actually be beneficial, since gathering information on stocks requires resources (33). Individual investors seldom have resources to gather information on more than a few stocks, and informed investors therefore tend to concentrate their portfolios in the stocks in which they hold an informational advantage (34).

Similar portfolio structure infers similar trading

The investor groups make it possible to compare the trading of investors with similar portfolio structures. However, the relationship between portfolio structure and trading behavior is dependent on what stocks investors hold, and therefore the relational effect varies between groups. To investigate this relationship, we use a bootstrap procedure in which we compare within-group trades to outside-group trades and search for the investor set size that makes trades significantly more similar within the group. [...] significance in trading similarity, i.e., the number of investors that are needed so that 95% of the comparisons between aggregated trading vectors are larger within the group than outside. The results are presented in Table 1, and we can see that only one investor is required for significance in group 4, while 43 investors are needed in group 1. The number varies between groups, which means that investors with certain portfolio structures tend to trade more similarly than others. The group differences are illustrated in Figure 4, where the mean trading similarity is shown in relation to mean portfolio similarity, for all investors within the groups. Notably, group 1 has relatively low scores for both portfolio and trading similarity, which can be explained by the fact that the group is large and therefore diverse when it comes to both portfolio structure and trading. Group 4, on the other hand, has a relatively high similarity score for both portfolio and trading similarity. This suggests that the investors within group 4 are more homogeneous than investors in other groups when considering both portfolio structure and trading behavior. Unique to this group is the Saab B-stock, which is held by all investors in the group. It is also interesting to compare the trading behavior of groups 2 and 8, since group 2 has a lower portfolio similarity score, but still a higher trading similarity score than group 8. An explanation for these differences could possibly be found by looking at the top stocks of each group, see Table 1. The three top stocks of group 2 are all in the car industry sector, while the three top stocks in group 8 are from three different sectors: mining industry, telecommunication and technology. Group 2 therefore seems to represent a more homogeneous ownership, and consequently the investors in the group trade more similarly than the investors in the more diverse group 8.

The group trading similarity may arise because investors of the same group base trading decisions on similar information, for example because they share the same information sources, such as web sites or television (35). These information sources are more likely to be similar if investors share common interests, which for group 2 potentially could be cars or the automotive industry. There is also evidence of communication among stock market investors, which suggests that investors exchange information about trading in discussions with their peers (36, 37). Accordingly, social interaction is an influential factor when it comes to stock market trading (38). Therefore, our empirical findings on stock portfolio structure could in principle be used to refine multi-agent based order book models with different types of agents (39). However, more research is needed to bridge the gap in time scales between long term investments and short term trades.

To put the results in the context of previous work, we consider some studies that have examined the joint be-
The significant set size specifies the number havior of individual investors, although not from a net- of investors from the group that is needed for signifi- work perspective. Ref. (40) analyzed household trading 7

and found that trading was highly correlated and persistent. The study observed that individual investors tend to react to the same kind of behavioral biases at, or around, the same time. Such behavioral biases could lead to associated trading for related investors, and an explanation for trading similarity could therefore be that similar investors seek and receive similar information over time, and trade accordingly. Another explanation for the trading similarity among groups relates to investment herding (41), where some investors change their portfolios in the same way as a leading group of investors whom they trust. In the end, it is interesting to consider the fact that the stock portfolio of an investor actually reflects the aggregated result of all past trades done by the investor, and therefore a relationship between portfolio structure and trading already exists.

Conclusion

We show that there is a relationship between stock portfolio structure and trading, namely, that individual investors with similar portfolio structures tend to trade in a similar way. To analyze this relationship, we use real stock market data and a procedure that is threefold. First, we find that comparisons of portfolio and trading similarity for single investors show a large variation, and therefore the data must be analyzed on an aggregated level. Second, we find that the stock market displays a structure among its investors, with groups that represent investors with similar portfolio structures. Third, we find that investors with similar portfolios, to a greater extent, trade in a similar way.

The results show that the stock portfolios of individual investors hold meaningful information, which could be beneficial in the analysis of individual trading behavior. The use of new data sources in economics could improve our understanding of dynamics in financial systems and make it possible to develop models for inferring market reactions. However, even though new and previously unused data can provide important information that relates to market dynamics, the problem of evaluating whether the featured relations are causal or not still persists. Therefore, while future work on the relationship between portfolio and trading includes examining the results from an economic perspective and connecting them to actual market dynamics, the general goal of future work in finance will be to further explore causality when connecting data to dynamics.

Acknowledgements

We are grateful to Euroclear Sweden AB for providing the shareholding data on the Swedish stock market, and to Krister Modin, who helped with advice and interpretation of the data.

Table 1: Properties of the ten largest groups obtained from clustering the similarity network with 100,161 investors. Investors shows the number of investors in the group. Mean stocks reports the mean number of portfolio holdings for the investors in the group. Significant set size states the set size that is needed for trading significance, i.e., the number of investors that are needed so that 95% of the trading comparisons are larger within the group than outside. Top stocks shows the stocks held by most investors in the group and the corresponding proportion of investors in the group that hold the stock.

Group  Investors  Mean stocks  Significant set size  Top stocks
1      25,539     2.0          43                    72% Ericsson B, 44% TeliaSonera, 8% Volvo B
2      12,289     2.1          5                     81% Volvo A, 36% Volvo B, 35% Scania A
3      5,793      2.4          4                     98% Sandvik, 25% Ericsson B, 15% Seco Tools B
4      3,769      1.2          1                     100% Saab B, 8% TeliaSonera, 4% Ericsson B
5      4,589      2.5          21                    98% SEB A, 28% Ericsson B, 22% TeliaSonera
6      3,093      3.8          6                     100% Skanska B, 35% Ericsson B, 32% Fabege
7      3,839      3.6          33                    100% Nordea, 44% Ericsson B, 39% TeliaSonera
8      2,891      3.2          38                    100% Boliden, 43% Ericsson B, 23% TeliaSonera
9      3,056      5.5          20                    91% Handelsb. A, 45% Handelsb. B, 42% Ericsson B
10     2,923      4.4          38                    100% Investor B, 49% Ericsson B, 23% TeliaSonera

Figure 4: Group differences in portfolio similarity and trading similarity. Relation between mean trading similarity and mean portfolio similarity for the investors of the ten largest groups obtained in the analysis of the similarity network. The large and diverse first cluster scores high on neither portfolio similarity nor trading similarity, while group four seems to have the most homogeneous investors, both when considering portfolio similarity and trading similarity. The portfolio similarity values are all greater than 0.5, since the groups are created with portfolio similarity as a condition.
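The bootstrap procedure behind the significant set size in Table 1 can be illustrated with a minimal sketch. It is not the authors' implementation. The use of cosine similarity, the sampling of two disjoint within-group sets per round, the number of rounds, and all function names are assumptions made for illustration. The text only states that aggregated trading vectors are compared within and outside a group, and that 95% of the comparisons must be larger within the group.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity (an assumed measure); 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0


def aggregate(vectors):
    """Element-wise sum of the trading vectors of a set of investors."""
    return [sum(column) for column in zip(*vectors)]


def within_share(group, others, set_size, rounds, rng):
    """Share of rounds in which two within-group sets are more similar
    to each other than a within-group set is to an outside set."""
    wins = 0
    for _ in range(rounds):
        picked = rng.sample(group, 2 * set_size)  # two disjoint within-group sets
        a = aggregate(picked[:set_size])
        b = aggregate(picked[set_size:])
        c = aggregate(rng.sample(others, set_size))  # one outside-group set
        if cosine(a, b) > cosine(a, c):
            wins += 1
    return wins / rounds


def significant_set_size(group, others, level=0.95, rounds=200, seed=0):
    """Smallest set size whose within-group share reaches `level`, else None."""
    rng = random.Random(seed)
    largest = min(len(group) // 2, len(others))
    for set_size in range(1, largest + 1):
        if within_share(group, others, set_size, rounds, rng) >= level:
            return set_size
    return None


# Hypothetical toy data: each vector holds bought volume per stock.
focused = [[3.0, 1.0, 0.0, 0.0]] * 10  # group that only buys stocks 0 and 1
others = [[0.0, 0.0, 2.0, 1.0]] * 10   # everyone else buys stocks 2 and 3

focused_size = significant_set_size(focused, others)  # -> 1
mixed_size = significant_set_size(others, others)     # -> None, no group effect
```

In the toy data the focused group never trades the same stocks as the other investors, so a single investor is already enough, loosely analogous to the set size of 1 reported for group 4. When the "group" is indistinguishable from the rest, no set size reaches the 95% level and the sketch returns None.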

Author contributions

LB and MR derived the research project and wrote the manuscript. LB performed numerical simulations.

References

1. Cars H Hommes. Heterogeneous agent models in economics and finance. Handbook of Computational Economics, 2:1109–1186, 2006.
2. Nicholas Barberis and Richard Thaler. A survey of behavioral finance. Handbook of the Economics of Finance, 1:1053–1128, 2003.
3. Brad M Barber and Terrance Odean. The behavior of individual investors. Available at SSRN 1872211, 2011.
4. Andriy Bodnaruk. Proximity always matters: Local bias when the set of local companies changes. Review of Finance, 13(4):629–656, 2009.
5. Johan Bollen, Huina Mao, and Xiaojun Zeng. Twitter mood predicts the stock market. Journal of Computational Science, 2(1):1–8, 2011.
6. Tobias Preis, Helen Susannah Moat, and H Eugene Stanley. Quantifying trading behavior in financial markets using trends. Scientific Reports, 3, 2013.
7. Helen Susannah Moat, Chester Curme, Adam Avakian, Dror Y Kenett, H Eugene Stanley, and Tobias Preis. Quantifying usage patterns before stock market moves. Scientific Reports, 3, 2013.
8. Mark S Seasholes and Ning Zhu. Individual investors and local bias. The Journal of Finance, 65(5):1987–2010, 2010.
9. Terrance Odean. Volume, volatility, price, and profit when all traders are above average. The Journal of Finance, 53(6):1887–1934, 1998.
10. Mark Grinblatt and Matti Keloharju. Sensation seeking, overconfidence, and trading activity. The Journal of Finance, 64(2):549–578, 2009.
11. Hersh Shefrin and Meir Statman. The disposition to sell winners too early and ride losers too long: Theory and evidence. The Journal of Finance, 40(3):777–790, 1985.
12. George M Korniotis and Alok Kumar. Do older investors make better investment decisions? The Review of Economics and Statistics, 93(1):244–265, 2011.
13. Brad M Barber and Terrance Odean. Boys will be boys: Gender, overconfidence, and common stock investment. The Quarterly Journal of Economics, 116(1):261–292, 2001.
14. Massimo Massa and Andrei Simonov. Hedging, familiarity and portfolio choice. Review of Financial Studies, 19(2):633–685, 2006.
15. Jeffrey A Frankel and Kenneth A Froot. Chartists, fundamentalists, and trading in the foreign exchange market. American Economic Review, 80(2):181–85, May 1990.
16. Raymond da Silva Rosa, Nirmal Saverimuttu, and Terry Walter. Do informed traders win? An analysis of changes in corporate ownership around substantial shareholder notices. International Review of Finance, 5(3-4):113–147, 2005.
17. Giulia Iori, Roberto Reno, Giulia De Masi, and Guido Caldarelli. Trading strategies in the Italian interbank market. Physica A: Statistical Mechanics and its Applications, 376:467–479, 2007.
18. Junjie Wang, Shuigeng Zhou, and Jihong Guan. Characteristics of real futures trading networks. Physica A: Statistical Mechanics and its Applications, 390(2):398–409, 2011.
19. Michele Tumminello, Fabrizio Lillo, Jyrki Piilo, and Rosario N Mantegna. Identification of clusters of investors from their real trading activity in a financial market. New Journal of Physics, 14(1):013041, 2012.
20. Zhi-Qiang Jiang and Wei-Xing Zhou. Complex stock trading network among investors. Physica A: Statistical Mechanics and its Applications, 389(21):4929–4941, 2010.
21. Laurent E Calvet, John Y Campbell, and Paolo Sodini. Down or out: Assessing the welfare costs of household investment mistakes. Technical report, National Bureau of Economic Research, 2006.
22. Brad M Barber and Terrance Odean. All that glitters: The effect of attention and news on the buying behavior of individual and institutional investors. Review of Financial Studies, 21(2):785–818, 2008.
23. Martin Rosvall and Carl T Bergstrom. Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences, 105(4):1118–1123, 2008.
24. Portfolio data. Anonymized identities. Individuals with identical portfolios are aggregated. 2014.
25. Alexander Strehl, Joydeep Ghosh, and Raymond Mooney. Impact of similarity measures on web-page clustering. In Workshop on Artificial Intelligence for Web Search (AAAI 2000), pages 58–64, 2000.
26. Franklin Allen and Ana Babus. Networks in finance. The Network Challenge: Strategy, Profit, and Risk in an Interlinked World, page 367, 2009.
27. Michael Boss, Helmut Elsinger, Martin Summer, and Stefan Thurner. Network topology of the interbank market. Quantitative Finance, 4(6):677–684, 2004.
28. Wei-Qiang Huang, Xin-Tian Zhuang, and Shuang Yao. A network analysis of the Chinese stock market. Physica A: Statistical Mechanics and its Applications, 388(14):2956–2964, 2009.
29. Martin Rosvall, Daniel Axelsson, and Carl T Bergstrom. The map equation. The European Physical Journal Special Topics, 178(1):13–23, 2009.
30. Andrea Lancichinetti and Santo Fortunato. Community detection algorithms: a comparative analysis. Physical Review E, 80(5):056117, 2009.
31. Rodrigo Aldecoa and Ignacio Marín. Exploring the limits of community detection strategies in complex networks. Scientific Reports, 3, 2013.
32. William N Goetzmann and Alok Kumar. Why do individual investors hold under-diversified portfolios? Technical report, Yale School of Management, 2005.
33. Robert C Merton. A simple model of capital market equilibrium with incomplete information. The Journal of Finance, 42(3):483–510, 1987.
34. Zoran Ivkovic, Clemens Sialm, and Scott Weisbenner. Portfolio concentration and the performance of individual investors. Journal of Financial and Quantitative Analysis, 43(3):613–656, 2008.
35. Eytan Bakshy, Itamar Rosenn, Cameron Marlow, and Lada Adamic. The role of social networks in information diffusion. In Proceedings of the 21st International Conference on World Wide Web, pages 519–528. ACM, 2012.
36. Robert J Shiller and John Pound. Survey evidence on diffusion of interest and information among investors. Journal of Economic Behavior & Organization, 12(1):47–66, 1989.
37. Zoran Ivković and Scott Weisbenner. Information diffusion effects in individual investors' common stock purchases: Covet thy neighbors' investment choices. Review of Financial Studies, 20(4):1327–1357, 2007.
38. Harrison Hong, Jeffrey D Kubik, and Jeremy C Stein. Social interaction and stock-market participation. The Journal of Finance, 59(1):137–163, 2004.
39. Tobias Preis, Sebastian Golke, Wolfgang Paul, and Johannes J Schneider. Multi-agent-based order book model of financial markets. EPL (Europhysics Letters), 75(3):510, 2006.
40. Brad M Barber, Terrance Odean, and Ning Zhu. Do retail trades move markets? Review of Financial Studies, 22(1):151–186, 2009.
41. Mark Grinblatt, Sheridan Titman, and Russ Wermers. Momentum investment strategies, portfolio performance, and herding: A study of mutual fund behavior. The American Economic Review, pages 1088–1105, 1995.