IMPROVISED GENETIC ALGORITHM FOR FUZZY OVERLAPPING COMMUNITY DETECTION IN SOCIAL NETWORKS

1HARISH KUMAR SHAKYA, 2KULDEEP SINGH, 3BHASKAR BISWAS

1,2,3Indian Institute of Technology, Varanasi (UP), India

Abstract: Community structure identification is an important task in social network analysis. Social interactions take place within a social context, and communities are a fundamental form of such contexts. Social network analysis is an application of web mining, which is in turn an application of data mining. A social network is a structure made up of a set of social actors, such as persons or organizations, a set of pairwise ties, and other social interactions between the actors. Community detection in social networks is currently a very active area of research. In this paper, we use an improvised genetic algorithm for community detection in social networks, combining roulette wheel selection with the square quadratic knapsack problem. We ran the experiments on several datasets, including Zachary's karate club [31], American college football [39] and the Dolphin social network [32]. All of them are verified and well-known datasets in social network analysis research. The experimental results show an improvement in the convergence rate of the proposed algorithm and indicate that the discovered communities are of high quality.

Index Terms: community detection, genetic algorithm, roulette wheel selection

I. INTRODUCTION

Social network analysis is a branch of data mining that involves finding structure or patterns among a set of individuals, groups and organizations. A social network represents such a society as a graph, with the individuals as the vertices and the relationships among them as the edges [1]. Certain individuals of a social network are said to belong to a community if they have a large number of interconnections. Conversely, two distinct communities have comparatively few connections between them. In general terms, community structure is defined as a grouping of vertices such that the number of edges among the individuals within each group is greater than the number of edges connecting the individuals of that group to the rest of the network. The process of finding the community structure of a given social network is called community detection [2]. Community structure reveals important patterns that may stay hidden under ordinary analysis, and thus helps us to better understand many processes and phenomena of social networks and communities. It also helps when someone builds an application on top of the social network and its communities.

The validity of the methods used for community detection in social networks is based on the "goodness" of the communities found, which is evaluated with the help of a quality function. The problem of identifying communities in a given social network has thus been turned into an optimization problem involving a quality function first defined by Newman and Girvan [3]. This quality function, called modularity, gives an approximate measure of the validity of the community structure formed. A partition is considered better the more the number of links inside each module exceeds the number expected in the random case. So, the best partition of the given social network is the one for which the modularity has its maximum value. Optimizing the modularity of a community structure is a challenging task, since the number of possible partitions grows at least exponentially with the size of the network. It has been proved that optimizing the modularity of a community structure for a given social network is an NP-complete problem [4], so no deterministic algorithm is known that finds the exact optimum of modularity in polynomial time. However, a good approximate solution can be found in polynomial time using various techniques, such as greedy agglomeration [5][6], simulated annealing [7][8][9], extremal optimization [10] and spectral division [11].

Community detection in social networks can be done with the help of a genetic algorithm [12]. The basic structure of a genetic algorithm involves three steps: selection, crossover and mutation [13]. The selection operation takes a certain fraction of individuals as they are and copies them to the next generation. This selection is made for the best community partitions, in keeping with biological genetics, where a certain fraction of the best or fittest individuals is assumed to survive. The crossover operation is like reproduction in biology: it mixes the genes of two individuals. In community detection, two individuals are crossed over to form a new individual in the next generation, which has some of the genes of one parent and some of the other. The mutation operation is analogous to biological mutation, in which some individuals adapt to form new species with changed genetic material. In community detection using a genetic algorithm, we modify the genes or characteristics of some percentage of the individuals to form new individuals in the next generation.

Three types of partitions are possible for communities: possibilistic partitions, fuzzy partitions, and crisp/hard partitions. For fuzzy community partitions, fuzzy c-means clustering [14] is used.

Proceedings of 43rd The IRES International Conference, Bangkok, Thailand, 7th July 2016, ISBN: 978-93-86083-53-1

II. LITERATURE REVIEW

Community detection is an important research topic in the field of complex networks. Genetic algorithms have been used as an effective optimization technique to solve this problem. The earliest method was the minimum-cut method. Hierarchical clustering came next, followed by the Girvan-Newman algorithm, which was further optimized using modularity maximization. The algorithms in detail are:

• Minimum-Cut Method [15]: In this method the number of partitions is predetermined and the network is then divided. The division is made such that the communities are of approximately the same size and the number of intergroup edges is minimized. This method is less than ideal as it finds only a fixed number of communities.
• Hierarchical Clustering [16]: In hierarchical clustering, a similarity measure is defined that quantifies a topological type of similarity between nodes. Measures used include cosine similarity, the Jaccard index and the Hamming distance between rows of the adjacency matrix. Similar nodes are grouped into one community. Two methods are used for this grouping: single linkage clustering and complete linkage clustering. In the former, two groups are different iff all pairs of nodes in different groups have similarity less than a given value. In the latter, all pairs of nodes within a group have similarity greater than a threshold.
• Girvan-Newman Algorithm: Edges that lie between two communities are identified and removed. Identification is performed with a betweenness measure, in which a number is assigned to each edge. This number is large if the edge lies between many pairs of nodes. This algorithm gives quality results but is slow, with a time complexity of O(m^2 n) for a graph with n vertices and m edges. The steps of this algorithm are centrality computation, removal of the edge with the largest centrality, recalculation of centrality, and iteration of the cycle from step 2.
• Modularity Maximization [17]: This is the most widely used method. The modularity function measures the quality of the community structure. In the modularity maximization method, we search over all possible divisions of the network and obtain the division with high modularity. But this exhaustive search is intractable, so approximate solutions are used. These approximate optimization methods include greedy algorithms, simulated annealing and spectral optimization. The first was proposed by Newman; this algorithm tends to form large communities. In simulated annealing, a probabilistic procedure is followed to approximate the global optimum of a function. Modularity optimization fails to detect clusters smaller than a particular size, called the resolution limit.

In fuzzy community detection algorithms, we define a soft membership vector, or belonging vector, for each node. It quantifies the strength of the node's association with each community. The following research has been carried out in this field:

• Nepusz used the simulated annealing [18] method. He converted the problem into a non-linear constrained optimization problem.
• Zhang used a spectral clustering [19] framework and proposed an algorithm for the effective division of a social network into communities. He used Fuzzy C-Means clustering [20] (FCM) to obtain the soft assignment. Users specify an upper bound on the number of communities. If this upper bound is k, then only the top k-1 eigenvectors are computed. Both accuracy and time complexity depend on the value of k.
• In 2007, Newman and Leicht used mixture models [21] and provided an appropriate fuzzy community detection method. This was possible only due to the probabilistic nature of the algorithm. Here the number of communities is the same as the number of mixture models, and this number needs to be specified in advance.

Research is also being carried out on fuzzy community detection using evolutionary algorithms. The following models were developed in the past:

• Modularity-based Model: In 2004, Newman and Girvan devised an evaluation criterion called modularity, denoted by Q. This criterion takes into account that the number of edges within a community is maximised and the number of inter-community edges is minimised. The higher the value of modularity, the better the solution. However, this approach has several drawbacks [22]. Firstly, optimizing Q has been categorised as an NP-hard problem. Secondly, large Q values do not always indicate a better community partition, as random networks with no community structure can also have high Q values. Thirdly and most importantly, Q has a resolution limitation, i.e. communities with sizes smaller than a threshold value are not detected.
• Multi-resolution Model [23]: This model was introduced to overcome the resolution limitation problem of the modularity model. Pizzuti, in 2008, proposed a genetic algorithm for community detection using a multi-resolution model. In it a community score is evaluated, taking into consideration that the number of edges within a community is maximised and the number of inter-community edges is minimised.
• General Model: For the best community partition, dense links should exist within a community and sparse links between two communities. Pizzuti proposed an algorithm called MOGANet, which used a fast elitist non-dominated sorting genetic algorithm and defined two parameters, Community Score (CS) and Community Fitness (CF). Butul and Kaya improved MOGANet by using meta-heuristics [24]. They used an enhanced firefly algorithm [25] followed by the harmony search algorithm [26] and chaotic local search. Zhang and Li proposed a decomposition-based method [27].
• Signed Model: Signed networks are networks in which the vertices have friendly or hostile relations with each other, depending on the sign assigned to them. Gong, in 2014, proposed a PSO framework which included two parameters, Signed Ratio Association (SRA) and Signed Ratio Cut (SRC). In 2013, Amelio and Pizzuti put forward a model using the NSGA2 framework.
• Overlapping Model [28]: Here a node that connects multiple communities with similar strength is considered to be possibly fuzzy. For example, if a node i has l links to both community A and community B, then i must be a fuzzy node.
• Dynamic Model: The dynamic model takes evolving networks into consideration. It can be used in case some of the nodes or edges disappear. Dynamic networks help in predicting change tendency, and community detection is therefore challenging in dynamic networks. In 2010, Folino and Pizzuti used the Temporal Smoothness Framework, i.e. they ignored changes in the community for a short period of time. A bi-objective optimization problem was proposed to solve community detection in dynamic networks.

Normalized Mutual Information [29] (NMI) is a similarity index used in information theory. Given two partitions A and B of a network, NMI(A, B) gives the similarity between the two partitions. If NMI(A, B) = 1 the partitions are the same. If NMI(A, B) = 0 the partitions are completely different.

III. PROPOSED WORK

In this paper, we carried out different types of experiments on the genetic algorithm and checked the performance of the modified GA. We modified the algorithm without changing its internal architecture.

Input: the datasets in the form of adjacency matrices (Table 2) and the other input parameters given in Table 1.

Parameter   Value   Description
m           1.7     Used in determining the membership of each node
cp          0.1     Percentage of individuals selected directly
npc         10      Number of individuals with a given number of partitions
pm          1.0     Mutation percentage
pc          0.9     Cross-over percentage
Occmax      10      Number of occurrences of max generation with termination condition
            10^-5   Termination condition
tmax        100     Number of iterations
cmin        2       Minimum number of partitions of social network
cmax        10      Maximum number of partitions of social network

Table 1: Values of different parameters

Output: partition and the cover matrix (U).
Terminal condition: return the best individuals.

Dataset     Symbol   Vertices   Edges
Karate      K        34         78
Dolphin     D        62         159
PolBooks    P        105        441
Football    F        115        613
Jazz        J        198        2742
Sawmill     S        36         62
LesMis      L        77         254
Words       W        112        425
Metabolic   M        453        2025

Table 2: Description of the datasets used

We propose the following two changes to GAFCD [30] to improve the final modularity value and NMI value of the fuzzy communities detected:

• While calculating the modularity value (Q-value), we calculated the contribution of each community separately, while also maintaining the combined Q-value of each individual of the population. The Q value is given by the trace of a c*c matrix, where c is the number of communities. This matrix is given by U*B*U'. So, for all of the c communities present in this matrix, we stored the diagonal values in a vector called Q per community.
• In the fuzzy crossover function of GAFCD, after roulette wheel selection is applied to determine the optimal number of communities in the crossover child, the individual communities are selected at random from the union of the communities of the two parents. Instead of a random selection in this step, we used the Q per community vector calculated above to select the individual communities from the union, applying roulette wheel selection in this step as well.

Experiment analysis

The experiment was conducted on a Microsoft Windows 10 Home Single Language 64-bit operating system using the MATLAB 11 programming platform, with an Intel(R) Core-i5 1.70 GHz processor and 8.0 GB RAM.

Datasets description

In this experiment, we used a number of well-known datasets in the form of adjacency matrices. The datasets employed are described below.

1) KARATE: This dataset comes from Zachary's study of a karate club network. The network consists of the 34 members of a karate club as nodes and 78 connections among the members, representing friendships in the club, which was observed over a period of two years [31].

2) DOLPHIN: A network of bottlenose dolphins living in Doubtful Sound, New Zealand, which is also used for evaluating communities. The network was divided into two groups depending on the association patterns of the dolphins [32].

3) POLBOOKS: A network of books about recent US politics sold by the online bookseller Amazon.com. Edges between books represent frequent co-purchasing of books by the same buyers. The network was compiled by V. Krebs and is unpublished, but can be found on Krebs' web site [33].

4) FOOTBALL: A network of American college football games, in which nodes are teams and edges denote games played between them [39].

5) JAZZ: This is the collaboration network between jazz musicians. Each node is a jazz musician and an edge denotes that two musicians have played together in a band [34].

6) SAWMILL: This is a communication network within a small enterprise. All employees were asked to indicate the frequency with which they discussed work matters with each of their colleagues on a 5-point scale ranging from less than once a week to several times a day. Two employees were linked in the communication network if they rated their contact as three or more. We do not know whether both employees had to rate their tie in this way or whether at least one employee had to indicate a strength of three or more [35].

7) LESMIS: Co-appearance network of characters in the novel Les Misérables [36].

8) WORDS: Adjacency network of common adjectives and nouns in the novel David Copperfield by Charles Dickens [37].

9) METABOLIC: KEGG metabolic pathways can be realized as a network. Substrate or product compounds are considered as nodes and genes are treated as edges [38].

IV. EXPERIMENTAL RESULT & ANALYSIS

We compare our modified algorithm, MGAFCD, with GAFCD, MSFCM and GALS on the 9 real-world data sets described in Table 2. The Metabolic network is an undirected, weighted graph, but it has 15 loops or self-connections (none of the algorithms here can handle these loops). Here, we simply remove these loops to make the Metabolic network a simple graph. The Karate and LesMis datasets are weighted and undirected, while all the other data sets are undirected and unweighted. The different steps involved in SGA are:

• Initialization: Before evolution, a population of individuals is randomly initialized.
• Fitness Evaluation: In every iteration, the competitiveness of the individuals is first evaluated on the basis of a quality function, and a fitness score is assigned to each individual by this quality function. m = 1.7, npc = 10. For each number of communities from cmin to cmax, 10 partitions are generated and each is taken as a single individual. Population size = npc*(cmax-cmin+1), which is 90 for the values in Table 1.
• Survival of the Fittest: Individuals are selected for crossover and mutation with pre-defined probabilities pc and pm respectively. pc = 0.9, pm = 1.0, cp = 0.1.
• Evolution: The selection process guarantees that an individual with a higher fitness score will be chosen with a higher probability.
• Iteration: After a new generation is produced, SGA terminates and returns the best individual of the current generation if some stopping conditions are satisfied. Number of iterations = 100.
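As an illustration of the fitness-proportionate choice named in the Evolution step, and of the informed selection of communities during crossover proposed in Section III, the following Python sketch implements roulette wheel selection. It is a minimal sketch under stated assumptions and not the MATLAB implementation used in the experiments: the function names (population_size, roulette_wheel_select, select_communities), the shifting of negative scores, and the drawing of communities without replacement are all hypothetical choices made for illustration.

```python
import random
from typing import List, Optional, Sequence


def population_size(npc: int, cmin: int, cmax: int) -> int:
    """Population size = npc * (cmax - cmin + 1), as stated in the text."""
    return npc * (cmax - cmin + 1)


def roulette_wheel_select(scores: Sequence[float],
                          rng: Optional[random.Random] = None) -> int:
    """Return an index chosen with probability proportional to its score.

    Assumption: if any score is negative, all scores are shifted so that the
    smallest becomes 0; if every weight is 0 the draw is uniform.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    rng = rng if rng is not None else random.Random()
    lowest = min(scores)
    weights = [s - lowest for s in scores] if lowest < 0 else list(scores)
    total = sum(weights)
    if total <= 0:
        return rng.randrange(len(weights))
    threshold = rng.random() * total
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    # Guard against floating-point round-off: last index with positive weight.
    return max(i for i, w in enumerate(weights) if w > 0)


def select_communities(q_per_community: Sequence[float], k: int,
                       rng: Optional[random.Random] = None) -> List[int]:
    """Pick k distinct community indices, favouring a high per-community Q.

    Assumption: communities are drawn without replacement from the union of
    the parents' communities, which is represented here only by its Q vector.
    """
    if not 0 <= k <= len(q_per_community):
        raise ValueError("k must be between 0 and the number of communities")
    remaining = list(range(len(q_per_community)))
    chosen: List[int] = []
    for _ in range(k):
        pick = roulette_wheel_select([q_per_community[i] for i in remaining], rng)
        chosen.append(remaining.pop(pick))
    return chosen
```

With the parameter values of Table 1, population_size(10, 2, 10) gives 90 individuals. Passing a seeded random.Random instance as rng makes a run reproducible, which may help when comparing informed and random selection on the same data.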


Figure 1: Dolphin, Q=0.5285, c=5
Figure 2: Karate, Q=0.4449, c=4
Figure 3: Jazz, Q=0.4452, c=4
Figure 4: Metabolic, Q=0.4447, c=9

Figures 1 to 4 show the fuzzy community partitions obtained for four of the datasets. In these figures, the relative sizes of the communities are shown.

• Figure 1 represents the partition of the Dolphin dataset. It forms 5 communities with 12, 20, 9, 16 and 5 nodes respectively. This partition gives a Q value of 0.5285.
• Figure 2 represents the partition of the Karate dataset. It forms 4 communities with 5, 6, 11 and 12 nodes respectively. This partition gives a Q value of 0.4449.
• Figure 3 represents the partition of the Jazz dataset. It forms 4 communities with 53.4, 62, 21.6 and 61 nodes respectively. This partition gives a Q value of 0.4452.
• Figure 4 represents the partition of the Metabolic dataset. It forms 9 communities with 36.75, 60, 44, 11, 74, 107.25, 7, 93 and 20 nodes respectively. This partition gives a Q value of 0.4447.
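The fractional community sizes reported for Jazz and Metabolic are what one would expect if the size of a fuzzy community is taken as the sum of the membership degrees of all nodes in it, and the per-community Q values of Section III can be read off the diagonal of U*B*U'. The Python sketch below illustrates both under assumptions that the text does not spell out: U is a c x n cover matrix whose columns sum to 1, B is the standard Newman modularity matrix B_ij = A_ij - k_i*k_j/(2m), and the diagonal is normalized by 2m. The function names and the toy graph are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def modularity_matrix(adj: Matrix) -> Matrix:
    """B[i][j] = A[i][j] - k_i * k_j / (2m) for an undirected graph."""
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    if two_m == 0:
        raise ValueError("graph has no edges")
    n = len(adj)
    return [[adj[i][j] - degrees[i] * degrees[j] / two_m for j in range(n)]
            for i in range(n)]


def q_per_community(adj: Matrix, cover: Matrix) -> List[float]:
    """Diagonal of U * B * U' divided by 2m; `cover` is the c x n matrix U."""
    b_matrix = modularity_matrix(adj)
    two_m = sum(sum(row) for row in adj)
    n = len(adj)
    contributions = []
    for u in cover:  # u holds the membership of every node in one community
        bu = [sum(b_matrix[i][j] * u[j] for j in range(n)) for i in range(n)]
        contributions.append(sum(u[i] * bu[i] for i in range(n)) / two_m)
    return contributions


def fuzzy_sizes(cover: Matrix) -> List[float]:
    """Size of each community as the sum of its membership degrees."""
    return [sum(u) for u in cover]


# Two triangles joined by the edge 2-3 (a made-up toy graph, not a dataset).
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
ADJ = [[0.0] * 6 for _ in range(6)]
for src, dst in EDGES:
    ADJ[src][dst] = ADJ[dst][src] = 1.0

CRISP = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
FUZZY = [[1, 1, 0.5, 0, 0, 0], [0, 0, 0.5, 1, 1, 1]]  # node 2 is shared

q_crisp = q_per_community(ADJ, CRISP)  # one contribution per community
q_total = sum(q_crisp)                 # the trace: 5/14, about 0.357
sizes = fuzzy_sizes(FUZZY)             # [2.5, 3.5]
```

Because every column of U sums to 1, the fuzzy sizes always add up to the number of nodes, which matches the figures above: the Jazz sizes add up to 198 and the Metabolic sizes to 453.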

Measure  Algo.    K       D       P       F       J       S       L       W       M
Qmean    MSFCM    0.4129  0.3963  0.4596  0.5266  0.398   0.3279  0.4897  0.0052  0.2588
Qstd     MSFCM    0.0001  0.0043  0.0009  0.0008  0.02    0.0001  0.0108  0.0013  0.0118
Qmean    GAFCD    0.4449  0.5285  0.5272  0.6046  0.4452  0.5501  0.5667  0.3107  0.4261
Qstd     GAFCD    0       0.0001  0       0       0       0       0       0.0009  0.0014
Qmean    GALS     0.4449  0.5282  0.5272  0.6045  0.4448  0.5501  0.5313  0.3094  0.4153
Qstd     GALS     0.0004  0       0.0003  0.0001  0       0.0013  0.002   0.0068  0
Qmean    MGAFCD   0.4449  0.5285  0.5275  0.6046  0.4452  0.5503  0.5667  0.3107  0.4415
Qstd     MGAFCD   0       0       0       0       0       0       0       0.0005  0.005
Qbest    MSFCM    0.4132  0.3991  0.4601  0.5268  0.4078  0.328   0.4971  0.0083  0.2876
c        MSFCM    3       4       3       10      4       5       5       9       7
Qbest    GAFCD    0.4449  0.5285  0.5272  0.6046  0.4452  0.5501  0.5667  0.3126  0.4287
c        GAFCD    4       5       5       10      4       4       6       7       9
Qbest    GALS     0.4449  0.5285  0.5272  0.6046  0.4449  0.5501  0.5439  0.3121  0.428
c        GALS     4       5       5       10      4       4       6       7       18
Qbest    MGAFCD   0.4449  0.5285  0.5275  0.6046  0.4452  0.5503  0.5667  0.3126  0.4447
c        MGAFCD   4       5       5       10      4       4       6       7       9

Table 3: Compared performance of community detection algorithms
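The three modularity statistics reported in Table 3 can be obtained from repeated runs of an algorithm as sketched below in Python. The number of runs and the form of the standard deviation (sample or population) are not stated, so the sample form used here, the function name summarize_runs and the three run values are assumptions made for illustration; only the two Metabolic Qbest entries are taken from Table 3.

```python
import statistics
from typing import Dict, Sequence


def summarize_runs(q_values: Sequence[float]) -> Dict[str, float]:
    """Return Qmean, Qstd and Qbest for the modularity values of several runs."""
    if len(q_values) < 2:
        raise ValueError("need at least two runs to estimate a spread")
    return {
        "Qmean": statistics.mean(q_values),
        "Qstd": statistics.stdev(q_values),  # sample standard deviation (assumed)
        "Qbest": max(q_values),
    }


# Hypothetical modularity values from three runs of one algorithm.
summary = summarize_runs([0.44, 0.45, 0.46])

# Gain of MGAFCD over GAFCD in Qbest on Metabolic, using the Table 3 entries.
metabolic_gain = 0.4447 - 0.4287
```

A Qstd of 0 in the table then simply means that every run returned the same modularity value at the reported precision.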

Table 3 shows the values that we compared between MSFCM, GAFCD, GALS and our algorithm, MGAFCD. It reports the modularity values Qbest, Qstd and Qmean. In this experiment, the modularity values of MGAFCD increased by approximately 0.02 on the Metabolic dataset in comparison to GAFCD (0.016 in Qbest and 0.015 in Qmean), and by smaller amounts on some other datasets such as PolBooks and Sawmill. We also obtained Qstd values lower than or equal to those of MSFCM on all datasets, and lower than or equal to those of GALS on all datasets except Metabolic. For some of the smaller datasets like Dolphin, Qstd decreased (and thus improved) in comparison to GAFCD. But for some of the bigger datasets like Metabolic, this value increased, making the communities found a bit inconsistent, though with better modularity. In datasets like Karate, Dolphin and Football, the partition found is crisp, as it was in the implementation of GAFCD. But for datasets like Jazz and Metabolic, fuzzy communities are observed. GAFCD and MGAFCD have the same number of communities for the Metabolic dataset but different Q values. This is consistent with the fact that we selected the optimal number of communities in a way similar to GAFCD, but improved the selection of the communities that form the next-generation individual. Thus, the result shows the same number of communities but a different modularity value.

CONCLUSION & FUTURE WORK

We have successfully modified the existing GAFCD algorithm. The existing GAFCD algorithm does the following. It makes a fuzzy partition of the network using one-step FCM initialization. It treats each partition as an individual. The modularity value of the partition is used as the objective function to evaluate each partition. These partitions are then sorted according to their modularity values. Then a certain percentage of individuals is directly selected for the next generation. Next, crossover is done, in which two parents are combined. Suppose parent 1 has c1 communities and parent 2 has c2 communities. Then the child can have a number of communities ranging from 2 to c1+c2. The average fitness is calculated over all partitions with a given number of communities, and roulette wheel selection is then done to obtain the optimised number of communities in the child. Then mutation is done, which involves modifying each column of the partition using the qpip solver, assuming that the other columns remain the same.

The modification we made proved effective for a large dataset like Metabolic, as it uses informed selection. It was not as effective for smaller datasets, as random selection and informed selection will select almost the same set of communities. Also, the mutation operator modifies the partition of smaller datasets easily, thus eliminating the need for informed selection. In the case of large datasets, by contrast, the modification increased the modularity values and made a difference. We can further modify the algorithm by including non-mutated partitions in the set as well. In the future, we will put our efforts into making our algorithm workable for large social networks. With the assumption that large social networks are usually sparse graphs, we will attempt to reduce the time cost of computing Qg for a fuzzy partition. Meanwhile, we will work towards a new, equally effective but more efficient algorithm to replace the current mutation operator.

REFERENCES

[1] Wasserman, Stanley; Faust, Katherine (1994). "Social Network Analysis in the Social and Behavioral Sciences". Social Network Analysis: Methods and Applications. Cambridge University Press. pp. 1–27. ISBN 9780521387071.
[2] S. Fortunato, “Community detection in graphs,” Physics Reports, vol. 486, pp. 75–174, 2010.
[3] M. E. J. Newman and M. Girvan, “Finding and evaluating community structure in networks”, Phys. Rev. E 69, 026113, 2004.
[4] U. Brandes, D. Delling, M. Gaertler, R. Goerke, M. Hoefer, Z. Nikoloski and D. Wagner, “Maximizing modularity is hard”, physics/0608255 in www.arxiv.org.
[5] M. E. J. Newman, “Fast algorithm for detecting community structure in networks”, Physical Review E 69, 066133, 2004.

[6] A. Clauset, M. E. J. Newman and C. Moore, “Finding community structure in very large networks”, Phys. Rev. E 70, 066111, 2004.
[7] R. Guimerà, M. Sales-Pardo and L. A. N. Amaral, “Modularity from fluctuations in random graphs and complex networks”, Phys. Rev. E 70, 025101(R), 2004.
[8] R. Guimerà and L. A. N. Amaral, “Functional cartography of complex metabolic networks”, Nature 433, pp. 895–900, 2005.
[9] J. Reichardt and S. Bornholdt, “Statistical mechanics of community detection”, Physical Review E 74, 016110, 2006.
[10] J. Duch and A. Arenas, “Community detection in complex networks using extremal optimization”, Phys. Rev. E 72, 027104, 2005.
[11] M. E. J. Newman, “Modularity and community structure in networks”, Proc. Natl. Acad. Sci. USA 103, pp. 8577–8582, 2006.
[12] Pizzuti, “Community detection in social networks with genetic algorithms,” in Proceedings of the 10th annual conference on Genetic and evolutionary computation. ACM, 2008, pp. 1137–1138.
[13] John Holland, Adaptation in Natural and Artificial Systems (1975), University of Michigan Press, ISBN 0-262-58111-6.
[14] Zhang et al. 2007a. Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A 374, 483–490.
[15] Goemans, Michel X.; Williamson, David P. (1995), "Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming", Journal of the ACM 42 (6): 1115–1145, doi:10.1145/227683.227684
[16] Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome (2009). "14.3.12 Hierarchical clustering". The Elements of Statistical Learning (PDF) (2nd ed.). New York: Springer. pp. 520–528. ISBN 0-387-84857-6. Retrieved 2009-10-20.
[17] Newman, M. E. J. (2006). "Modularity and community structure in networks". Proceedings of the National Academy of Sciences of the United States of America 103 (23): 8577–8582. arXiv:physics/0602124. Bibcode:2006PNAS..103.8577N. doi:10.1073/pnas.0601602103. PMC 1482622. PMID 16723398.
[18] Kirkpatrick, S.; Gelatt Jr, C. D.; Vecchi, M. P. (1983). "Optimization by Simulated Annealing". Science 220 (4598): 671–680. Bibcode:1983Sci...220..671K. doi:10.1126/science.220.4598.671. JSTOR 1690046. PMID 17813860
[19] Arias-Castro, E. and Chen, G. and Lerman, G. (2011), "Spectral clustering based on local linear approximations.", Electronic Journal of Statistics 5: 1537–1587, doi:10.1214/11-ejs651
[20] "Fuzzy Clustering". reference.wolfram.com. Retrieved 2016-04-26.
[21] Dinov, ID. "Expectation Maximization and Mixture Modeling Tutorial". California Digital Library, Statistics Online Computational Resource, Paper EM_MM, http://repositories.cdlib.org/socr/EM_MM, December 9, 2008
[22] Andrea Lancichinetti and Santo Fortunato (2011). "Limits of modularity maximization in community detection". Physical Review E 84: 066122. arXiv:1107.1155. Bibcode:2011PhRvE..84f6122L. doi:10.1103/PhysRevE.84.066122
[23] Ju Xiang, Xin-Guang Hu, Xiao-Yu Zhang, Jun-Feng Fan, Xian-Lin Zeng, Gen-Yi Fu, Ke Deng and Ke Hu (2012). "Multi-resolution modularity methods and their limitations in community detection". European Physical Journal B 85 (10): 1–10. doi:10.1140/epjb/e2012-30301-2
[24] Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Kluwer Academic Publishers. ISBN 0-201-15767-5.
[25] Yang, X. S. (2009). "Firefly algorithms for multimodal optimization". Stochastic Algorithms: Foundations and Applications, SAGA 2009. Lecture Notes in Computer Sciences 5792. pp. 169–178. arXiv:1003.1466
[26] Original Harmony Search: Geem ZW, Kim JH, and Loganathan GV, A New Heuristic Optimization Algorithm: Harmony Search, Simulation, 2001.
[27] Gottlob, Georg; Nicola Leone; Francesco Scarcello (2000). "A comparison of structural CSP decomposition methods". Artificial Intelligence 124 (2): 243–282. doi:10.1016/S0004-3702(00)00078-3
[28] Aliprantis, Charalambos D.; Brown, Donald J.; Burkinshaw, Owen (April 1988). "5 The overlapping generations model (pp. 229–271)". Existence and optimality of competitive equilibria (1990 student ed.). Berlin: Springer-Verlag. pp. xii+284. ISBN 3-540-52866-0. MR 1075992.
[29] Kraskov, Alexander; Stögbauer, Harald; Andrzejak, Ralph G.; Grassberger, Peter (2003). "Hierarchical Clustering Based on Mutual Information". arXiv:q-bio/0311039
[30] Jianhai Su and Timothy C. Havens, Fuzzy Community Detection in Social Networks Using a Genetic Algorithm. 2014 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)
[31] W. W. Zachary, “An information flow model for conflict and fission in small groups,” Journal of Anthropological Research, vol. 33, pp. 452–473, 1977.
[32] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, “The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations,” Behavioral Ecology and Sociobiology, vol. 54, pp. 396–405, 2003.
[33] V. Krebs, “Books about U.S.A. politics.” [Online]. Available: http://www.orgnet.com/
[26] P. M. Gleiser and L. Danon, “Community structure in jazz,” Adv. Complex Systems, pp. 565–573, July 2003.
[34] J. H. Michael and J. G. Massey, “Modeling the communication network in a sawmill,” Forest Products, vol. 47, pp. 25–30, 1997.
[35] D. E. Knuth, The Stanford GraphBase: a platform for combinatorial computing. Addison-Wesley Reading, 1993, vol. 4.
[36] M. E. J. Newman, “Finding community structure in networks using the eigenvectors of matrices,” Phys. Rev. E, vol. 74, p. 036104, Sep 2006. [Online]. Available: http://link.aps.org/doi/10.1103/PhysRevE.74.036104
[37] J. Duch and A. Arenas, “Community detection in complex networks using extremal optimization,” Phys. Rev. E, vol. 72, p. 027104, Aug 2005. [Online]. Available: http://link.aps.org/doi/10.1103/PhysRevE.72.027104
[38] R. Guimerà, L. Danon, A. Díaz-Guilera, F. Giralt, and A. Arenas, “Self-similar community structure in a network of human interactions,” Phys. Rev. E, vol. 68, p. 065103, Dec 2003. [Online]. Available: http://link.aps.org/doi/10.1103/PhysRevE.68.065103
[39] M. Girvan and M. E. J. Newman. Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99:7821–7826, 2002.


