A Survey on Efficient and Scalable Method for Learning Collective
Total Page:16
File Type:pdf, Size:1020Kb
21705 Umesh B.Shingote et al./ Elixir Comp. Sci. & Engg. 67 (2014) 21705-21709 Available online at www.elixirpublishers.com (Elixir International Journal) Computer Science and Engineering Elixir Comp. Sci. & Engg. 67 (2014) 21705-21709 A survey on Efficient and Scalable method for Learning Collective Behavior Umesh B.Shingote* and Setu Chaturvedi Department of Computer Science & Engineering, Technocrats Institute of Technology, Bhopal, Madhya Pradesh, India. ARTICLE INFO ABSTRACT Article history: For the many people in world social networks are playing major role day to day life. Received: 27 January 2014; Depending on the user’s behavior and interaction with each other, the social networking Received in revised form: sites are reshaped. Growing interest and development of social network sites like Facebook, 7 February 2014; Twitter, Flicker, YouTube etc. imposing many research challenges. And hence this is Accepted: 13 February 2014; allowing researchers to do many research studies using data mining concepts. The main challenge of such online social networking websites is to find out the individuals behavior Keywords over social network. Understanding the user’s behavior on social networking websites is Social networks, called as collective behavior. There are many data mining techniques presented to identify Users Behavior, the behavior of individuals. Such methods of collective behavior allows to learn and predict Learning, the users online behavior and based on it assign the appropriate label to actor in network. Prediction of behavior, But the another main problem occurs in such methods is the networks scalability due to k-means variant, which this systems becomes poor in performance and many not be work if the network size Clustering. is too big. To overcome this problem we need to have scalable learning of collective behavior to deal with any size of social networks. Recently one such method presented, in this method an edge-centric clustering technique is presented to extract social network dimensions. With sparse social dimensions, the proposed approach can efficiently handle networks of any size. In this paper we are presenting the detailed discussion on this method. © 2014 Elixir All rights reserved Introduction normally of colossal size, involving hundreds of thousands of Social Media allowing more connectivity and interaction actors. The scale of these networks entails scalable learning of between web users and it encourages contributions and feedback models for collective behavior prediction. from anyone who is a member of any virtual community. Social To address the scalability issue, new method introduced an media will alter new mass cooperative behaviors that unlock the edge-centric clustering scheme to extract sparse social ability of the collective and deliver new ways to enterprise dimensions [1].With sparse social dimensions, the proposed results. Enterprises will use these collective behaviors because approach can efficiently handle networks of millions of actors the link between business worth and social media technologies. while demonstrating a comparable prediction performance to Enterprises will use them to look at a target community and other non-scalable methods. Social media facilitate people of all formulate new ways in which folks will move to realize walks of life to connect to each other. In this paper we are enterprise worth. In gift World Social media facilitate folks of presenting and discussing the all algorithms involved in this all walks of life to attach to every different. We tend to study study with its practical evaluation. however networks in social media will facilitate predict some Collective behavior refers to the behaviors of individuals in kinds of human behavior and individual preference. In social a social networking environment, but it is not simply the media, countless actors in an exceedingly network area unit the aggregation of individual behaviors. In a connected norm. With this large variety of actors, the scale cannot even be environment, individuals’ behaviors tend to be interdependent, command in memory, inflicting significant issue concerning the influenced by the behavior of friends. This leads to behavior quantifiability. As an example, standard content- sharing sites correlation between connected users. Take marketing as an like Del.icio.us, Flickr and YouTube enable users to transfer, tag example: if our friends buy something, there is a better-than- and comment differing types of contents (bookmarks, photos, average chance that we will buy it, too. This behavior videos).Users registered at these sites may become friends, an correlation can also be explained by homophile. The recent addict or follower of others. The prolific and swollen use of boom of social media enables us to study collective behavior on social media has flipped on-line interactions into a significant a a large scale. Here, behaviors include a broad range of actions: part of human expertise. joining a group, connecting to a person, clicking on an ad, In this work, we tend to study however networks in social becoming interested in certain topics, dating people of a certain media will facilitate predict some human behaviors and type, etc. In this work, we attempt to leverage the behavior individual preferences. Above all, given the behavior of some correlation presented in a social network in order to predict people in an exceedingly network, how can we infer the collective behavior in social media. In next section sections we behavior of other individuals in the same social network [1].This will discuss the algorithms those are presented to overcome the study will facilitate higher perceive activity patterns of users in limitations of existing methods by improving the network social media for applications like social advertising and scalability. Examples of Behavior, Joining a sports club, buying recommendation. A social-dimension-based approach has been some products, Becoming interested in a topic voting for a shown effective in addressing the heterogeneity of connections presidential candidate. presented in social media. But the networks in social media are Tele: E-mail addresses: [email protected] © 2014 Elixir All rights reserved 21706 Umesh B.Shingote et al./ Elixir Comp. Sci. & Engg. 67 (2014) 21705-21709 Material and Methodology • BiComponents is very efficient and scalable. There are many methods presented to learn and predict the Disadvantage: collective behavior of users over online social networks. In [1], • BiComponents gives output highly imbalanced communities. author presented all the related works and methods for learning • It fails to extract informative social dimension for collective behavior. Solution for the problem of scalability by classification. using the new data mining methods is provided. Below are some • This technique yields poor performance. of the methods presented previously [1].Classifications with Node Clustering (NodeCluster) networked instances are known as within-network classification, Note that social dimensions allow one actor to be involved or a special case of relational learning [2, 3]. But network tends in multiple affiliations. As a proof of concept, we also examine to present heterogeneous relations, and the Markov assumption the case when each actor is associated with only one affiliation. can only capture the local dependency. In [4, 5], there is model Essentially, we construct social dimensions based on node network connections or class labels based on latent groups. partition. A similar idea has been adopted in a latent group Similar idea is also adopted to differentiate heterogeneous model [21] for efficient inference. To be fair, we adopt k-means relations in a network by extracting social dimensions to clustering to partition nodes of a network into disjoint sets, and represent the potential affiliations of actors in a network [6]. convert the node clustering result into a set of social dimensions. Modularity Maximization (ModMax) Then, SVM is utilized for discriminative learning. For Some researchers presented methods to conduct soft convenience, this method is denoted as NodeCluster. clustering for graphs like modularity maximization [8]. Advantage Probabilistic methods are also developed [9, 10]. Special variant • SVM is used for discriminative learning. of modularity maximization is adopted to extract social Disadvantage dimension. Social dimension corresponds to the top • Each actor to be involved in only one affiliation, yielding in eigenvectors of a modularity matrix. A disadvantage with soft inferior performance than EdgeCluster. clustering such as modularity maximization is that the resultant Edge Clustering (EdgeCluster) social dimensions are dense, posing thorny computational Consequently, it is imperative to develop scalable Methods challenges for the extraction of extraction of social dimensions that can handle large-scale networks efficiently without and discriminative learning. The modularity maximization extensive memory requirements. Next, we elucidate on an edge- requires the computation of the top eigenvectors of a modularity centric clustering scheme to extract sparse social dimensions. matrix which is of size n × n where n is the number of actors in With such a scheme, we can also update the social dimensions a network. When the network scales to millions of actors, the efficiently when new nodes or new edges arrive. In a huge eigenvector computation becomes a daunting task. Though the network, large number