Approaches for User Profile Investigationin Orkut Social Network
Total Page:16
File Type:pdf, Size:1020Kb
(IJCSIS) International Journal of Computer Science and Information Security, Vol.6, No. 2, 2009 Approaches for user profile Investigation in Orkut Social Network Rajni Ranjan Singh, Deepak Singh Tomar Maulana Azad National Institute of Technology (MANIT) Bhopal, India. [email protected], [email protected] Abstract- Internet becomes a large & rich repository of approach guesses that the more similar a person is, the more information about us as individually. Any thing form user profile likely they are to be a friend. Similarity is measured by information to friends links the user subscribes to are reflection analysing profile information, mutual friends, and mutual of social interactions as user has in real worlds. Social communities. If we are trying to evaluate the likelihood that networking has created new ways to communicate and share user A is linked to user B, we sum the number of items the information. Social networking websites are being used regularly by millions of people, and it now seems that social networking two users have in common. Similarity between profiles will be an enduring part of everyday life. Social networks such as reflects closeness and interaction between users .We only Orkut, Bebo, MySpace, Flickr, Facebook, Friendster and considered direct friends to the candidates for matching. LinkedIn, have attracted millions of internet user who are Computing the similarity score for individual to all others involved in bogging, participatory book reviewing, personal friends in direct links, and rank the others according to their networking and photo sharing. Social network services are similarity score. We expect some friends to be more similar increasingly being used in legal and criminal investigations. than others. The rest of the paper is organized as follows. In Information posted on sites such as Orkut and Facebook has section 2 backgrounds in which we discuss about Orkut been used by police, probation, and university officials to networks, user profiles, and friends networks In Section 3 we prosecute users of said sites. In some situations, content posted on web social network has been used in court. In the proposed discuss approaches for user profile investingation Section 4 work degree of closeness is identified by link weight approaches illustrate proposed framework for relation identification on the and information matrices are generated and matched on the basis of profile similarity. Section 5 shows Experimental basis of similarity in user profile information. The proposed results, Section 6 discuss Challenges, Section 7 Conclusion technique is useful to investigate a user profile and calculate and Section 8 shows References used in this paper. closeness /interaction between users. II. BACKGROUND Keywords— Social networks, similarity measure, User profile, web communities, link analysis. A. Orkut an Overview Orkut is our topic of interest is a free-access social networking I. INTRODUCTION service owned and operated by Google. The service is Orkut is one of the earlier & most famous web social designed to help users met new friends and maintain existing networks run by google plays an important role to relationships It is one of the most visited websites in India and communicate and share private and public information in web Brazil. environment facilitate bogging (scraping), personal B .Crime over Orkut networking, photo sharing, chatting, private messing, friend search. An interesting part of Orkut SNS is that user can see Now Orkut social N/W are targeted by criminals and terrorist not only others profile information but also others friends to spread wrong information plan blast recently many networks. Recent work has attempted to find of web pages abduction terrorist activity are noticed by cyber police, now communities by performing analysis on their graph structure terrorist use the internet and tools like E-mail, Chat and social [1], Mining Directed Social Network from Message Board [2], N/wing sites to plant terrorist attack. Evaluating Similarity measure in Orkut networks [3]. C. User profile: User profile is an individual user home page Discover behaviour of Turkish people in Orkut[4], trust based consists of mainly. recommendation system.our work focuses on individuals' homepages and the connections between them we can now (1) Social, professional, personal information. use it to characterize relationships between people. Beyond (2) Links to other friends profile called friend list. developing the interface, we quantitatively evaluated the (3) Communities matchmaking approach for all kinds of information about the (4) Photograph of user. user. To predict whether one person is a friend of another & (5) Scrapbook how much closeness both has, we rank all users by their (6) Photo album similarity to that person. Intuitively, our matchmaking 259 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol.6, No. 2, 2009 Fig. 2 Communities TABLE 1 III. APPROACHES FOR USER PROFILE INVESTIGATION SITE FEATURES OF ORKUT SOCIAL NETWORKS A) Identifying connection Site Feature Orkut How two profiles are connected and how much closeness both Profiles Publicly viewable profile has, this approaches is used to detect the connection between Advertisement Yes two or more suspicious profiles means how two criminals are Interface Simply and easy to Understand connected in web environments. Chat Yes, Google chat Users in Web social network visualize as a node and link Search Yes, Google search between users reflect relationship. A user has connection with Customizable No their friends by maintaining friend list consist of links of all Online/offline communication Support multiple language friends profiles. (Orkut scrapbook) Our initial approach to link identification consisted of dividing Friends rating and profile view Yes u can rate your friends friend’s network features into graph features [3]. D. Friends Networks from User Profiles. The friend’s network of Orkut, our topic of study, has two varieties of accounts: users and communities. TABLE 2 USER, COMMUNITY, LINK RELATIONSHIP Start End Link Denotes User User Trust or friendship User Community Readership or Figure 3 Friends/neighbours Subscribership 1. in degree of u: popularity of the user Community User Membership, Posting access, maintainer 2. in degree of v: popularity of the candidate Community Community Obsolete 3. Out degree of u: number of other friends besides the Candidate; saturation of friends list 1). Friendlist: - A user has connection with their friends by 4. Out degree of v: number of existing friends of the maintaining friendlist consist of links of all friends profiles Candidate besides the user; Correlates loosely with reflects user social relation. User explicitly adds friends by Likelihood of a reciprocal link accepting friend’s request. 5. Number of mutual friends w such that u → w w → v 6. “Forward deleted distance“: minimum alternative Distance from u to v in the graph without the edge (u, v) 7. Backward distance from v to u in the graph These were supplemented by interest-based features: 8. Number of mutual interests between u and v 9. Number of interests listed by u 10. Number of interests listed by v 11. Ratio of the number of mutual interests to the number Listed by u 12. Ratio of the number of mutual interests to the number Fig. 1 Friend list Listed by 2) Communities: - Community is a group of user profiles 13-path length: number of links (edges) between u and v in a share common interest. Anyone with an Orkut account can given path. create a community on anything. 14 hop count: number of vertices (users) between u and v in a given path. 1) Investigate user profile findout the strong connection to other profiles Extract friends profiles belongs to same cities, education ,college etc. 260 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol.6, No. 2, 2009 Figure 4: A message chain of messages sent by three individuals. 2) On the basis of profile similarity. 2.1) On the basis of contents uniqueness 2) Investigate the relationship link between two or more 2.2) On the basis of contents similarity profiles (used to find-out connection between suspicious profiles) 2.1) on the basis of contents uniqueness Similarity is measured by analyzing text, links. If we are trying to evaluate the likelihood that user A is linked to user B, we sum the number of items the two users have in common. Items that are unique to a few users are weighted more than commonly occurring items. The weighting scheme we use is the inverse log frequency of their occurrence. For example, if only two people mention an item, then the weight of that item is 1/log(2) or 1.4.[1] W is the closeness/interaction level B) Closeness identification W can be calculated by one of the two methods It is possible with this algorithm to evaluate each shared item type 1) On the basis of communication independently (i.e. links, mailing lists, text) or to combine them 2) On the basis of profile similarity. together into a single likeness score. 1) O n the basis of communication 2.2) on the basis of contents similarity In a social network based upon online communication, the distance between individuals does not mean `geographical distance' because Propose work focus on contents based similarity. each person lives in a virtual world. Instead, distance can be considered `psychological distance' and this can be measured by the IV.PROPOSED WORK: SIMILARITY MEASUREMENT influence" wielded among the members of the network. Consider the BASED ON CONTENTS SIMILARITY situation where an individual p has a great deal of influence on an individual q[2]. Similarity between profiles reflects closeness and In this case, we can consider three types of relationship. interaction between users. Similarity is measured by profile Case1: p is close to q.