On Social-Temporal Group Query with Acquaintance Constraint

On Social-Temporal Group Query with Acquaintance Constraint De-Nian Yang Yi-Ling Chen Wang-Chien Lee Ming-Syan Chen Academia Sinica National Taiwan Univ. The Penn State Univ. National Taiwan Univ. [email protected] [email protected] [email protected] [email protected] ABSTRACT tion, nevertheless, an activity initiator still needs to manually de- Three essential criteria are important for activity planning, includ- cide the suitable time and desirable attendees for activity planning, ing: (1) finding a group of attendees familiar with the initiator, which can be very tedious and time-consuming. Thus, it is desir- (2) ensuring each attendee in the group to have tight social rela- able to provide an efficient activity planning service that automati- tions with most of the members in the group, and (3) selecting an cally suggests the attendees and time for an activity. activity period available for all attendees. Therefore, this paper To support the aforementioned service, we formulate a new query proposes Social-Temporal Group Query to find the activity time problem, named Social-Temporal Group Query (STGQ), which con- and attendees with the minimum total social distance to the initia- siders the available time of candidate attendees and their social re- tor. Moreover, this query incorporates an acquaintance constraint to lationship. Given an activity initiator, we assume that friends on her avoid finding a group with mutually unfamiliar attendees. Efficient social network (i.e., her friends together with friends of friends) are processing of the social-temporal group query is very challenging. candidate attendees. We also assume that the available time slots We show that the problem is NP-hard via a proof and formulate of the candidate attendees have been made available for the system in charge of query processing (e.g., via web collaboration tools the problem with Integer Programming. We then propose two ef- 1 ficient algorithms, SGSelect and STGSelect, which include effec- with the corresponding privacy setting ), and the social relation- tive pruning techniques and employ the idea of pivot time slots to ship between friends is quantitatively captured as social distance substantially reduce the running time, for finding the optimal so- [10, 12, 13], which measures the closeness between friends. An lutions. Experimental results indicate that the proposed algorithms STGQ comes with the following parameters, (1) the activity size, are much more efficient and scalable. In the comparison of solu- denoted by p, to specify the number of expected attendees, (2) the tion quality, we show that STGSelect outperforms the algorithm length of time for the activity, denoted by m, (3) a social radius that represents manual coordination by the initiator. constraint, denoted by a number s, to specify the scope of candidate attendees, and (4) an acquaintance constraint, denoted by a number k, to govern the social relationship between attendees. 1. INTRODUCTION The primary goal is to find a set of activity attendees and a suit- Three essential criteria are important for activity initiation, in- able time which match the specified number of attendees in (1) cluding: (1) finding a group of attendees familiar with the initiator, and the activity length in (2) respectively, such that the total so- (2) ensuring each attendee in the group to have tight social relations cial distance between the initiator and all invited attendees is min- with most of the members in the group, and (3) selecting an activ- imized. Additionally, the social radius constraint in (3) specifies ity period available for all attendees. For example, if a person has that all the candidate attendees are located no more than s edges a given number of complimentary tickets for a movie and would away from the initiator on her social network, while the acquain- like to invite some friends, she usually prefers choosing a set of tance constraint in (4) requires that each attendee can have at most mutually close friends and the time that all of them are available. k other unacquainted attendees.2 As such, by controlling s and k Nowadays, most activities are still initiated manually via phone, e- based on desired social atmosphere, suitable attendees and time are mail, messenger, etc. However, a growing number of systems are returned. Example 1 in the Appendix A shows an illustrative exam- able to gather and make available some information required for ple. Note that, in situations where the time of the planned activity is activity initiation. For example, social networking websites, such pre-determined (e.g., complimentary tickets for a concert or a sport as Facebook [2] and LinkedIn [5], provide the social relations and event in a specified day), the aforementioned STGQ can be simpli- contact information, and web collaboration tools, such as Google fied as a Social Group Query (SGQ) (i.e., without considering the Calendar [3], allow people to share their available time to friends temporal constraint). In this paper, we first examine the processing and co-workers. Even with the availability of the above informa- 1To process STGQ, it is not necessary for a user to share the sched- Permission to make digital or hard copies of all or part of this work for ule to friends. However, we assume that any friend can initiate an personal or classroom use is granted without fee provided that copies are STGQ, and the query processing system can look up the available not made or distributed for profit or commercial advantage and that copies time of the user, just like the friend making a call to ask the avail- bear this notice and the full citation on the first page. To copy otherwise, to able time. Different privacy policies or different schedules can be republish, to post on servers or to redistribute to lists, requires prior specific set for different friends, just like answering different available time, permission and/or a fee. Articles from this volume were invited to present or even not answering, for various friends. The above privacy and schedule setting is beyond the scope of this paper. their results at The 37th International Conference on Very Large Data Bases, 2 August 29th - September 3rd 2011, Seattle, Washington. Note that the social radius constraint is specified in terms of num- Proceedings of the VLDB Endowment, Vol. 4, No. 6 ber of edges, rather than the social distance, so the users can easily Copyright 2011 VLDB Endowment 2150-8097/11/03... $ 10.00. specify the constraint. 397 strategies for SGQ and then take further steps to extend our study optimal solutions to SGQ and STGQ, respectively. We de- on the more complex STGQ. rive an Integer Programming optimization model for STGQ, For SGQ, a simple approach is to consider every possible p at- which can also support SGQ. Moreover, we devise various tendees and find the total social distance. This approach needs to strategies, including access ordering, distance pruning, ac- f−1 evaluate Cp−1 candidate groups, where f is the number of can- quaintance pruning, pivot time slots, and availability prun- didate attendees. In the current social network [8], the average ing, to prune unnecessary search space and obtain the op- number of friends for a user is around 100. Therefore, if an ini- timal solution in smaller time. Our research result can be tiator would like to invite 10 attendees out of her 100 friends, the adopted in social networking websites and web collaboration number of candidate groups is in the order of 1013. Note that the tools as a value-added service. above scenario considers only the friends of the user as candidate attendees, and the number of candidate groups will significantly The rest of this paper is summarized as follows. In Section 2, increase if the friends of each friend are also included. To sequen- we introduce the related works. Section 3 formulates SGQ and ex- tially choose the attendees at each iteration, giving priority to close plains the details of Algorithm SGSelect. Section 4 extends our friends based on their social distances to the activity initiator may study on SGQ to the more complex STGQ. The details of Algo- lead to a smaller total social distance in the beginning, but does not rithm STGSelect are also included. We present the experimental always end up with a solution that satisfies the acquaintance con- results in Section 5 and conclude this paper in Section 6. straint, especially for an activity with a small k. On the other hand, giving priority to a set of people who know each other to address 2. RELATED WORKS the acquaintance constraint does not guarantee to find the solution There is a tremendous need for activity planning. Even though with the minimum total social distance. Therefore, to select each some web applications, e.g., Meetup [6], have been developed to attendee, the challenge comes from the dilemma of (1) reducing support activity coordination, these applications still require users the total social distance at each iteration and (2) ensuring that the to manually assign the activity time and participants. For example, solution eventually follows all constraints of this problem. using Meetup [6] or the Events function in Facebook [1], an activ- Based on the above observations, in this paper, we first prove ity initiator can specify an activity time and manually select some that SGQ is NP-hard and then propose an algorithm, called SGSe- friends to send invitations to. In response, these friends inform the lect, to find the optimal solution. We design three strategies, access activity initiator whether they are able to attend or not. Obviously, ordering, distance pruning, and acquaintance pruning, to effectively the above-described manual activity coordination process is tedious prune the search space and reduce the processing time.

Load more