PrPl: A Decentralized Social Networking Infrastructure
Seok-Won Seong, Jiwon Seo, Matthew Nasielski, Debangsu Sengupta, Sudheendra Hangal, Seng Keat Teh, Ruven Chu, Ben Dodson, Monica S. Lam
Computer Science and Electrical Engineering Departments
Stanford University
Stanford, CA 94305

ABSTRACT

This paper presents PrPl, a decentralized infrastructure that lets users participate in online social networking without loss of data ownership. PrPl, short for private-public, has a person-centric architecture: each individual uses a Personal-Cloud Butler service that provides a safe haven for one's personal digital assets and supports sharing with fine-grain access control. A user can choose to run the Butler on a home server, or use a paid or ad-supported vendor of his choice. Each Butler provides a federation of data storage; it keeps a semantic index to data that can reside, possibly encrypted, in other storage services. It uses the standard, decentralized OpenID management system, so users can use their established personas in accessing the data.

One pre-requisite to the success of a platform is the availability of applications, which means that ease of application development is essential. We have developed a language called SociaLite, based on Datalog, that allows developers to use a simple declarative database query to access the large collection of private data served up by the Butlers in our social circle running under different administrative domains.

We have developed a prototype of the PrPl infrastructure and implemented a number of simple social applications on the system. We found that the applications can be written in a small number of lines of code using SociaLite. Preliminary experimental results suggest that it is viable to enable sharing of private social data between close friends with a decentralized architecture.

Research supported in part by the NSF POMI (Programmable Open Mobile Internet) 2020 Expedition Grant 0832820, NSF Grant TRUST CCF-0424422, an Amazon Web Services research grant, and a Samsung scholarship.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ACM Workshop on Mobile Cloud Computing & Services: Social Networks and Beyond (MCS) June 15, 2010, San Francisco, CA, USA
Copyright 2010 ACM 978-1-4503-0155-8/10/06 ...$10.00.

1. INTRODUCTION

1.1 Decentralized, Open, and Trustworthy Social Networking

To be commercially viable, an advertisement-supported social networking portal must attract as many targeted ad impressions as possible. This means that this type of service typically aims to encourage a network effect, in order to gather as many people's data as possible. It is in their best interest to encourage users to share all their data publicly, lock this data in to restrict mobility, assume ownership of it, and monetize it by selling such data to marketers. Social networking portals often either claim full ownership of all user data through their seldom-read end user license agreements (EULA), or stipulate that they reserve the right to change their current EULA. In addition, a number of factors such as data lock-in and the exorbitant cost of running large-scale centralized services all point to the likely establishment of an oligopoly, or even a monopoly. It is unsettling that we leave the stewardship of all this personal data to a for-profit company. Public outcry would be to no avail were such a company to fail and sell its data assets.

Our goal is to develop the technology that makes possible a decentralized, open and trustworthy social networking infrastructure. This enables people worried about privacy issues to participate in social networking without reservations.

• Decentralized across different administrative domains. This allows users who keep data in different administrative domains to interact with each other. Users have a choice in services that offer different levels of privacy, amongst other metrics of quality. They may choose to store their data in personal servers they own and keep in their homes, or entrust their data to paid or free ad-supported services. Furthermore, they may wish to take advantage of a myriad of storage services available by keeping their data, possibly encrypted, in various locations. The current model in which only users belonging to a common site may interact is just as unacceptable as disallowing users on different cellular networks to call each other.

• Open API for distributed applications. We aim to create an API that allows a social application to run across different administrative domains. Note that the OpenSocial API [1] is much less ambitious; its goal is to allow the same application to be run on different web sites separately; that is, each application just operates on data wholly owned by a web portal.

• Trustworthy interactions with real friends. We wish to create a safe haven for individuals to keep all of their data without reservation and to share selected items with different friends. This safe haven will enable new applications since all the personal data are available in one place, and is more convenient for users because they do not have to upload them to different web sites.

Figure 1: The PrPl data subsystem
(Diagram labels: Pocket Butler; SociaLite Database Query Language; Communication; Authorization; Authentication; Personal-Cloud Butler; Data Manager; Semantic Index; Access Control; OpenID Manager; Friends' Butlers; Data Steward API; home server; facebook; imap.)

1.2 Contributions of this Paper

We have created an architecture called PrPl, short for Private-Public, as a prototype of a decentralized, open and trustworthy social networking system. Having such an option available may also put pressure on ad-supported social networking sites to provide more guarantees on data privacy and ownership.

Personal-Cloud Butlers. We propose the notion of a Personal-Cloud Butler, which is a personal service that we can trust to keep our personal information; it organizes our data and shares them with our friends based on our private preferences on access control. A high-level overview of the Butler architecture is shown in Figure 1.

• Semantic index of personal data. It provides a unified index of the data to facilitate browsing and searching of all personal information. To support interoperability, we represent our data in a standard format based on RDF and standard ontologies whenever they are available.

• Federated storage system. To take advantage of freely available data storage on the web, the Butler lets the user store their data, possibly encrypted, with different storage vendors if they wish. The Butler hands out certificates that enable our devices and devices of friends to retrieve the data directly from storage.

• Decentralized ID management. Our system allows users to use their established personas by supporting OpenID, a decentralized ID management system. We propose extending the OpenID system so that the OpenID provider supports the lookup of a designated

Butler handles the underlying authentication and communications with the Personal-Cloud Butlers. It also provides common interfaces to specify authorization and caching so that resources can be shared between multiple applications.

• Butler services. Besides providing an open API to the data, described in more detail below, the Butler also provides a web-based service that allows friends to log in with their OpenID to enjoy data and services they are entitled to.

How would Personal-Cloud Butlers be deployed? One possibility is to host such services on set-top boxes and game consoles, which are already in many consumers' homes. These devices already provide users a cost-effective and convenient way to store their large-scale data such as full resolution photos.

SociaLite Language. We have designed and implemented a database query language called SociaLite to allow easy access to the large amount of data stored in a distributed network of Butler services. SociaLite is an extension of Datalog, a declarative deductive database language. Supporting composition and recursion, this language is expressive enough that many social applications can easily be written by adding a GUI to the result of a SociaLite query. This language hides the complexities in distributing a query to the friends' Butlers.

Experimental Results. We have implemented a fully working prototype of the SociaLite language and PrPl infrastructure as proposed in this paper. A rich set of applications was developed alongside the infrastructure so as to drive the design. PrPl applications are relatively easy to develop because they consist mainly of GUI code wrapped around SociaLite queries. Measurements of our prototype running on a testbed consisting of 100 Butlers on Amazon EC2 suggest that this approach is technically viable. Even though our prototype is not optimized, the first query results arrive within a couple of seconds.

2. FEDERATED ID MANAGEMENT

The PrPl system utilizes federated, decentralized identity management that enables secure logins, single sign-on, and communication among applications in an environment where Butlers belong to different administrative domains. We wish to enable PrPl users to reuse existing credentials from multiple providers and avoid unnecessary ID proliferation. Requirements for our identity management include authenticating users to Butlers, registering Butlers with the Directory Service, third-party service authentication, and authentication between Butlers and applications. To this end, we chose OpenID due to its position as an open standard (in contrast to Facebook Connect [2]), extensive library support, availability of accounts, and the ability to extend the protocol easily for PrPl's needs.

An OpenID login consists of the following steps:
1.