Social Network Portability and Enhancement of the Origo Platform
ETH Zurich Research Collection
Master Thesis
Author(s): Weiss, Ulrich Andreas
Publication Date: 2008
Permanent Link: https://doi.org/10.3929/ethz-a-005675938
Rights / License: In Copyright - Non-Commercial Use Permitted

Ulrich Andreas Weiss
03-911-690
Master Thesis
Chair of Software Engineering
Department of Computer Science
ETH Zürich
Dr. Till G. Bay
Prof. Bertrand Meyer
March 2008 - August 2008

Abstract

Social networking graphs are often secluded and encapsulated in a single social network, unable to interact with or connect to other social graphs. This phenomenon is often referred to as “walled gardens”, and that is exactly what most popular social networks are. This master thesis gives an overview of the most important data portability standards and of what should be taken care of to improve the accessibility of data and information, eventually connecting the social graphs. These standards and concepts are analyzed and outlined critically, highlighting their privacy, authenticity and security implications. The thesis consists of two parts: a theoretical part about data portability and social networks, and a practical part about enhancements to the Origo [1] platform.

Acknowledgment

I thank the entire Origo team for all their help and commitment to the project. Special thanks go to my supervisor Till Bay and to Dominique Schneider and Dennis Rietmann for their great support with Origo. Further, I thank family and friends for their continuous encouragement throughout my studies and this thesis.

Contents

Abstract
Acknowledgment
1 Social Network Portability
  1.1 Introduction
  1.2 Data Portability
    1.2.1 Some Standards
      1.2.1.1 FOAF
      1.2.1.2 OpenID
      1.2.1.3 Microformats
      1.2.1.4 OAuth
      1.2.1.5 SIOC
      1.2.1.6 DOAP
      1.2.1.7 RSS and Atom
  1.3 Existing Web Services
    1.3.1 MySpace
    1.3.2 Facebook
    1.3.3 Google Friend Connect
    1.3.4 Google OpenSocial
    1.3.5 Google Social Graph API
    1.3.6 Gravatar
    1.3.7 FriendFeed and Plaxo
    1.3.8 Proofile
  1.4 Conclusions
    1.4.1 Identity
    1.4.2 Authenticity
    1.4.3 Privacy
    1.4.4 Finding Friends
2 Origo
  2.1 Introduction
    2.1.1 Back-end
      2.1.1.1 Nodes
    2.1.2 Front-end
  2.2 Social Networking Integration
    2.2.1 Profile
      2.2.1.1 Profile Import
    2.2.2 FOAF
    2.2.3 OpenID
    2.2.4 SIOC
    2.2.5 DOAP
    2.2.6 Future Work
  2.3 Origo Enhancements
    2.3.1 Tooltips
    2.3.2 Password Recovery and Account Activation
    2.3.3 Project Information
    2.3.4 Origo Maintenance and Minor Enhancements
  2.4 Origo Issues
    2.4.1 Latency
    2.4.2 Application Programming Interface Development
    2.4.3 Scalability and Redundancy
  2.5 Concluding Remarks
Bibliography

CHAPTER 1
Social Network Portability

1.1 Introduction

The first chapter of this thesis is about the portability of social networks and data portability in general. Some of the major technologies and services are introduced and analyzed.

Historically, social networks have been centralized, secluded islands; many people call them walled gardens. These web sites bind users to their services by making it hard to leave, and by requiring visitors to register in order to interact with the current users. Friendships in a social network, and in the real world for that matter, can be viewed as a graph of connections between real people. Every service requires its users to recreate their social graphs and user profiles over and over again, and users’ online identities are not connected with each other in any way. To connect and interact with their friends and acquaintances, users need to join numerous social networks, which requires a lot of inconvenient and repetitive manual input.

Many have tried to tackle these problems; many incompatible standards have been defined, and none has prevailed. This might change in the near future as some standards emerge, driven by the Data Portability Workgroup [2] and the companies supporting it.

1.2 Data Portability

The Data Portability initiative was created to promote the idea that individuals maintain control over their data by determining how it can be used and accessed.
(Footnote 1: Some parts quoted from <http://www.dataportability.org/>.) The main idea is that users should be able to control what data can be used by whom and in what manner. The group seeks to achieve these goals not by developing new standards itself, but by promoting existing standards that enable data portability, by encouraging their further development, and by identifying new standards that are required to fulfill the data portability vision.

The vision is that data can be shared and remixed across the borders of web sites. The owner of the data should be able to control who has access to it, and that access should not be limited to the place where the data was initially uploaded. Application Programming Interfaces (APIs) have been developed and are being used by the user community, e.g. to create mashups of aggregated data. This is already a big step in the right direction; however, it is neither the end of the story nor the solution to all the problems.

1.2.1 Some Standards

I give a very brief introduction to some of the most important data portability and social networking technologies and standards, along with their implications.

1.2.1.1 FOAF

Friend of a Friend [3] (FOAF) is a Resource Description Framework [4] (RDF) vocabulary designed to describe people and the connections between them. RDF is a method for modeling information and meta-information, commonly serialized in an XML [5] format. Its foundation is an abstract model in which statements describe the relation between a subject and an object. RDF is a major component of W3C’s Semantic Web activity, which aims to create, exchange and use machine-readable information in a distributed fashion.

The main aspects of the FOAF specification are the user’s profile data, links to other people the user knows, and identities or accounts that the user holds. Listing 1.1 shows an example of what a FOAF file could look like.

Listing 1.1: A sample FOAF file

    <rdf:RDF
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
        xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
        xmlns:foaf="http://xmlns.com/foaf/0.1/">

      <!-- The person this FOAF file describes -->
      <foaf:Person>
        <foaf:name>Ueli Weiss</foaf:name>
        <foaf:mbox_sha1sum>937086423dcf54b784ec740aea134dfe4e879828</foaf:mbox_sha1sum>
        <foaf:openid rdf:resource="http://uweiss.myopenid.com/"/>

        <!-- An account the person holds on another service -->
        <foaf:holdsAccount>
          <foaf:OnlineAccount>
            <foaf:accountServiceHomepage rdf:resource="http://origo.ethz.ch"/>
            <foaf:accountProfilePage rdf:resource="http://www.origo.ethz.ch/users/uweiss"/>
            <foaf:accountName>uweiss</foaf:accountName>
          </foaf:OnlineAccount>
        </foaf:holdsAccount>

        <!-- A directed link to a person this person knows -->
        <foaf:knows>
          <foaf:Person>
            <foaf:mbox_sha1sum>937086423dcf54b784ec740aea134dfe4e879828</foaf:mbox_sha1sum>
            <rdfs:seeAlso rdf:resource="http://www.some-url.to/a-foaf-file"/>
            <foaf:name>John Doe</foaf:name>
          </foaf:Person>
        </foaf:knows>
      </foaf:Person>
    </rdf:RDF>

This effectively establishes a decentralized directed graph of connections between people. The mbox_sha1sum value is an SHA1 hash of the person’s e-mail address (strictly, of its mailto: URI), which can be used as a unique identifier, just as an OpenID (cf. 1.2.1.2) identifies a person. These graphs can be crawled, or queried through one of the existing services.

The decentralized nature of FOAF makes it possible for anyone to add bogus entries to the network or to claim identities that are not actually theirs. This phenomenon has been observed on the web for many years; search engine bombing, domain squatting and networks of bogus web pages are only some of the many examples. RDF search engines will have to deal with these problems, just as regular search engines currently deal with them on the web, assuming the semantic web ever gains enough popularity to attract spammers or other malicious entities.

The first countermeasure that comes to mind is blacklisting. Blacklisting domains or documents is easy, but how do we ensure that only malicious content is denied?
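
The trade-off behind this question can be sketched in a few lines of Python. The following is only a minimal illustration under stated assumptions, not part of Origo or of any existing RDF search engine: the function names, the blacklist entries and all example URLs are hypothetical and were made up for this sketch. It assumes a crawler that holds a queue of FOAF document URLs and checks each one against a domain blacklist and a document blacklist.

```python
from urllib.parse import urlparse

# Hypothetical blacklist entries, invented for this illustration.
BLACKLISTED_DOMAINS = {"spam-network.example"}
BLACKLISTED_DOCUMENTS = {"http://shared-host.example/bogus/foaf.rdf"}


def is_denied(url: str) -> bool:
    """Return True if a FOAF document URL is blocked by either blacklist."""
    host = (urlparse(url).hostname or "").lower()
    # Domain-level rule: blocks the domain itself and all of its subdomains.
    domain_hit = any(
        host == domain or host.endswith("." + domain)
        for domain in BLACKLISTED_DOMAINS
    )
    # Document-level rule: blocks one exact URL only.
    return domain_hit or url in BLACKLISTED_DOCUMENTS


def filter_crawl_queue(urls):
    """Keep only the URLs a crawler would still be allowed to fetch."""
    return [url for url in urls if not is_denied(url)]


if __name__ == "__main__":
    queue = [
        "http://spam-network.example/users/1/foaf.rdf",
        # Possibly a genuine user who happens to be hosted on a blocked domain.
        "http://profiles.spam-network.example/alice/foaf.rdf",
        "http://shared-host.example/bogus/foaf.rdf",
        "http://shared-host.example/bob/foaf.rdf",
    ]
    for url in filter_crawl_queue(queue):
        print(url)
```

In this sketch the domain rule is cheap to maintain but coarse: it also denies the profile of a possibly genuine user hosted under the blocked domain. The document rule is precise, but it has to be extended for every new bogus file.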
If spammers misuse a social network and generate bogus