Semantic Web 1 (2012) 1–16 1 IOS Press

An Architecture of a Distributed Semantic

Sebastian Tramp a,∗, Philipp Frischmuth a, Timofey Ermilov a, Saeedeh Shekarpour a and Sören Auer a a Universität Leipzig, Institut für Informatik, AKSW, Postfach 100920, D-04009 Leipzig, Germany E-mail: {lastname}@informatik.uni-leipzig.de

Abstract. Online social networking has become one of the most popular services on the Web. However, current social networks are like walled gardens in which users do not have full control over their data, are bound to specific usage terms of the social network operator and suffer from a lock-in effect due to the lack of interoperability and standards compliance between social networks. In this paper we propose an architecture for an open, distributed social network, which is built solely on Semantic Web standards and emerging best practices. Our architecture combines vocabularies and protocols such as WebID, FOAF, Se- mantic Pingback and PubSubHubbub into a coherent distributed semantic social network, which is capable to provide all crucial functionalities known from centralized social networks. We present our reference implementation, which utilizes the OntoWiki application framework and take this framework as the basis for an extensive evaluation. Our results show that a distributed social network is feasible, while it also avoids the limitations of centralized solutions.

Keywords: Semantic Social Web, Architecture, Evaluation, Distributed Social Networks, WebID, Semantic Pingback

1. Introduction platform or information will diverge. Since there are only a few large social networking players, the Web Online social networking has become one of the also loses its distributed nature. According to a recent most popular service on the Web. Especially comScore study1, Facebook usage times already out- with its 600M+ million users creates a web inside the number traditional Web usage by factor two and this Web. Drawing on the metaphor of islands, Facebook divergence is continuing to increase. is becoming more like a continent. However, users are We argue that social networking must be distributed locked up on this continent with hardly any opportu- and users should be empowered to regain control over nity to communicate easily with users on other islands their data. The currently vast oceans between social and continents or even to relocate trans-continentally. networking continents and islands should be bridged Users are bound to a certain platform and hardly have by high-speed connections allowing data and users to the chance to migrate easily to another social network- travel easily and quickly between these places. In fact, ing platform if they want to preserve their connections. we envision the currently few social networking conti- Once users have published their personal information nents to be complemented by a large number of smaller within a social network, they often also lose control islands with a tight network of bridges and ferry con- over the data they own, since it is stored on a single nections between them. company’s servers. Interoperability between platforms In this paper we describe the main technological is very rudimentary and largely limited to proprietary ingredients for a distributed semantic social network APIs. In order to keep data up-to-date on multiple plat- (DSSN) as well as their interplay. The semantic rep- forms, users have to modify the data on every single resentation of personal information is facilitated by a

*Corresponding author. 1http://allthingsd.com/20110623/ WebID: http://sebastian.tramp.name the-web-is-shrinking-now-what/

1570-0844/12/$27.50 c 2012 – IOS Press and the authors. All rights reserved 2 S.Tramp et al. / An Architecture of a Distributed Semantic Social Network

WebID profile. The WebID protocol allows the use of Service Decoupling. A second fundamental design a WebID profile for authentication and access control principle is the decoupling of user data from services purposes. Semantic Pingback facilitates the first con- as well as applications [8]. It ensures that users of the tact between users of the social network and provides network are able to choose between different services a method for communication about resources (such and applications. In addition, this principle helps users as images, status messages, comments, activities) on of the DSSN to distinguish between their own data, the social network. Finally, PubSubHubbub-based sub- which they share with and license to other people and scription services allow obtaining near-instant notifica- services, and foreign data, which they create by using these services and which they do not own. tions of specific information as WebID profile change sets and activity streams from people in one’s so- Protocol Minimalism. The main task for social net- cial network. Combined these standards, protocols and working protocols is to communicate RDF triples be- technologies provide all necessary ingredients to re- DSSN nodes, not to enforce a specific work flow alize a distributed social network having all the cru- nor an exact interpretation of the data. This constraint cial social networking features provided by centralized ensures the extensibility of the data model and keeps ones. the overall architecture clean. This article is structured as follows: We present our 2.2. Data Layer reference architecture of a distributed social semantic network in Section 2. Our implementation based on The data layer comprises two main data structures: the OntoWiki framework is demonstrated in Section 3, resources for the description of static entities and feeds while the evaluation results of our approach are dis- for the representation and publishing of events and ac- cussed in Section 4. Finally, we give an overview of re- tivities. lated work in Section 5 and conclude with an outlook 2.2.1. Resources on future work in Section 6. We distinguish between three main categories of DSSN resources: WebIDs for persons as well as appli- cations, data artefacts and media artefacts. The prop- 2. Architecture of a Distributed Semantic Social erties, conditions and roles in the network of these re- sources are described in the next paragraphs. Network WebID [17]2 is a best practice recently conceived in In this section we describe the DSSN reference ar- order to simplify the creation of a digital ID for end chitecture. After introducing a few design principles users. Since its focus lies on simplicity, the require- ments for a WebID profile are minimal. In essence, on which the architecture is based, we present its dif- a WebID profile is a de-referenceable RDF document ferent layers, i.e. the data, protocol, service and appli- (possibly even an RDFa-enriched HTML page) de- cation layers. The overall architecture is depicted in scribing its owner3. That is, a WebID profile con- Figure 1. tains RDF triples which have the IRI identifying the owner as subject. The description of the owner can be 2.1. Basic Design Principles performed in any (mix of) suitable vocabulary (-ies), but FOAF [6] has emerged as the ‘industry standard’ Our DSSN architecture is based on the following for that purpose. An example WebID profile compris- three design principles. ing some personal information (lines 8-12) and two rel:worksWith4 links to co-workers (lines 6-7) is Linked Data. The main protocol for data publishing, shown in Listing 1. retrieval and integration is based on the Linked Data principles [3]. All of the information contained in the 2The latest spec is available at http://webid.info/spec/. DSSN is represented according to the RDF paradigm, 3The usage of an IRI with a fragment identifier allows the indi- made de-referencable and interlinked with other re- rect identification of an owner by reference to the (FOAF) profile sources. This principle facilitates heterogeneity and document. 4Taken from RELATIONSHIP: A vocabulary for describing enables the distribution of data and services on the relationships between people at http://purl.org/vocab/ Web. relationship. S.Tramp et al. / An Architecture of a Distributed Semantic Social Network 3

Profile Bookmark Foto ... Manager Collection Sharing Application Layer 4 ping 2 create update 5 search push subscribe 7 3 delegeate access to Ping Update Search Push 6 Service Layer 4 announce 1 create update 5 index 1 announce 7 read access

WebIDs Activity History Streams Feeds Data & Media announce Artefacts 1 Resources Feeds Data Layer

Fig. 1. Architecture of a Distributed Semantic Social Network (without protocol layer): (1) Resources announce services and feeds, feeds announce services – in particular a push service. (2) Applications initiate ping requests to spin the Linked Data Network. (3) Applications subscribe to feeds on push services and receive instant notifications on updates. (4) Update services are able to modify resources and feeds (e.g. on demand of an application). (5) Personal and global search services can index resources and are used by applications. (6) Access to resources and services can be delegated to applications by a WebIDs, i.e. the application can act in the name of the WebID owner. (7) The majority of all access operations is executed through standard web requests.

Apart from the main focus on representing user pro- parts of the profile data and can add or change some files, our architecture extends the WebID concept by of the profile information, e.g. create activities or cre- facilitating two additional tasks: service discovery and ate and link images. Applications on the DSSN are access delegation. also identified by using WebID profiles, but are not Service discovery is used to equip a WebID with re- described as a person but as an application. They can lations to trusted services which have to be used with act on behalf of a person but rely on delegated access that WebID. The usage of the WebID itself ensures rights for such an activity. This process is described in that an agent can trust this service in the same way as Section 2.3.1. she trusts the owner of the WebID. The most impor- tant service in our DSSN architecture is the Seman- Data Artefacts are resources on the Web which are tic Pingback service, which we describe in detail in published according to the Linked Data principles. Section 2.3.2. An example service relation is shown in Data artefacts includes posts, comments, taggings, ac- Listing 4. tivities and other Social Web artefacts which have been In addition to this service, we introduce access del- created by services and applications on the Web. Most egation for the WebID protocol. WebID access delega- of them are described by using specific web ontologies 6 tion is an enhancement to the current WebID authenti- such as SIOC [5], Common Tag or Activity Streams cation process in order to allow applications to access in RDF [10]. resources and services on behalf of the WebID owner, Media Artefacts are also created by services and ap- but without the need to introduce additional applica- plications but consist of two parts – a binary data part tion certificates in a WebID. We describe the WebID which needs to be decoded with a specific codec, and protocol as well as our access delegation extension in a -data part which describes this artefact. Usually, detail in Section 2.3.1. such artefacts are audio, video and image files, but of- Agents and Applications play an important role in fice document types are also frequently used on the today’s social networks5 . They have access to large Social Web. Media artefacts can be easily integrated into the DSSN by using the Semantic Pingback mech- 5Social network games such as FarmVille can have more than 80 million users (according to appdata.com), which constantly cre- ate activities in their social network. 6http://commontag.org/Specification 4 S.Tramp et al. / An Architecture of a Distributed Semantic Social Network

work. In our DSSN architecture, each activity is cre- 1 @prefix rdfs: . ated as a Linked Data resource (i.e. a DSSN data arte- 2 @prefix foaf: . facts), which links to the actor and object of the activ- 3 @prefix rel: . back Server in order to allow receiving reactions on 4 a foaf this activity (called pingbacks) and thus to spin a con- :Person; 5 rdfs:comment "This is my public profile network between these artefacts. only, more information available with History feeds FOAF+SSL"; are used to allow the syndication of 6 rel:worksWith , of a resource and many subscribers of the resource. 7 ; terms of added and deleted statements which are boxed 8 foaf:depiction ; in an Atom feed entry. A subscriber’s social network 9 foaf:firstName "Philipp"; foaf:surname " application can use this information to maintain an ex- Frischmuth"; act copy of the original resource for caching and query- 10 foaf:mbox ; for the syndication of changes of WebID profiles (e.g. 11 foaf:phone ; 12 foaf:workInfoHomepage . 2.3. Protocol Layer Listing 1: A minimal WebID profile with personal The protocol layer consists of the WebID identity information and two worksWith relations to other protocol and two networking protocols which pro- WebIDs. vide support for two complete different communica- tion schemes, namely resource linking and push noti- fication. anism, which is described in Section 2.3.2, and a link to a push-enabled . An example foto- 2.3.1. WebID (protocol) sharing application is described in Section 2.5. From a more technical perspective the WebID pro- tocol [17] (formerly known as a best practice [18]) 2.2.2. Feeds incorporates authentication and trust into the WebID Feeds are used to represent temporally ordered in- concept. The basic idea is to connect an SSL client formation in a machine-readable way but are usually certificate with a WebID profile in a secure manner not expressed as RDF. Feeds are widely used on the and thus allowing owners of a WebID to authenticate Web and play a crucial role in combination with the against 3rd-party websites with support for the WebID PubSubHubbub protocol to enable near real-time com- protocol. The WebID (i.e. a de-referencable URI) is, munication between different services. In the context therefore, embedded into an X.509 certificate by using of the DSSN architecture, two types of feeds are worth the Subject Alternative Name (SAN) extension. The considering: document, which is retrieved through the URI, con- Activity feeds describe the latest social network ac- tains the corresponding public key. Given that informa- tivities of a user in terms of an actor - verb - tion, a relying party can assert that the accessing user object triple where activity verbs are used as types owns a certain WebID. Furthermore, the WebID proto- of activities (e.g. to post, to share or to bookmark a col can provide access control functionality for social specific object)7. Activity feeds can be used to produce networks shaped by WebIDs in order to regulate access a merged view of the activities of one’s own social net- to certain information resources for different groups of contacts (e.g. as presented with dgFOAF [15]). An example of a WebID profile, which is annotated with 7Activity streams (http://activitystrea.ms) are Atom format extensions to describe activity feeds. It is extensible in a way a public key, is shown in Listing 2. This WebID pro- that allows publishers to use new verb or object type IRIs to identify file contains additionally a description of an RSA pub- site-specific activities. lic key (line 15), which is associated to the WebID by S.Tramp et al. / An Architecture of a Distributed Semantic Social Network 5

user8. The relying party then fetches the WebID and @prefix rsa:. 13 checks for a statement as shown in Listing 39. 14 @prefix cert: . If such a statement is found and the relying party 15 [] a rsa:RSAPublicKey; trusts the accessing agent, access to the secured re- 16 rdfs:comment "used from my smartphone ..."; source is then granted. Given that the user is always 17 cert:identity ; bID, she stays in control with regard to the delegation 18 rsa:modulusc "C41199E ... 5AB5"^^cert:hex; 19 rsa:public_exponent "65537"^^cert:int. of access to other parties. In contrast to other access control solutions such as OAuth10, a user only has to Listing 2: An extension of the minimal WebID maintain a central resource, her WebID. from Listing 1: Description of an RSA public 2.3.2. Semantic Pingback key, which is associated to the WebID by using The purpose of Semantic Pingback [20] in the con- the cert:identity property from the W3C text of DSSN architecture is twofold: certificates and crypto ontology. – It is used to facilitate the first contact between two WebIDs and establish a new connection (Friend- ing). using the cert:identity property from the W3C – It is used to ping the owner of different social certificates and crypto ontology (line 19). network artefacts if there are activities related to Nevertheless the described approach requires the these artefacts (e.g. commenting on a blog post, user to access a secured resource directly, e.g. through tagging an image, sharing a website from the a web browser which is equipped with a WebID- owner). enabled certificate. However, in the scenario of a dis- tributed social network this arrangement is not always The Semantic Pingback approach is based on an ex- the case. If, for example, the software of user A needs tension of the well-known Pingback technology [9], to update its local cache of user B’s profile, it will do so which is one of the technological cornerstones of the by fetching the data in the background and not neces- overwhelming success of the blogosphere in the Social sarily when user A is connected. An obvious solution Web. The overall architecture is depicted in Figure 2. would be to hand out a WebID-enabled certificate to The Semantic Pingback mechanism enables bi- the software (agent), but then the user needs to create directional links between WebIDs, RDF resources as a dedicated certificate for all tools that have to access well as weblogs and websites in general (cf. Figure 1). secured information and simultaneously allows all par- It facilitates contact/author/user notifications in case a ticipating tools to "steal" her identity, which is not the link has been newly established. It is based on the ad- preferred solution from a security perspective . vertisement of a lightweight RPC service11, in the RDF To resolve this dilemma, we have extended the We- document, HTTP or HTML header of a certain Web bID protocol by adding support for access delegation. resource, which should be called as soon as a (typed By delegating access to an agent, a user allows a par- RDF) link to that resource is established. The Semantic ticular agent to deputy access-secured information re- Pingback mechanism enables people, but also authors sources. The agent itself authenticates against the re- of RDF content, a weblog entry or an article in gen- lying party by using it’s own credentials, e.g. by em- eral, to obtain immediate feedback, when other peo- ploying the WebID protocol, too. Additionally, it sends X-DeputyOf a HTTP header, which indicates that 8We have recently discussed the motivation and solution of this a resource is accessed on behalf of a certain WebID extension with a few members of the W3Cs WebID incubator group and hope that they will extend their specification regarding access delegation. 20 @prefix dssn: . 9The dssn:deputy relation and other related schema resources 21 dssn: introduced in this paper are part of the DSSN namespace, which is deputy . available at http://purl.org/net/dssn/. 10http://oauth.net/ Listing 3: Access delegation through the 11In fact, we experimented with different service endpoints, and dssn:deputy property. as a result we now prefer simple HTTP post requests which are not compatible with standard XML-RPC pingbacks. This is described in more detail in [19]. 6 S.Tramp et al. / An Architecture of a Distributed Semantic Social Network

Resource Layer Linking Resource Linked Resource (Source) (Target) (typed) linking 1

5 autodiscovery Publisher fetches 3

2 6 announces (updates) observes (notifies) 7 Pingback Client 4 Pingback Server (Link Propagator) RPC request RPC Layer

Link Publisher Link Reveiver

Fig. 2. Architecture of the Semantic Pingback approach: (1) A linking resource links to another (Data) Web resource, here called linked resource. (2) The Pingback client is either integrated into the data/content management system or realized as a separate service, which observes changes of the Web resource. (3) Once the establishing of a link has been noted, the Pingback client tries to auto-discover a Pingback server from the linked resource. (4) If the auto-discovery has been successful, the respective Pingback server is used for a ping. (5) In order to verify the retrieved request (and to obtain information about the type of the link in the semantic case), the Pingback server fetches (or de-references) the linking resource. (6 + 7) Subsequently, the Pingback server can perform a number of actions such as updating the linked resource (e.g. adding inverse links) or notifying the publisher of the linked resource (e.g. via email).

ple establish a reference to them or their work, thus the seamless connection and interlinking of resources facilitating social interactions. It also allows to pub- on the Social Web with resources on the DSSN. An lish backlinks automatically from the original WebID extension of our example profile with Semantic Ping- profile (or other content, e.g. status message) to com- back functionality making use of an external Semantic ments or references of the WebID (or other content) Pingback service is shown in Listing 4. elsewhere on the Web, thus facilitating timeliness and As requested by our third DSSN design paradigm coherence of the Social Web. (protocol minimalism), Semantic Pingback is a generic As a result, the distributed network of WebID pro- data networking protocol which allows to spin rela- files, RDF resources and social websites can be much tions between any two Social Web resources. In the tighter and timelier interlinked by using the Semantic context of the DSSN Architecture, Semantic Pingback Pingback mechanism than conventional websites, thus is used in particular for friending, commenting and rendering a network effect, which is one of the ma- tagging. jor success factors of the Social Web. Semantic Ping- Friending is the process of establishing a symmetric back is completely downwards compatible with the foaf:knows relation between two WebIDs. A rela- conventional Pingback implementations, thus allowing tionship is approved when both persons publish this re- lation in their WebIDs. A typical friending work flow can be described by the following steps: 22 @prefix ping: . – Alice publishes a foaf:knows relation to Bob 23 ping: in their WebID. to . – Alice’s WebID hosting service pings Bob’s We- Listing 4: Extension of the minimal WebID profile bID because of this new statement. from Listing 1: Assignment of an external Semantic – Bob receives a message from his Pingback Ser- Pingback service which can be used to ping this vice. specific resource.) – Bob approves of the relation by publishing it in his WebID, which sends again a ping back to Al- ice. S.Tramp et al. / An Architecture of a Distributed Semantic Social Network 7

This basic model of communication can be applied personal search / index service of a subscribed user to different events and activities in the social network. (see next section) Table 1 lists some of the more important pingback Resource Synchronization is an additional commu- events. In any case, the owner of these resources can nication scheme based on feeds. It is required, espe- be informed about the event and in most cases specific cially in distributed social networks, to take into ac- actions should be triggered (refer to Section 3 for more count that relevant data is highly distributed over many details). locations, and access to and querying of this data can 2.3.3. PubSubHubbub be very time-consuming without caching. A properly PubSubHubbub is a web-hook-based publish/sub- connected resource synchronization tackles this prob- scribe protocol, as an extension to Atom and RSS, lem by allowing users to subscribe to changes of cer- which allows for near instance distribution of feed en- tain resources over PubSubHubbub. For a WebID, this tries from one publisher to many subscribers. Since process can be included into the friending process, feed entries are not described as RDF resources, Pub- while for other resources a user can subscribe man- SubHubbub is not the best solution as a transport pro- ually (e.g. for a group resource to receive updates of tocol for a DSSN from our perspective. However, Pub- its membership relations). Since resource synchroniza- SubHubbub with atom feeds is widely in use and has tion is not an issue which was raised in the context of good support in the web developer community which DSSN at first, we have built our data model as Linked is why we decided to use it in our architecture. Similar Data update logs [1] based on the work previously pub- 13 to Semantic Pingback, it is agnostic to its payload and lished by the Triplify project . can be used for all publish/subscribe communication connections. 2.4. Service Layer The main work flow of establishing a PubSubHub- bub connection is to advertise a hub service in an ex- Services are applications which are part of the isting feed. A subscriber follows this link and requests DSSN infrastructure (in contrast to applications from a subscription on this feed. If the feed changes, the the application layer). WebIDs can be equipped with feed publisher informs the hub service which instantly different services in order to allow manipulation and broadcasts the changes to all subscribers12. The main other actions on the user’s data by other applications. advantage of this communication model is to avoid As depicted in Figure 1, we have defined four essential frequent and unnecessary pulls of all interested sub- services for a DSSN. scribers from this feed and to allow a faster broadcast Ping Service to the subscriber. The ping service provides an endpoint for any in- In the DSSN architecture, two specific feeds are im- coming pingback request for the resources of a user. portant and interlinked with a WebID to allow sub- First and foremost, it is used with the WebID for scriptions to them: activity feeds which are used for ac- friending but also for comment notification and discus- tivity distribution and change set feeds which are used sions. for resource synchronization. One application instance can provide its services for multiple resources. In a minimal setup, a ping ser- Activity Distribution is a fundamental communica- vice provides only a notification service via email. In tion channel for any social network. A personal activ- a more complex setup, the ping service has access to ity feed publishes the stream of all activities on social the update service of a user (via access delegation) and network resources (artefacts and WebIDs) with a spe- can do more than notification. cific user as the actor. These activities can and should A ping service can be announced with the ping:to be created by any application which is allowed to up- relation as shown in Listing 4. date the feed (ref. access delegation). In addition, ac- tivity feeds can be created for data and media arte- Push Service facts in order to allow object-centered push notifica- The push service is used for activity distribution tion. Typically, activity feed updates are pushed to a and resource synchronization. Both introduced types of feeds announce its push service in the same way

12During the subscription a callback endpoint is supplied by the subscribing endpoint, which is later used for pushing the data. 13http://triplify.org 8 S.Tramp et al. / An Architecture of a Distributed Semantic Social Network

Table 1 Pingback activities distinguished by types of resources and relations.

source resource relation target resource description

WebID (foaf:Person) foaf:knows WebID (foaf:Person) friending sioct:Comment sioc:about foaf:Image commenting an image (or any other resource) sioc:Post sioc:reply_of sioc:Post replying to a (friends) post ctag:Tag ctag: * any resource is tagged by a user aair:Activity aair:activityObject * any resource is object of an activity as using the rel="hub" link in the feed head. Since 2. In addition to public search services, we want to both types of feeds are valid atom feeds, a DSSN push emphasize the importance of private search ser- service can be a standard PubSubHubbub-based in- vices in our architecture. Private search services stance. are used in order to have a fast resource cache not To equip social network resources with its cor- only for public, but also for private data which a responding activity and change set feeds, we have user is allowed to access. defined two OWL object properties which are sub- A private search service is used for all users and properties of the more generic sioc:feed relation application queries from applications which act from the SIOC project [5]: dssn:activityFeed on behalf of the user. The underlying resource and dssn:syncFeed. index of a private search service is used as a In addition to these RDF properties, DSSN agents callback for all push notifications from feeds to should pay attention to the corresponding HTTP which the user has subscribed. That is, she is able header fields X-ActivityFeed and X-SyncFeed, to query over the latest up-to-date data by using which are alternative representations of the OWL ob- her private search service. In addition, she can ject properties to allow the integration of media arte- query for data which has never been public and facts without too much effort. is published for a few people only. Search and Index Service Since private search services are used by applica- Search and index services are used in two different tions which act on behalf of the user, they must be contexts in the DSSN architecture. WebID-protocol-enabled. That is, they accept requests from the user and her delegated agents only. In addi- 1. They are used to search for public web resources, tion, applications need to know which private search which are not yet part of a user’s social network. service should be accessed on behalf of the user. This These search services are well-known semantic mode is again made possible by providing a link from search engines as Swoogle [7] or Sindice [22]. the WebID to the search service15 They use crawlers to keep their resource cache In our architecture we assume that search services up-to-date and provide user interfaces as well as accept SPARQL queries. However, this assumption is application programming interfaces to integrate not true for all public Semantic Web search engine at and use their services in applications. the moment. In our architecture, these public services are used to search for new contacts as well as other arte- Update Service facts in the same way as people can use a stan- Finally, an update service provides an interface to dard web search engine. The main advantage in modify and create user resources in terms of SPARQL using Semantic Web search engines lies in their update queries. In the same way as private search ser- ability to use graph patterns for a search14.

15In our prototypes we use a simple OWL object prop- 14A motivating example in our context is the search for erty dssn:searchService, which is a sub-property of resources of type foaf:Person, which are related to the dssn:trustedService. We assume that such an easy vocabu- DBpedia topic dbpedia:DataPortability (e.g. with the lary is only the first step to a fully featured service auto-discovery foaf:interest object property). ontology and consider all dssn terms as unstable. S.Tramp et al. / An Architecture of a Distributed Semantic Social Network 9 vices, update services are secured by way of the We- plex social network application is described in the next bID protocol and accept requests only by the user itself section. and by agents in access delegation mode16. Typical examples of how to use this service are the creation of activities on behalf of the user or the mod- 3. Implementation ification of the user’s WebID, e.g. by adding a new foaf:knows relation. We provide a prototypical implementation of our ap- In the next section, we give a detailed description of proach which utilizes the OntoWiki application frame- the service interplay and usage by applications. work [2] which is also the basis for the evaluation in Section 4. All features were realized by employing the 2.5. Application Layer extension mechanisms offered by OntoWiki. The prototype described here, can be summarized as Social Web applications create and modify all kinds a WebID provider with an integrated communication of resources for a user. In our architecture, they have to hub. The following features are implemented so far: use the trusted services which are related to a WebID – Users can create and manage their WebID profiles instead of their own. Since access to these services is and any other Linked Data-enabled resource. exclusively delegated by the user to an application, the – Users can make friends, subscribe to their activ- 17 user has full control over her data . To illustrate how ities and profile updates and receive changes in- DSSN applications work with the WebID and its ser- stantly on change. vices, we will describe a simple foto-sharing applica- – Users can search and browse for friends and ac- tion: tivities inside her social network as well as fil- When a user creates an account on this service, she ter these resources by facets based on object and uses her WebID for the first login and delegates ac- datatype properties. cess to this application. The application analyses the – Users can comment on and subscribe to any WebID and discovers the trusted services and some DSSN resource which is equipped with a Se- meta-data of the user (e.g. name, short bio and depic- mantic Pingback service or an PubSubHubbub- tion). The user uploads her first image to the appli- enabled activity stream. cation, which, as a result, creates a new activity with – Users receive notifications if someone comments the user’s update service18. Furthermore, it equips the or links her WebID and send a pingback notifica- newly uploaded image with the pingback service of the tion. user; thus enabling the image for backlinks and com- In the next subsections we describe the implementa- ments. New comments can arrive from everywhere on tion decisions made to allow these functionality. An the web, but the application also provides its own com- commented screenshot of the central activity stream menting service (integrated in the image web view). If interface can be seen on Figure 3. another user writes a comment on this image, a data artefact is created in the namespace of the application 3.1. Creating and Updating Data Artefacts and a ping request is sent to the user’s pingback service (since this service is related to the image). Creating and managing Linked Data enabled RDF This simple example demonstrates the interplay and resources can be achieved without modifying the On- rules of the DSSN service architecture. A more com- toWiki basic functionality. We employ the RDFau- thor [21] JavaScript widget library in order to auto- 16We defined dssn:updateService as a relation between a matically create forms out of RDFa annotated HTML WebID and an update service documents19. Without any user effort, a newly created 17At the moment we distinguish only between access and no ac- cess to a service. As an extension, we can imagine that a private resource will be Linked data enabled if it shares the search service can handle access on parts of the private Social Graph namespace of the OntoWiki installation. In addition differently (an online game does not need to know which other ac- to that, we added support for WebID authentication tivities you pursue on the web). Access policies for RDF knowledge as well as for certificate creation by implementing a bases is a topic of ongoing research and we hope that the results of this research area can be adapted here. 18This activity is instantly pushed to all of the user’s friends but is 19Each output page which is created by OntoWiki is RDFa en- not part of the work of the image publishing service. hanced. 10 S.Tramp et al. / An Architecture of a Distributed Semantic Social Network

2 1 3

6

4 5

Fig. 3. Screenshot of the OntoWiki DSSN activity stream view. The interface elements are: (1) The Share it! activity creation module where users can post status notes and share links and media artefacts. (2) The main interface view tab to switch between configuration screen, profile manager, friending interface and the activity stream. (3) The activity filter and search module, to allow a facet based browsing of activities. (4) The events module, which queries the cache social network data for birthdays and other events. (5) The activity stream, which is the result of the SPARQL query modified by the filter module. (6) The generic wiki interfaces to create any type of Linked Data resource (e.g. comments). dedicated WebID extension. Since there is quite some tion is stored in a separate graph, a synchroniza- cryptographic processes involved, this extension needs tion is trivial. some prior configuration steps, e.g. the OntoWiki in- 4. Finally, we subscribe to two feeds when they are stance needs to be configured as a SSL/TLS enabled published within the users profile: (1) The users web application. activity feed, which we use to create the news feed for the user. The incoming atom activity en- 3.2. Maintaining Network Connections try were transformed to AAIR resources and im- ported to an additional activity RDF graph. (2) In order to maintain network connections (aka. The history feed for the friends profile. This feed friending) we execute the following steps in our imple- is employed in order to keep the cached WebID mentation: profile data up-to-date.

1. When a user enters a WebID inside the friending 3.3. Ignoring Activities and WebIDs module, we add a new statement (foaf:knows) into the RDF graph containing the users profile Since a user is normally not interested in all activi- data. This automatically generates a Pingback re- ties of his friends (e.g. certain games), there needs to quest, so the new friend will be notified. be a feasibility to remove certain activities from the 2. We create a new RDF graph, which will act as a timeline. We therefore employ EvoPat [13], a pattern- cache for the WebID profile data about that par- based approach for the evolution and refactoring of ticular friend. We also configure this new Knowl- RDF knowledge bases. EvoPat is integrated also as edge Base in such a way, that it is imported into an OntoWiki extension and in conjunction with our the users graph automatically. DSSN implementation we use this component to keep 3. We fetch the data from the friends WebID by users timeline clean. employing the Linked Data principles. We cache Each time a user selects a Hide all activities ... but- that data in the newly generated graph in order to ton next to an activity, we create a new pattern, which enhance the performance. Since those informa- subsequently matches such statements in the graph and S.Tramp et al. / An Architecture of a Distributed Semantic Social Network 11 removes them. Possible hide patterns are generated 4.1. Social Web Acid Test from the activity data itself, for instance ... from this user, ... of this language or ... about this object. Once The Social Web Acid Test (SWAT) is an integration new data for a given user is added, we re-apply the pat- use case test conceived by the Federated Social Web tern. In this way it also applies for future changes of Incubator Group of the W3C. Currently, only the first the knowledge base. and very basic level of the test (SWAT020) has been de- veloped and described completely. Nevertheless, those 3.4. Generating and Distributing Activities parts of the next level (SWAT1) which are currently published are discussed here too. A user is able to generate different kinds of activ- ities in our implementation. In the current state we SWAT0: The objectives of the first SWAT level have support three types of activities: status updates, photo clearly been described in the following use case21: sharing as well as link recommendations. We imple- mented a small extension, which displays a Share It! module inside the OntoWiki user interface. As a re- 1 User A takes a photo of user B from her phone and posts it. sult the user can quickly access this functionality and 2 User A explicitly tags the photo with user B. sharing is made really easy. Once the user generates 3 User B gets notified that she is in a photo. an activity, her activity feed is updated and subscribers 4 User C who follows user A gets the photo. to that feed are delivered with the new content by the 5 User C leaves a comment on the photo. push service. 6 User A and user B get notified about the comment. 3.5. Pingback Integration Listing 5: Social Web Acid Test - Level 0 All activities are represented as Linked Data re- sources and are equipped with a Pingback service as well as a feed. Thus, users can comment on any re- Utilizing all technologies described before, our source in their own social application and addition- DSSN architecture passes the SWAT0 without any ally subscribe to changes to that resource (e.g. com- problems. The following enumeration describes the ments by others). Each time someone comments on a details: resource (or otherwise uses it) externally, a Pingback request is send to the owning OntoWiki instance. Con- 1. User A takes a photo of user B and uploads sequently the publisher gets notified and is able to react it (e.g. with the example application from Sec- again on that new activity, which facilitates conversa- tion 2.5): The web space returns a link to the tions in a distributed manner. If a user is subscribed to user’s pingback server in the HTTP header of the a resource activity feed (which is automatically done, uploaded image. once he comments on a particular resource), she gets 2. User A explicitly tags the photo with user B: This notified about other comments, even if she is not the is done by creating a tag resource which links currently commenting person or the owner of the re- both to the image and to the WebID of user B. source. A pingback client sends a ping request to all of these resources after publishing the tag on the web. 4. Evaluation 3. User B is notified that she is on a photo: The We divided our evaluation process into two inde- notification is created by the pingback service of pendent parts: Firstly, the qualitative evaluation part User B who has received a request from the tag- aims to prove the functionality of the DSSN architec- ging application which was used by User A. ture by assessing use cases from the social web acid test (Section 4.1). Secondly, the quantitative evalua- 20http://www.w3.org/2005/Incubator/ tion part aims to prove the real-world usefulness of federatedsocialweb/wiki/SWAT0 21 our prototypical implementation by testing the perfor- For this use case the following assumptions are made: (1) Users employ at least two (ideally, three) different services each of which mance and distribution of data in the social network. is built with a different code base. (2) Users only need to have one This evaluation is carried out by using a social network account on the specific service of their choice. (3) Ideally, partici- simulation approach (Section 4.2). pants A, B, and C use their own sites (personal URLs). 12 S.Tramp et al. / An Architecture of a Distributed Semantic Social Network

4. User C, who follows user A, receives the photo: The workflow of this evaluation framework is de- User C is instantly provided with an update in picted in Figure 4. It has a pipeline architecture which her activity stream, informing her about the new is built by using three additional tools that help us to image. generate and manage the replay agent test data. In a 5. User C leaves a comment on the photo: This is first step, we use GRR (Generating Random RDF [4]) done in the same way as publishing the Tag. to generate the users’ profile and activities data. We 6. User A and user B are notified about the com- enhanced GRR by adding a post-generation and data ment: User A will be notified because her ping- adjustment component to it23. The generated data is back service informs her about this ping. User B loaded into the profile distributor which splits the data will be notified only if she has subscribed to the into separate user profiles, each with personal data and activity feed of the photo provided that it exists. activities. After splitting the data into single parts, the profile distributor passes each part on to a correspond- 22 SWAT1 is currently not finally defined , so an eval- ing replay agent. Upon call the replay agent instanti- uation can be a rough sketch only. The next SWAT ates a new OntoWiki-based DSSN node with the given level will require a few different use cases which intro- user profile. After the user profile has been loaded, and duce some new Social Web concepts. However, most because all user activities have timestamps, actions of of the user are satisfied already as a conse- users are simulated over time by using the received ac- quence of the fully distributed nature of the DSSN ar- tivity feed data. chitecture (e.g. data portability and social discovery). In order to generate the data as much as possible The more interesting user stories are: (1) The Private similar to real data, we decided to achieve an analysis content and Groups use cases will require a distributed of social network activities from our friends and col- ACL management. Some ideas on using WebIDs for leagues, and use the outcome of this analysis to gener- group ACL management are already published with ate artificial data. dgFOAF [15] and we think that this is a good starting point for further research. (2) The Social News use case 4.3. Quantitative Analysis of Social Network introduces a new vote activity. Since our architecture Activities applies schema agnostic social network protocols, this new type of activity can be communicated as any other The original raw input data for this analysis was activity. gathered from the Facebook accounts of several AKSW research group members. We created a desktop appli- 4.2. Evaluation Framework Architecture cation which utilizes the Facebook desktop API to ex- tract the data of one’s account. After a user logged into Our next aim was to evaluate the prototypical imple- this application by using her credentials, the applica- mentation. We decided to use a simulation approach tion extracted all of the user’s activities, friends and all in which activities are created artificially on the social activities from friends that were accessible by using network rather than arranging a user evaluation, which the API. The extracted data was fetched in JSON and strongly depends on the graphical user interface. In a later parsed and converted into a MySQL tables to sim- first step, we added an additional service for remote plify its analysis. The extracted data contains 6053 in- procedure calls to the OntoWiki DSSN node as well as dividual users. In a given time frame of a month, these an execution client (replay agent) which can execute users have generated 13682 different activities, each of activities remotely controlled and based on input activ- which was tagged with one of the following four pos- ity data in RDF. Then we designed and implemented sible types: status, link, video, music and photo. The an evaluation framework which is able to create ac- statistical analysis was performed upon the extracted tivity data randomly, but based on statistical parame- data to create a prototype of an average user. ters, and serve it to multiple instances of the OntoWiki Testing each type of activity for statistical distribu- DSSN node which use the data to simulate the social tion, showed that all types of activities follow normal network. distribution but with different values of mean and stan- dard deviation. Table 2 illustrates that normal distribu- tion parameters belong to each type of activity. 22Available online at http://www.w3.org/2005/ Incubator/federatedsocialweb/wiki/SWAT1_use_ cases (receive 29.07.2011). 23GRR+ is available at https://github.com/AKSW/GRR. S.Tramp et al. / An Architecture of a Distributed Semantic Social Network 13

1 2 3 4

GRR+ Generator Profile Data Network and activity config Activities Profile Distributor Replay Agents

Fig. 4. Workflow of the social network evaluation framework: (A) Based on a given qualitative and quantitative configuration of profiles and activities, a random knowledge base is generated. (B) The knowledge base is split up into a part for every single profile and distributed to a list of server machines. (C) For each profile, a replay agent is initialized which will create and update a specific profile as well as activities on a single social network node. (D) The social network node (in our case an OntoWiki instance) connects to other nodes as well as creates activities for the social network in the same way as it would be achieved under control of a human user.

Table 2 4.4. Data Generation and Testbed Configuration Distribution parameters as well as counts and percentages

Type Mean SD Count Percentage The data generation was split into one part for each user group, each with its own characteristics and tar- Status 215.3225 75.5157 6675 49% Link 114.1290 43.5291 3538 26% get numbers of resources. Table 4 lists some quantita- Photo 55.1935 18.8262 1711 13% tive characteristics of the generated data in the form of Video 53.9354 19.9063 1672 12% a SPARQL graph pattern. As can be noticed, the data Music 0.1612 0.4543 5 0% generated for testing represents a perfect value distri- bution based on the statistical analysis.

The analysis of the frequency of users’ activities per Table 4 month showed that the range of the number of activi- Quantitative characteristics of the generated social network data in ties per user is between 1 and 24 and the average num- the form of distinct SPARQL query counts ber is 2. We categorized users in terms of the frequency Graph pattern Distinct count of their activities into 3 groups which is shown in ta- ble 3. As can been seen, a high percentage of users are ?s a foaf:Person 100 in the inactive user category and a few users have a ?s a :Activity 1000 high number of posted activities. ?s :activityVerb :Post 500 ?s :activityVerb :Bookmark 200 Table 3 ?s :activityVerb :MediaContent 300 Different categories of users based on the frequency of their ?s foaf:knows ?o 1400 activities. ?s foaf:member ?o 100

Category Activities / Month Percentage

High 17-24 0.792 Medium 9-16 5.104 As input rules for data generation, GRR takes three Low 1-8 94.102 files with specific parameters: – A generator file that describes the rules for creat- We used this statistics to generate artificial data at ing the instances of classes a large scale. First, two pools of data, i.e. activity and – A sampler function input file that determines user pool, were generated. An activity pool was gener- URIs for classes as well as values for specific ated with respect to both proportion and distribution of predicates each type of activity, and the user pool is based on con- – A type-property mapping input file that maps the sidering the percentage of users in each category. At properties, which are described in the sampler last, activities were assigned to users through a random function input, to specific instances of different process. classes 14 S.Tramp et al. / An Architecture of a Distributed Semantic Social Network

Listing 6 shows an example generator file use for @prefix : . creating the instances. Lines 1-2 define namespaces 1 2 @prefix foaf: . used in the generated data - foaf for user profiles and 3 24 aair for activities . Lines 4-5 are used to generate 4 :SharonSantana a foaf:Person; 50 (in this example) foaf:Person as well as five 5 foaf:birthday "07-02"; foaf:Group instances. Finally, lines 7-10 define the 6 foaf:depiction ; rules to interlink person with group instances and cre- 7 foaf:firstName "Sharon"; ate several status note activities for each person. 8 foaf:knows :ChithiraJoy, :KikeGretzschel, As an additional feature of the data creation, we 9 :LadanPansare; reused the extracted given and family names from our 10 foaf:mbox ; data dump and created random names by combining 11 foaf:mbox_sha1sum "759...cc8"; 12 foaf:member :Organization4; them. A generated WebID profile is depicted in List- 13 foaf:surname "Santana". ing 7. Based on the person name, we also generated the mbox, mbox_sha1sum and depiction properties Listing 7: Generated example profile with object and from FOAF as well as the resource URI of the person. datatype properties from the FOAF vocabulary. The overall output file of the GRR generation is split into separate files each of which represents a single person and her activities and each of which is prepared backend for our experiments so each node got its own to be transferred to the replay agent. table space in a single MySQL server instance. The final run of the replay agents helped us to fix 4.5. Results and Discussion multiple issues with the communication between the nodes and details in the source code. With the de- We generated 100 stand-alone OntoWiki DSSN scribed setup, we did not reach a limit regarding the nodes on two different machines, each with its own scalability on a single machine. We did not expected separated dataspace. Since we expected that a single this outcome since the overall activity rate of each node does not need to handle too much RDF data, node was less than one activity per day, even in the we used the MySQL backend rather than the Virtuoso highest active user group. However, we think that the amount of managed triple per DSSN node would grow 24The Atom Activity Streams in RDF Vocabulary (AAIR) are an in a long-term experiment or real-live usage to a level outcome of the NoTube project available at http://notube.tv which is problematic to handle by a low-cost triple store such as the OntoWiki MySQL backend. To make a guess on incoming triple per year, we can assume namespace foaf: 1 that one activity is represented on average with seven 2 namespace aair: 3 triples and a single node can easily get 50 activities 4 create 50 { foaf:Person } per day. With these parameters, a single DSSN node 5 create 5 { foaf:Group } has a raw triple income of over 100k triple per year 6 only from incoming activities. However, we assume 7 for each { foaf:Person ?p1 } for 5-20 { foaf: Person ?p2 } where { FILTER( ?p1 != ?p2 ) that low-cost triple stores will advance over time in or- } connect { ?p1 foaf:knows ?p2 } der to allow the integration of semantic technologies in 8 for each { foaf:Person ?p1 } for 1-1 { foaf: today’s everyday-use software. Group ?g1 } connect { ?p1 foaf:member ?g1 } 9 for each { foaf:Person ?p1 } create 1-1 { aair:Activity ?a1 } connect { ?a1 aair: 5. Related Work activityActor ?p1 } 10 for each { aair:Activity ?a1 } create 1-1 { Various architectures for achieving a federated so- aair:Note ?n1 } connect { ?a1 aair: cial network were proposed and numerous projects activityObject ?n1 } based on those were developed with respect to a conve- Listing 6: Example of Generator file for GRR. nient and secure way for users. With the advent of the Semantic Web, research in this field was adapted for taking advantages of machine-readable data and on- S.Tramp et al. / An Architecture of a Distributed Semantic Social Network 15 tologies. We can roughly divide the related work into ging, instant messaging or pingback for Semantic Web. distributed social networks on Web 2.0 and distributed Thus, these adapted technologies can form the basis social networks on Semantic Web. for a new generation of social networks. SMOB is a semantic and distributed framework in- Distributed social networks on Web 2.0. There are troduced in [12]. It presents some main requirements many different views for tackling distributed social for using microblogging at a large scale, i.e. machine- network challenges on the way to a federation in the readable meta-data, decentralized architecture, open Web 2.0. A well-known view is network of networks, data as well as reusability and interlinking of data. which employs existing protocols and standards for These challenges have been addressed in SMOB by us- providing foundations based on which networks can ing RDF(a)/OWL data, distributed Hubs for exchang- easily communicate with each other. Rss, PubSubHub- ing information and a sync protocol (based on SPAR- bub, Webfinger, ActivityStreams, Salmon and OStatus QL/Update over HTTP) and interlinking components. are the common protocols used for this purpose. Some sparqlPuSH [11] utilizes the PubSubHubbub protocol of the projects developed using these technologies pro- to broadcast RDF query result updates. In this project, tocols are: StatusNet25, DiSo26, Elgg27, GNU Social28, a SPARQL query which is associated with the user is The Mine Project29, Appleseed30, OneSocialWeb31, at first created and monitored in an RDF triple store. BuddyPress32, Cliqset33, Gowalla34, Posterous35. Another view is using mashups. Mashups are web The registered user is notified whenever the result set applications that combine data from more than one ser- changes. vice provider to create new services. An example is the Beyond these activities, an infrastructure is neces- buddycloud project36 running an inbox server as a cen- sary to which these technologies can be employed. Our tralized manager over federated social networks. This current work is a pioneer effort in integrating current server aggregates all the posts and updates from so- technologies into a coherent architecture to form a dis- cial networks to which a user has subscribed. Here, the tributed social network in a semantic-based context. XMPP protocol [14] is used for messaging and feder- ation. Danube37 is a user-centric architecture in which in- 6. Conclusions and Future Work dividuals are responsible for personal data and rela- tions. It allows individuals to manage their relation- In this article we described our reference architec- ships with each other and with vendors. Another analo- ture and proof-of-concept implementation of a dis- gous view has been proposed by PrPl [16] as a person- tributed social network based on semantic technolo- centric, social-networking infrastructure, where a per- gies. Compared with the currently prevalent central- son’s data is logically collected in one place, and ized social networks this approach has a number of ad- social-networking applications can be executed in a vantages: distributed manner without a central service. – Privacy. Users of the DSSN can setup their own Distributed Social Networks on Semantic Web. Pri- DSSN node or chose a DSSN node provider with marily, attempts have been concentrated on the confor- particularly strict privacy rules in order to ensure mation of traditional technologies such as microblog- a maximum of privacy. – Data security. Due to the distributed nature it is

25http://status.net/ more difficult to steal large amounts of private 26http://diso-project.org/ data. Also, security is ensured through public re- 27http://elgg.org/ view and testing of open-standards and to a lesser 28http://foocorp.org/projects/social/ extend through obscurity due to closed propri- 29 http://themineproject.org/ etary implementations. 30http://opensource.appleseedproject.org/ 31http://onesocialweb.org/ – Data ownership. Users have full ownership and 32http://buddypress.org/ control over the use of their data. They are not re- 33http://cliqset.com/ stricted to ownership regulations imposed by their 34 http://gowalla.com/ social network provider. Instead DSSN users can 35https://posterous.com/ 36http://buddycloud.com/ implement fine-grained data licensing options ac- 37http://projectdanube.org cording to their needs. 16 S.Tramp et al. / An Architecture of a Distributed Semantic Social Network

– Extensibility. The representation of social net- data publication from relational databases. In Juan Quemada, work resources like WebIDs and data artefacts is Gonzalo León, Yoëlle S. Maarek, and Wolfgang Nejdl, editors, not limited to a specific schema and can grow Proceedings of the 18th International Conference on World 38 Wide Web, WWW 2009, Madrid, Spain, April 20-24, 2009, with the needs of the users . pages 621–630. ACM, 2009. – Reliability. Again due to the distributedness the [2] Sören Auer, Sebastian Dietzold, and Thomas Riechert. On- DSSN is much less endangered of breakdowns or toWiki - A Tool for Social, Semantic Collaboration. In Isabel F. cyberterrorism, such as denial-of-service attacks. Cruz, Stefan Decker, Dean Allemang, Chris Preist, Daniel – Freedom of communication. As we observed re- Schwabe, Peter Mika, Michael Uschold, and Lora Aroyo, edi- tors, The Semantic Web - ISWC 2006, 5th International Seman- cently during the Arab Spring social networks can tic Web Conference, ISWC 2006, Athens, GA, USA, Novem- play a crucial role in attaining and defeating civil ber 5-9, 2006, Proceedings, volume 4273 of Lecture Notes in liberties. A DSSN with a vast amount of nodes is Computer Science, pages 736–749, Berlin / Heidelberg, 2006. much less endangered of censorship as compared Springer. to centralized social networks. [3] Tim Berners-Lee. Linked Data - Design Issues. website. last change: 2009/06/18; retrieved: 2011/07/25. However, the work presented in this article can only [4] Daniel Blum and Sara Cohen. Grr: Generating Random RDF. be a first step of a larger research and development In Grigoris Antoniou, Marko Grobelnik, Elena Paslaru Bontas Simperl, Bijan Parsia, Dimitris Plexousakis, Pieter De Leen- agenda aiming at realizing a truly distributed social heer, and Jeff Z. Pan, editors, The Semanic Web: Research and network based on semantic technologies. Our imple- Applications - 8th Extended Semantic Web Conference, ESWC mentation, for example, based on the OntoWiki tech- 2011, Heraklion, Crete, Greece, May 29 - June 2, 2011, Pro- nology platform is currently only a proof-of-concept ceedings, Part II, volume 6644 of Lecture Notes in Computer implementation. For a widespread use, the usability, Science, pages 16–30. Springer, 2011. [5] John G. Breslin, Stefan Decker, Andreas Harth, and Uldis Bo- scalability and the multi-client capabilities have to be jars. SIOC: an approach to connect web-based communities. improved. Also, the distributed realization of social International Journal of Web Based Communities, 2(2):133– networking apps as implemented in centralized social 142, 2006. networks through APIs (e.g. Open Social) has to be [6] D. Brickley and L. Miller. FOAF Vocabulary Specifica- investigated. We also expect, that the combination of tion. Namespace Document 2 Sept 2004, FOAF Project, 2004. http://xmlns.com/foaf/0.1/. standards and protocols as described in this article will [7] Li Ding, Timothy W. Finin, Anupam Joshi, Rong Pan, R. Scott be implemented in a number of additional platforms Cost, Yun Peng, Pavan Reddivari, Vishal Doshi, and Joel (e.g. Wordpress, Elgg, Diaspora) thus making them Sachs. Swoogle: a search and metadata engine for the semantic first-class nodes of the DSSN. web. In David A. Grossman, Luis Gravano, ChengXiang Zhai, Otthein Herzog, and David A. Evans, editors, Proceedings of the 2004 ACM CIKM International Conference on Information and Knowledge Management, Washington, DC, USA, Novem- Acknowledgments ber 8-13, 2004, pages 652–659. ACM, 2004. [8] Maxwell Krohn, Alex Yip, Micah Brodsky, Robert Morris, and We would like to thank our colleagues from AKSW Michael Walfish. A World Wide Web Without Walls. In 6th research group (in particular Jonas Brekle and Nadine ACM Workshop on Hot Topics in Networking (Hotnets), At- lanta, GA, November 2007. Jänicke) for their helpful comments and inspiring dis- [9] S. Langridge and I. Hickson. Pingback 1.0. Tech- cussions during the development of this approach. This nical report, http://hixie.ch/specs/pingback/ work was partially supported by a grant from the Euro- pingback, 2002. pean Union’s 7th Framework Programme provided for [10] Michele Minno and Davide Palmisano. Atom Activity Streams the project LOD2 (GA no. 257943). RDF mapping. NoTube Project, 2010. http://xmlns. notu.be/aair/. [11] A. Passant and P. Mendes. sparqlPuSH: Proactive notifica- tion of data updates in RDF stores using PubSubHubbub. In References SFSW2010, 2010. [12] Alexandre Passant, John G. Breslin, and Stefan Decker. Re- [1] Sören Auer, Sebastian Dietzold, Jens Lehmann, Sebastian thinking Microblogging: Open, Distributed, Semantic. In Hellmann, and David Aumueller. Triplify: Light-weight linked Boualem Benatallah, Fabio Casati, Gerti Kappel, and Gustavo Rossi, editors, Web Engineering, 10th International Confer- ence, ICWE 2010, Vienna, Austria, July 5-9, 2010. Proceed- 38We already experimented with activities like git commits and ings, volume 6189 of Lecture Notes in Computer Science, comments on lines of source code – both usecases integrate very well pages 263–277. Springer, 2010. because of the schema-agnostic transport protocols and the Linked [13] Christoph Rieß, Norman Heino, Sebastian Tramp, and Sören Data paradigm. Auer. EvoPat – Pattern-Based Evolution and Refactoring of S.Tramp et al. / An Architecture of a Distributed Semantic Social Network 17

RDF Knowledge Bases. In Proceedings of the 9th Interna- back. In P. Cimiano and H.S. Pinto, editors, Proceedings of the tional Semantic Web Conference (ISWC2010), Lecture Notes EKAW 2010 - Knowledge Engineering and Knowledge Man- in Computer Science, Berlin / Heidelberg, 2010. Springer. agement by the Masses; 11th October-15th October 2010 - Lis- [14] P. Saint-Andre. Extensible Messaging and Presence Protocol bon, Portugal, volume 6317 of Lecture Notes in Artificial Intel- (XMPP): Core. Request for Comments 3920, IETF, October ligence (LNAI), pages 135–149, Berlin / Heidelberg, October 2004. http://www.ietf.org/rfc/rfc3920.txt. 2010. Springer. [15] F. Schwagereit, A. Scherp, and S. Staab. Representing Dis- [21] Sebastian Tramp, Norman Heino, Sören Auer, and Philipp tributed Groups with dgFOAF. In ESWC2010, pages 181–195, Frischmuth. RDFauthor: Employing RDFa for collaborative June 2010. Knowledge Engineering. In P. Cimiano and H.S. Pinto, edi- [16] Seok-Won Seong, Jiwon Seo, Matthew Nasielski, Debangsu tors, Proceedings of the EKAW 2010 - Knowledge Engineer- Sengupta, Sudheendra Hangal, Seng Keat Teh, Ruven Chu, ing and Knowledge Management by the Masses; 11th October- Ben Dodson, and Monica S. Lam. PrPl: a decentralized so- 15th October 2010 - Lisbon, Portugal, volume 6317 of Lecture cial networking infrastructure. In Proceedings of the 1st ACM Notes in Artificial Intelligence (LNAI), pages 90–104, Berlin / Workshop on Mobile Cloud Computing & Services: Social Heidelberg, October 2010. Springer. Networks and Beyond, MCS ’10, pages 8:1–8:8, New York, [22] Giovanni Tummarello, Renaud Delbru, and Eyal Oren. NY, USA, 2010. ACM. Sindice.com: Weaving the Open Linked Data. In Karl Aberer, [17] M. Sporny, S. Corlosquet, T. Inkster, H. Story, B. Harbu- Key-Sun Choi, Natasha Fridman Noy, Dean Allemang, Kyung- lot, and R. Bachmann-GmÃijr. WebID 1.0: Web identifica- Il Lee, Lyndon J. B. Nixon, Jennifer Golbeck, Peter Mika, tion and Discovery. Unofficial draft, August 2010. http: Diana Maynard, Riichiro Mizoguchi, Guus Schreiber, and //payswarm.com/webid/. Philippe Cudré-Mauroux, editors, The Semantic Web, 6th In- [18] H. Story, B. Harbulot, I. Jacobi, and M. Jones. FOAF+SSL: ternational Semantic Web Conference, 2nd Asian Semantic RESTful Authentication for the Social W. In SPOT2009, 2009. Web Conference, ISWC 2007 + ASWC 2007, Busan, Korea, [19] Henry Story, Andrei Sambra, and Sebastian Tramp. Friending November 11-15, 2007, volume 4825 of Lecture Notes in Com- On The Social Web. In Federated Social Web Europe 2011, puter Science, pages 552–565. Springer, 2007. Berlin June 3rd-5th 2011, 2011. [20] Sebastian Tramp, Philipp Frischmuth, Timofey Ermilov, and Sören Auer. Weaving a Social Data Web with Semantic Ping-