A Peer-to-peer XML Database

Thomas Risse, Predrag Knezevic Fraunhofer IPSI Dolivostrasse 15 64293 Darmstadt Germany {risse, knezevic}@ipsi.fhg.de

© Fraunhofer IPSI 1

Introduction

4Short history of the Internet and P2P systems • DNS, Usenet • Moving to /server model 4Trends • Moore’s law in the real life • More freedom from the infrastructure

© Fraunhofer IPSI 2

Seite 1

1 Current Status

Systems 4Content sharing • , , , eDonkey,OverNet 4Content storing • OceanStore, , Past, GNUNet Drawbacks 4Working on the file level 4File not writable after storing 4Poor searching capabilities

© Fraunhofer IPSI 3

Motivation

A P2P database could be useful as 4A basis for P2P applications • Resource sharing • Data sharing 4An ad-hoc storage in business processes and workflows 4A distributed discovery service 4A storage for sensor and ad-hoc networks 4A distributed index for the Web 4A service for Grid computing

© Fraunhofer IPSI 4

Seite 2

2 Example 1: Content Sharing

KaZaA, Gnutella or Sharing of metadata about songs similar… (author, title, location,…)

© Fraunhofer IPSI 5

Example 2: Ad-hoc Storage for Web Services

© Fraunhofer IPSI 6

Seite 3

3 Example 4: Sensor Networks

Measuring current, gas, water spending

© Fraunhofer IPSI 7

Related Work

4Distributed databases • XPRS, Gamma, Teradata, Tandems NonStopSQL 4Distributed filesystems • NFS, AFS, xFS 4Drawbacks: • Made for stable, well connected environments • Crashed node eventually replaced • Global system view

© Fraunhofer IPSI 8

Seite 4

4 Related Word (contd.)

4Existing P2P storages • OceanStore, Freenet, GNUNet, Freehaven 4Drawbacks: • Coarse-grained granularity • Lack of relationships among stored objects • Updating difficulties • Limited querying possibilities

© Fraunhofer IPSI 9

Proposed Solution

4XML documents are spread in the community 4Peers store only document parts 4Documents are modified by the community during the system run-time 4P2P XML datastore mimics DOM interface 4Every XML document has a tree representation 4Tree structures can be represented using hash table structures 4Distributed tree -> DHT

© Fraunhofer IPSI 10

Seite 5

5 P2P Datastore challenges

The same like in classical databases 4Durability 4Consistency 4Reliability 4Concurrency and transcations 4Security 4Scalability

© Fraunhofer IPSI 11

Tree Operations

4Creating a root node • Who will create the root? 4Getting a node • Figuring out what is the current value 4Updating a node • Propagate changes to all replicas 4Adding a node 4Removing a node

Node reference is DHT key

© Fraunhofer IPSI 12

Seite 6

6 DOM Particularities

Serialized object (XML) Serialized object (DOM) <song> Title Name Author Name <author> <publisher> </author> <publisher> Publisher Name </publisher> </song> Title Name Author Name Publisher Name</p><p>4Serialized objects are DOM sub-trees 4Managing all nodes separately has drawbacks: • Complicated undo • Objects spread across many peers • Objects consume larger portion of key space</p><p>© Fraunhofer IPSI 13</p><p>Querying</p><p>4XPath and XQuery use DOM for accessing XML documents 4It is possible to apply them directly on the top of proposed storage 4Index structures are needed to get decent performances</p><p>© Fraunhofer IPSI 14</p><p>Seite 7</p><p>7 System Architecture</p><p>Applications</p><p>Query Engine</p><p>P2P-DOM Index Manager</p><p>DHT Abstraction Layer</p><p>Local Local storage DHT storage</p><p>Network Layers</p><p>© Fraunhofer IPSI 15</p><p>Summary</p><p>Peer-to-peer is a hype today</p><p>Peer-to-Peer can be a powerfull architecture 4Scalability, availability, flexibility, ...</p><p>But we need a more sophisticated data managment 4Proposed P2P DOM for datastorage</p><p>© Fraunhofer IPSI 16</p><p>Seite 8</p><p>8</p> </div> </article> </div> </div> </div> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.6.1/jquery.min.js" crossorigin="anonymous" referrerpolicy="no-referrer"></script> <script> var docId = '6edc86bee1984105e0e26deefb51ac0d'; var endPage = 1; var totalPage = 8; var pfLoading = false; window.addEventListener('scroll', function () { if (pfLoading) return; var $now = $('.article-imgview .pf').eq(endPage - 1); if (document.documentElement.scrollTop + $(window).height() > $now.offset().top) { pfLoading = true; endPage++; if (endPage > totalPage) return; var imgEle = new Image(); var imgsrc = "//data.docslib.org/img/6edc86bee1984105e0e26deefb51ac0d-" + endPage + (endPage > 3 ? ".jpg" : ".webp"); imgEle.src = imgsrc; var $imgLoad = $('<div class="pf" id="pf' + endPage + '"><img src="/loading.gif"></div>'); $('.article-imgview').append($imgLoad); imgEle.addEventListener('load', function () { $imgLoad.find('img').attr('src', imgsrc); pfLoading = false }); if (endPage < 5) { adcall('pf' + endPage); } } }, { passive: true }); if (totalPage > 0) adcall('pf1'); </script> <script> var sc_project = 11552861; var sc_invisible = 1; var sc_security = "b956b151"; </script> <script src="https://www.statcounter.com/counter/counter.js" async></script> </html>