Design and Initial Implementation of a Decentralized Social Networking Application Using Apache Cassandra and Peer to Peer Data Transfers

Campbell Boswell and Rowen Felt, Advised by Peter Johnson
Middlebury College, Department of Computer Science
Summer 2018

Summary

Our objective for this summer’s work was to create the infrastructure to support a decentralized social networking application. A decentralized application has several benefits over a centralized system, the most important being data privacy, security, and platform independence. Centralized applications, such as Facebook and Instagram, profit by mining their users’ data for marketing information. In a decentralized system, however, users own their data and share the operating costs of the network by managing their own data stores and networking overhead, which eliminates the need for corporate ownership.

We began by reading a report produced by student researchers during our advisor’s previous summer research session (2017). This report presented the initial conceptual research for the project and included an analysis of user operations performed in a variety of social networking contexts and applications. After discussing the report and exploring several academic papers related to the goals of the project, we decided that the next step would be designing a distributed application oriented around an underlying distributed hash table. Our advisor left the specific design and behavior of the system for us to define and implement ourselves.

A distributed hash table (or DHT) is “a decentralized, distributed system that provides a lookup service very similar to a hash table”.1 Most DHTs describe a system of nodes connected in a ring-like graph such that each node is responsible for a given range of hashed keys.
Each node is aware of a certain number of other nodes throughout the ring and maintains some notion of the overall state of the network and distribution of keys. When a key-value pair is inserted into the DHT, a logarithmic-time search function similar to binary search allows the inserting node to locate the node responsible for the hashed key and pass it the value to be inserted. Looking up values by their corresponding keys works in the same manner, except that values are retrieved rather than inserted. Most DHTs include support for mirroring data on other nodes in case of system error and provide functionality for nodes to join and leave the network with minimal computational overhead. DHTs allow for consistent, reliable access to an amount of data that might be unmanageable on a single server.

There are several notable benefits of DHTs, including scalability, fault tolerance, and flexibility. Because each node maintains knowledge of only a constant or logarithmic number of other nodes, and because each node is responsible for only a portion of the overall data, a DHT can easily scale to thousands or millions of nodes and billions or trillions of data points. DHTs are also fault tolerant in that most systems allow nodes to join and leave the network without a substantial penalty in locating relevant data points. This fault tolerance is invaluable both in the context of a datacenter, where heavy traffic consistently leads to node failure, and in the context of nodes distributed across a wide user base, where nodes often disconnect due to network connectivity and maintenance. Many DHTs are also highly usable regardless of network topology or physical proximity. This flexibility has proven invaluable in previous iterations of decentralized social networking applications which also used DHTs.

1 https://en.wikipedia.org/wiki/Distributed_hash_table
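The ring layout and key lookup described above can be illustrated with a small sketch. This is not the authors' code and not how Cassandra or Chord routes requests: it is a toy, single-process model in which every node position is known globally, so the "logarithmic-time search" reduces to one binary search over a sorted list. The names `ring_hash` and `ToyRing` and the node names are hypothetical.

```python
import bisect
import hashlib


def ring_hash(key: str, bits: int = 32) -> int:
    """Map a string onto a position on a ring of size 2**bits."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class ToyRing:
    """Toy DHT ring: a key belongs to the first node at or after its position."""

    def __init__(self, node_names):
        # Sort nodes by ring position (hash collisions between node names
        # are ignored in this sketch).
        self._nodes = sorted((ring_hash(name), name) for name in node_names)
        self._positions = [pos for pos, _ in self._nodes]
        # One in-memory dict per node stands in for that node's local store.
        self._stores = {name: {} for name in node_names}

    def owner(self, key: str) -> str:
        """Binary-search the sorted positions, wrapping around the ring."""
        idx = bisect.bisect_left(self._positions, ring_hash(key))
        return self._nodes[idx % len(self._nodes)][1]

    def put(self, key: str, value) -> None:
        self._stores[self.owner(key)][key] = value

    def get(self, key: str):
        return self._stores[self.owner(key)].get(key)


if __name__ == "__main__":
    ring = ToyRing(["node-a", "node-b", "node-c"])
    ring.put("alice", "10.0.0.5")
    print(ring.owner("alice"), ring.get("alice"))
```

A real DHT distributes this table: each node knows only a few peers and forwards the request toward the owner, which is where the logarithmic hop count comes from.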
After two weeks of reading research publications concerning various DHTs and distributed system implementations, we decided to recreate the functionality of an existing social networking platform through our own design. We chose Instagram because its primary operations, such as making posts, writing comments, tagging users, and sending direct messages, seemed fairly straightforward. We began by defining Instagram’s user actions and decomposing them into computational operations on user profile objects and user content. At this point we decided that the differences between posts, comments, shares, tags, and messages are trivial enough that all of these data points can be abstracted into a single object class that we chose to call a dispatch.

The dispatch object is composed of all the fields needed to describe any of the above content, including the user id, image data, text, tags, user tags, audience, and a globally unique dispatch id. The dispatch object also contains the parent type and parent id fields to identify the dispatch as either a post or a comment. In the case of a post, the parent id would be the user id of the poster. In the case of a comment, the parent id would be the dispatch id of the original post being commented upon. The audience field can also be used to specify the type of dispatch as either a public post, direct message, or group message, in which case the audience field would be populated with the user ids of the relevant parties. Using this new abstraction, we were able to break down all user actions into operations on dispatch objects and user objects.

When the time came to implement our design, we had to decide which pre-existing DHT we would use to store global user identifiers and what kind of local database we would use to store user data.
Fortunately, the operations and measured efficiency of most DHTs are essentially equivalent, so we could choose an implementation based on the language we wanted to use and the support provided. We chose to write all of the server software for this application in C because this project presented a good opportunity to gain familiarity with the language and work with third-party C libraries. Due to this decision, we chose Apache Cassandra as our DHT for its extensive API and C driver support. We similarly chose MongoDB for our local database because of its notable speed, document-based flexibility, and substantial support for C drivers.

We spent the last four weeks of the summer coding our implementation. We started by writing C libraries for storing, retrieving, and updating user identification information in the Apache Cassandra database. We then wrote C libraries for insertion, deletion, and search methods on user and dispatch objects in the MongoDB database. The majority of this code was contained in methods which converted user and dispatch structs to BSON and then to JSON, and vice versa.

We then decided to write our own application-layer networking protocol to facilitate the peer-to-peer communication which would comprise the bulk of network overhead. We chose to make this protocol text-based for ease of testing and because the data being transferred between instances of MongoDB was conveniently stored in the text-based JSON format. We used these networking protocols to describe another layer of abstraction that more closely resembled real user actions. These protocols implemented behavior such as pushing a user object to a node or pulling all dispatch objects with a given field. On top of this layer we were able to build a server that responds to incoming requests and a client that reads protocol commands from a file.
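As a rough illustration of the dispatch abstraction and of a text-based protocol carrying JSON, the sketch below models a dispatch as a Python dataclass and wraps it in a one-line command. The field list follows the description above, but the field types, the `"user"`/`"dispatch"` values for the parent type, the convention that an empty audience means a public post, and the `PUSH dispatch <json>` command syntax are all assumptions made for this example; the source does not specify the actual struct layout or wire format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Dispatch:
    """One unit of user content: a post, comment, or message."""
    dispatch_id: str
    user_id: str
    parent_type: str  # assumed encoding: "user" for a post, "dispatch" for a comment
    parent_id: str    # poster's user id for a post, parent dispatch id for a comment
    text: str = ""
    image_data: Optional[str] = None  # assumed to be text-encoded (e.g. base64)
    tags: List[str] = field(default_factory=list)
    user_tags: List[str] = field(default_factory=list)
    audience: List[str] = field(default_factory=list)  # assumed: empty means public

    def is_comment(self) -> bool:
        return self.parent_type == "dispatch"


def encode_push(dispatch: Dispatch) -> str:
    """Serialize a dispatch as a single-line, hypothetical PUSH command."""
    body = json.dumps(asdict(dispatch), sort_keys=True)
    return f"PUSH dispatch {body}\n"


def decode_push(line: str) -> Dispatch:
    """Parse a line produced by encode_push back into a Dispatch."""
    verb, kind, body = line.rstrip("\n").split(" ", 2)
    if (verb, kind) != ("PUSH", "dispatch"):
        raise ValueError(f"unexpected command: {verb} {kind}")
    return Dispatch(**json.loads(body))


if __name__ == "__main__":
    post = Dispatch("d1", "u42", "user", "u42", text="hello", tags=["intro"])
    reply = Dispatch("d2", "u7", "dispatch", "d1", text="welcome!")
    for item in (post, reply):
        print(encode_push(item), end="")
```

Keeping each command on one line makes the protocol easy to read from a file and to inspect by hand, which matches the testing motivation given above.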
Our last project was writing Python wrapper functions that describe individual user actions, such as making a post, sending a message, or viewing a profile. These functions format the appropriate information as network protocol commands to be executed by the client process. We then wrote testing infrastructure that would randomly generate an arbitrary number of user actions with a given probability distribution and execute them sequentially.

While we were able to test a large number of inputs without system failure, we were unable to truly test the system for a variety of reasons. We were unable to acquire a dataset of user actions with the data we required, and we had no dataset representing performance standards for centralized social networks to which we could compare the performance of our system. Additionally, all of the nodes to which we had access were operating on a local area network, which is not an accurate representation of how the system would be deployed in the wild, and we had neither the time nor the resources to simulate a realistic network topology on the systems available to us.

Ultimately, our aim was to leave this project in a clean state with concise documentation so that others may continue the work in the future with relative ease. With that in mind, the system model, setup, API, testing infrastructure, and proposals for future work have been outlined below.

Reference past work (soup, reclaim)

We began our project by researching implementations of DHTs, peer-to-peer technologies, and past attempts at decentralized social networks (SOUP and ReClaim). Listed below are the publications we reviewed.

Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications, Ion Stoica, et al., 2001
Link: https://pdos.csail.mit.edu/papers/chord:sigcomm01/chord_sigcomm.pdf

Democratizing Content Publication with Coral, Michael J. Freedman, et al., 2004
Link: http://www.coralcdn.org/docs/coral-nsdi04.pdf

SOUP: An Online Social Network by the People, for the People, David Koll, et al., 2014
Link: https://dl.acm.org/citation.cfm?id=2663324

HyperDex: A Distributed, Searchable Key-Value Store, Robert Escriva, et al., 2012
Link: http://conferences.sigcomm.org/sigcomm/2012/paper/sigcomm/p25.pdf