Graph Databases Comparison: Allegrograph, Arangodb, Infinitegraph, Neo4j, and Orientdb

Total Page:16

File Type:pdf, Size:1020Kb

Graph Databases Comparison: Allegrograph, Arangodb, Infinitegraph, Neo4j, and Orientdb Graph Databases Comparison: AllegroGraph, ArangoDB, InfiniteGraph, Neo4J, and OrientDB Diogo Fernandes1 and Jorge Bernardino1,2 1Polytechnic of Coimbra - ISEC, Rua Pedro Nunes, Coimbra, Portugal 2CISUC – Centre for Informatics and Systems of the University of Coimbra, Portugal Keywords: Graph databases, NoSQL databases, AllegroGraph, ArangoDB, InfiniteGraph, Neo4J, OrientDB. Abstract: Graph databases are a very powerful solution for storing and searching for data designed for data rich in relationships, such as Facebook and Twitter. With data multiplication and data type diversity there has been a need to create new storage and analysis platforms that structure irregular data with a flexible schema, maintaining a high level of performance and ensuring data scalability effectively, which is a problem that relational databases cannot handle. In this paper, we analyse the most popular graph databases: AllegroGraph, ArangoDB, InfiniteGraph, Neo4J and OrientDB. We study the most important features for a complete and effective application, such as flexible schema, query language, sharding and scalability. 1 INTRODUCTION model relationships between objects. In this context, a graph is basically a structure, which is represented Nowadays, data volume is growing exponentially in by nodes, or also called vertices (the entities), by social networks, such as Facebook and Twitter, edges (the relations) that are the lines that connect which store and process massive amounts of data the various nodes and by properties (attributes). daily, where this data reaches petabytes of storage. Therefore, the graph databases can simply be Current relational databases are predominant in the described as a way of representing and storing data, market, but their adaptability is poor in processing using their structures: nodes, edges, and properties. connected data. Actual platforms must deal with For that reason, graph databases are optimized for huge amounts of data and related information. For storing and querying graphs. this reason, a special type of databases designated by The problem with graph databases is that they graph databases have appeared. are not particularly efficient in all desired operations A graph database is a non-relational database such as in the representation of data that are derived that provides an effective and efficient solution for from relational models. Therefore, they do not the storage of information in current scenarios, replace relational databases but are in fact an where data is increasingly interconnected. The efficient solution when dealing with huge volumes storage mechanisms of graph databases are of data that contain many related data. optimized for graphing, for the way they store The focus of this paper is the study of the main adjacent records linked by direct references. In this characteristics, advantages and use cases of graph adjacency list, each vertex maintains references to databases. We study five of the most popular graph their adjacent vertices, forming an index species for databases: AllegroGraph, ArangoDB, InfiniteGraph, the vertices on neighbourhood. This property is Neo4J and OrientDB. known as index-free adjacency (Robinson et al., The rest of this paper is structured as follows. 2015). Interest in graph models has been increasing Section 2 describes the advantages and uses of graph in recent years due to their applications in areas like databases. Sections 3 to 7 describe AllegroGraph, Semantic Web and Social Network Analysis ArangoDB, Infinite Graph, Neo4J and OrientDB, (Dietrich et al., 2008). This type of database is easy respectively. Section 8 presents the comparison of to understand because its concept is based on the the five graph databases. Finally, Section 9 presents theory of graphs. This theory is basically based on the main conclusions and future work. graphs, which are mathematical structures used to 373 Fernandes, D. and Bernardino, J. Graph Databases Comparison: AllegroGraph, ArangoDB, InfiniteGraph, Neo4J, and OrientDB. DOI: 10.5220/0006910203730380 In Proceedings of the 7th International Conference on Data Science, Technology and Applications (DATA 2018), pages 373-380 ISBN: 978-989-758-318-6 Copyright © 2018 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved DATA 2018 - 7th International Conference on Data Science, Technology and Applications 2 MAIN ADVANTAGES AND contrast, graph database performance stays constant even as our data grows year after year. USES OF GRAPH DATABASES Flexibility, because IT and data architect teams move at the speed of business because the structure Graph databases are getting popular as data and schema of a graph model adjusts itself as increases in scalability and relationships become the applications and industries change. Rather than first choice over the relational databases. Ironically, exhaustively modeling a domain ahead of time, data legacy relational database management systems teams can add to the existing graph structure without (RDBMS) are poor at handling relationships endangering current functionality. between data points. Their tabular data models and Finally, agility because developing with graph rigid schemas make it difficult to add new or databases aligns perfectly with today’s agile, test- different kinds of connections. Graphs are the future. driven development practices, allowing graph Not only do graph databases effectively store the database to evolve in step with the rest of the relationships between data points, but they are also application and any changing business requirements. flexible in adding new kinds of relationships or However, as any new technology replacing old adapting a data model to new business requirements technology, there are still obstacles in adopting (Webber et al, 2015). Graph databases did not see a graph databases. One is that there are fewer qualified greater advantage over relational databases until developers in the job market than the SQL recent years, when frequent schema changes, developers. Another is the non-standardization of the managing enormous volume of data, real-time query graph database query language (Wu, 2017). response time and more intelligent data activation In the next sections, we analyze five of the most requirements, make people realize the advantages of popular graph databases. the graph model. This technology is disrupting many areas, such as supply chain management, e-commerce recommendations, security, fraud detection and 3 AllegroGraph many other areas in advanced data analytics. The main advantages of a graph database are the AllegroGraph is a modern, consistent and persistent following (Guia et al., 2017) and (Neo4J, 2018): Resource Description Framework (RDF) graph database with high performance that is currently in . Better optimization on gathering information use in open source, commercial, and US Department than relational databases; of Defense projects. AllegroGraph is characterized . ACID rules support; by the efficient use of memory by combining disk . Support of the data storage in the order of storage, making it possible to scale up to one billion petabytes (1015); nodes, always maintaining top performance. Allow for new types of data; Basically, it provides services including vision . Easily expandable due to its structural model - building, rapid prototyping and proof-of-concept graph model; development, complete enterprise technology . Suitable for complex and irregular data, solution stack, and best practices to maximize value normally involved in real world; from semantic technologies. Excellent for data mining operations; AllegroGraph provides an architecture through . A high performance while querying deep data the REST protocol, which is an architectural style when compared to relational databases; that consists of a coordinated set of constraints . There is no need to declare the data type for the applied to components, connectors, and data nodes or edges, unlike relational model; elements within a distributed system. Developers . They are very agile in development, since they have developed adapters for the various supported can be easily adapted over time; languages, Sesame Java, Sesame Jena, Python, using . Combination of multiple dimensions to manage proprietary Sesame and Lisp signatures. big data, including time series, geo-dimensions, AllegroGraph’s competitive advantages are the and hierarchies on different dimensions. following (Ranking of Graph DBMS, 2018): In short, the three key advantages are: . Suited to support adhoc queries through performance, flexibility, and agility. SPARQL, Prolog and languages like Performance, because with traditional databases JavaScript; relationship queries will come to a grinding halt as . Sorted quintuple indices that will index every the number and depth of relationships increase. In primary and non-primary field so users never 374 Graph Databases Comparison: AllegroGraph, ArangoDB, InfiniteGraph, Neo4J, and OrientDB have to worry about whether a certain field is tolerance, huge amount of storage memory and indexed or not; lower cost than other databases. This software has . It is possible to mix Geospatial, Temporal, two versions: Community and Enterprise editions. Social Network Analytics, and Reasoning, all The ArangoDB cluster architecture is a master / in the same query (SPARQL or Prolog); master CP (Consistent and Partition) (Mehra, 2017) . Triple Level Security with Security Filters; model without a single point of failure. With "CP", . Gruff - Graph Visualization,
Recommended publications
  • Practical Parallel Hypergraph Algorithms
    Practical Parallel Hypergraph Algorithms Julian Shun [email protected] MIT CSAIL Abstract v While there has been signicant work on parallel graph pro- 0 cessing, there has been very surprisingly little work on high- e0 performance hypergraph processing. This paper presents v0 v1 v1 a collection of ecient parallel algorithms for hypergraph processing, including algorithms for betweenness central- e1 ity, maximal independent set, k-core decomposition, hyper- v2 trees, hyperpaths, connected components, PageRank, and v2 v3 e single-source shortest paths. For these problems, we either 2 provide new parallel algorithms or more ecient implemen- v3 tations than prior work. Furthermore, our algorithms are theoretically-ecient in terms of work and depth. To imple- (a) Hypergraph (b) Bipartite representation ment our algorithms, we extend the Ligra graph processing Figure 1. An example hypergraph representing the groups framework to support hypergraphs, and our implementa- , , , , , , and , , and its bipartite repre- { 0 1 2} { 1 2 3} { 0 3} tions benet from graph optimizations including switching sentation. between sparse and dense traversals based on the frontier size, edge-aware parallelization, using buckets to prioritize processing of vertices, and compression. Our experiments represented as hyperedges, can contain an arbitrary number on a 72-core machine and show that our algorithms obtain of vertices. Hyperedges correspond to group relationships excellent parallel speedups, and are signicantly faster than among vertices (e.g., a community in a social network). An algorithms in existing hypergraph processing frameworks. example of a hypergraph is shown in Figure 1a. CCS Concepts • Computing methodologies → Paral- Hypergraphs have been shown to enable richer analy- lel algorithms; Shared memory algorithms.
    [Show full text]
  • Graph Database for Collaborative Communities Rania Soussi, Marie-Aude Aufaure, Hajer Baazaoui
    Graph Database for Collaborative Communities Rania Soussi, Marie-Aude Aufaure, Hajer Baazaoui To cite this version: Rania Soussi, Marie-Aude Aufaure, Hajer Baazaoui. Graph Database for Collaborative Communities. Community-Built Databases, Springer, pp.205-234, 2011. hal-00708222 HAL Id: hal-00708222 https://hal.archives-ouvertes.fr/hal-00708222 Submitted on 14 Jun 2012 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Graph Database For collaborative Communities 1, 2 1 Rania Soussi , Marie-Aude Aufaure , Hajer Baazaoui2 1Ecole Centrale Paris, Applied Mathematics & Systems Laboratory (MAS), SAP Business Objects Academic Chair in Business Intelligence 2Riadi-GDL Laboratory, ENSI – Manouba University, Tunis Abstract Data manipulated in an enterprise context are structured data as well as un- structured data such as emails, documents, social networks, etc. Graphs are a natural way of representing and modeling such data in a unified manner (Structured, semi-structured and unstructured ones). The main advantage of such a structure relies in the dynamic aspect and the capability to represent relations, even multiple ones, between objects. Recent database research work shows a growing interest in the definition of graph models and languages to allow a natural way of handling data appearing.
    [Show full text]
  • ROBRA – the Snake Robot
    International Journal of Computer Science Trends and Technology (IJCST) – Volume 7 Issue 3, May - Jun 2019 RESEARCH ARTICLE OPEN ACCESS ROBRA – The Snake Robot James M.V, Nandukrishna S, Noyal Jose, Ebin Biju Department of Computer Science and Engineering Toc H Institute of Science & Technology, Ernakulam Kerala – India ABSTRACT Robra - A snake-like robot which, provides the locomotion of a real snake. The shape and size of the robot depends on application. Robra can avoid obstacles by receiving signals from sensors and move with flexibility on terrain surfaces. Multiple joints increases its degree of freedom. It can quickly explore and navigate foreign environments and detect potential dangers for direct human involvement. Robra uses PIR() to detect presence of humans in an environment like civilians trapped in collapsed buildings during natural disasters. It can move through crevices and uneven surfaces. It detects obstacles by using Ultrasonic range movement sensors. This provides noncontact distance between the obstacles to detect and avoid collision with them. Electro chemical sensors are used for detection of poisonous gases. Robra has an onboard camera which records and send the video data in real time via WiFi. Robra has microphones for the survivors to communicate and inform their status incase of rescue aid. Thus Robra covers wide real time applications like surveillance, rescue aid etc. Keywords:- Robra, Snake, act have a wingspan almost 10 feet, giving it an ape-like I. INTRODUCTION appearance. In the past two decades it is estimated that disasters are Momaro - This robot has been specifically designed by the responsible for about 3 million deaths worldwide, 800million team NimbRo Rescue from the University of Bonn in people adversely affected.
    [Show full text]
  • Table of Content
    TABLE OF CONTENT 1. What is Joomla………………………………………………………………. 2 2. Download and Install Joomla………………………………………………. 3 2.1 Pre-installation Check………………………………………………….... 3 2.2 Configuration …………………………………………………………… 4 3. Creating Content……………………………………………………………....9 3.1 Create a Category…………………………………………………………9 3.2 Create Article………………………………………………………………10 4. Modules……………………………………………………………………….. 12 5. How to Show Position for Template?....................................................... 14 6. Menu……………………………………………………………………………16 6.1 How to Create Menu?.........................................................................16 7. User……………………………………………………………………………. 18 7.1 Users Groups…………………………………………………………….. 18 7.2 Add New User……………………………………………………………..19 1 1. What is Joomla? Joomla is a free system for creating websites. It is an open source project, which, like most open source projects, is constantly in motion. It has been extremely successful for seven years now and is popular with millions of users worldwide. The word Joomla is a derivative of the word Joomla from the African language Swahili and means "all together". The project Joomla is the result of a heated discussion between the Mambo Foundation, which was founded in August 2005, and its then-development team. Joomla is a development of the successful system Mambo. Joomla is used all over the world for simple homepages and for complex corporate websites as well. It is easy to install, easy to manage and very reliable. The Joomla team has organized and reorganized itself throughout the last seven years to better meet the user demands. 2 2. Download and Install Joomla “Where and what to download?” “How to install?” In order to install Joomla! on your local computer, it is necessary to set up your "own internet", for which you'll need a browser, a web server, a PHP environment and as well a Joomla supported database system.
    [Show full text]
  • Franz's Allegrograph 7.1 Accelerates Complex Reasoning Across
    Franz’s AllegroGraph 7.1 Accelerates Complex Reasoning Across Massive, Distributed Knowledge Graphs and Data Fabrics Distributed FedShard Queries Improved 10X. New SPARQL*, RDF* and SHACL Features Added. Lafayette, Calif., February 8, 2021 — Franz Inc., an early innovator in Artificial Intelligence (AI) and leading supplier of Graph Database technology for AI Knowledge Graph Solutions, today announced AllegroGraph 7.1, which provides optimizations for deeply complex queries across FedShard™ deployments – making complex reasoning up to 10X faster. AllegroGraph with FedShard offers a breakthrough solution that allows infinite data integration through a patented approach unifying all data and siloed knowledge into an Entity-Event Knowledge Graph solution for Enterprise scale analytics. AllegroGraph’s unique capabilities drive 360-degree insights and enable complex reasoning across enterprise-wide Knowledge Fabrics in Healthcare, Media, Smart Call Centers, Pharmaceuticals, Financial and much more. “The proliferation of Artificial Intelligence has resulted in the need for increasingly complex queries over more data,” said Dr. Jans Aasman, CEO of Franz Inc. “AllegroGraph 7.1 addresses two of the most daunting challenges in AI – continuous data integration and the ability to query across all the data. With this new version of AllegroGraph, organizations can create Data Fabrics underpinned by AI Knowledge Graphs that take advantage of the infinite data integration capabilities possible with our FedShard technology and the ability to query across
    [Show full text]
  • Using Microsoft Power BI with Allegrograph,Knowledge Graphs Rise
    Using Microsoft Power BI with AllegroGraph There are multiple methods to integrate AllegroGraph SPARQL results into Microsoft Power BI. In this document we describe two best practices to automate queries and refresh results if you have a production AllegroGraph database with new streaming data: The first method uses Python scripts to feed Power BI. The second method issues SPARQL queries directly from Power BI using POST requests. Method 1: Python Script: Assuming you know Python and have it installed locally, this is definitely the easiest way to incorporate SPARQL results into Power BI. The basic idea of the method is as follows: First, the Python script enables a connection to your desired AllegroGraph repository. Then we utilize AllegroGraph’s Python API within our script to run a SPARQL query and return it as a Pandas dataframe. When running this script within Power BI Desktop, the Python scripting service recognizes all unique dataframes created, and allows you to import the dataframe into Power BI as a table, which can then be used to create visualizations. Requirements: 1. You must have the AllegroGraph Python API installed. If you do not, installation instructions are here: https://franz.com/agraph/support/documentation/current/p ython/install.html 2. Python scripting must be enabled in Power BI Desktop. Instructions to do so are here: https://docs.microsoft.com/en-us/power-bi/connect-data/d esktop-python-scripts a) As mentioned in the article, pandas and matplotlib must be installed. This can be done with ‘pip install pandas’ and ‘pip install matplotlib’ in your terminal.
    [Show full text]
  • Arangodb Is Hailed As a Native Multi-Model Database by Its Developers
    ArangoDB i ArangoDB About the Tutorial Apparently, the world is becoming more and more connected. And at some point in the very near future, your kitchen bar may well be able to recommend your favorite brands of whiskey! This recommended information may come from retailers, or equally likely it can be suggested from friends on Social Networks; whatever it is, you will be able to see the benefits of using graph databases, if you like the recommendations. This tutorial explains the various aspects of ArangoDB which is a major contender in the landscape of graph databases. Starting with the basics of ArangoDB which focuses on the installation and basic concepts of ArangoDB, it gradually moves on to advanced topics such as CRUD operations and AQL. The last few chapters in this tutorial will help you understand how to deploy ArangoDB as a single instance and/or using Docker. Audience This tutorial is meant for those who are interested in learning ArangoDB as a Multimodel Database. Graph databases are spreading like wildfire across the industry and are making an impact on the development of new generation applications. So anyone who is interested in learning different aspects of ArangoDB, should go through this tutorial. Prerequisites Before proceeding with this tutorial, you should have the basic knowledge of Database, SQL, Graph Theory, and JavaScript. Copyright & Disclaimer Copyright 2018 by Tutorials Point (I) Pvt. Ltd. All the content and graphics published in this e-book are the property of Tutorials Point (I) Pvt. Ltd. The user of this e-book is prohibited to reuse, retain, copy, distribute or republish any contents or a part of contents of this e-book in any manner without written consent of the publisher.
    [Show full text]
  • Practical Semantic Web and Linked Data Applications
    Practical Semantic Web and Linked Data Applications Common Lisp Edition Uses the Free Editions of Franz Common Lisp and AllegroGraph Mark Watson Copyright 2010 Mark Watson. All rights reserved. This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works Version 3.0 United States License. November 3, 2010 Contents Preface xi 1. Getting started . xi 2. Portable Common Lisp Code Book Examples . xii 3. Using the Common Lisp ASDF Package Manager . xii 4. Information on the Companion Edition to this Book that Covers Java and JVM Languages . xiii 5. AllegroGraph . xiii 6. Software License for Example Code in this Book . xiv 1. Introduction 1 1.1. Who is this Book Written For? . 1 1.2. Why a PDF Copy of this Book is Available Free on the Web . 3 1.3. Book Software . 3 1.4. Why Graph Data Representations are Better than the Relational Database Model for Dealing with Rapidly Changing Data Requirements . 4 1.5. What if You Use Other Programming Languages Other Than Lisp? . 4 2. AllegroGraph Embedded Lisp Quick Start 7 2.1. Starting AllegroGraph . 7 2.2. Working with RDF Data Stores . 8 2.2.1. Creating Repositories . 9 2.2.2. AllegroGraph Lisp Reader Support for RDF . 10 2.2.3. Adding Triples . 10 2.2.4. Fetching Triples by ID . 11 2.2.5. Printing Triples . 11 2.2.6. Using Cursors to Iterate Through Query Results . 13 2.2.7. Saving Triple Stores to Disk as XML, N-Triples, and N3 . 14 2.3. AllegroGraph’s Extensions to RDF .
    [Show full text]
  • Practical Parallel Hypergraph Algorithms
    Practical Parallel Hypergraph Algorithms Julian Shun [email protected] MIT CSAIL Abstract v0 While there has been significant work on parallel graph pro- e0 cessing, there has been very surprisingly little work on high- v0 v1 v1 performance hypergraph processing. This paper presents a e collection of efficient parallel algorithms for hypergraph pro- 1 v2 cessing, including algorithms for computing hypertrees, hy- v v 2 3 e perpaths, betweenness centrality, maximal independent sets, 2 v k-core decomposition, connected components, PageRank, 3 and single-source shortest paths. For these problems, we ei- (a) Hypergraph (b) Bipartite representation ther provide new parallel algorithms or more efficient imple- mentations than prior work. Furthermore, our algorithms are Figure 1. An example hypergraph representing the groups theoretically-efficient in terms of work and depth. To imple- fv0;v1;v2g, fv1;v2;v3g, and fv0;v3g, and its bipartite repre- ment our algorithms, we extend the Ligra graph processing sentation. framework to support hypergraphs, and our implementations benefit from graph optimizations including switching between improved compared to using a graph representation. Unfor- sparse and dense traversals based on the frontier size, edge- tunately, there is been little research on parallel hypergraph aware parallelization, using buckets to prioritize processing processing. of vertices, and compression. Our experiments on a 72-core The main contribution of this paper is a suite of efficient machine and show that our algorithms obtain excellent paral- parallel hypergraph algorithms, including algorithms for hy- lel speedups, and are significantly faster than algorithms in pertrees, hyperpaths, betweenness centrality, maximal inde- existing hypergraph processing frameworks.
    [Show full text]
  • Property Graph Vs RDF Triple Store: a Comparison on Glycan Substructure Search
    RESEARCH ARTICLE Property Graph vs RDF Triple Store: A Comparison on Glycan Substructure Search Davide Alocci1,2, Julien Mariethoz1, Oliver Horlacher1,2, Jerven T. Bolleman3, Matthew P. Campbell4, Frederique Lisacek1,2* 1 Proteome Informatics Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland, 2 Computer Science Department, University of Geneva, Geneva, 1227, Switzerland, 3 Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva, 1211, Switzerland, 4 Department of Chemistry and Biomolecular Sciences, Macquarie University, Sydney, Australia * [email protected] Abstract Resource description framework (RDF) and Property Graph databases are emerging tech- nologies that are used for storing graph-structured data. We compare these technologies OPEN ACCESS through a molecular biology use case: glycan substructure search. Glycans are branched Citation: Alocci D, Mariethoz J, Horlacher O, tree-like molecules composed of building blocks linked together by chemical bonds. The Bolleman JT, Campbell MP, Lisacek F (2015) molecular structure of a glycan can be encoded into a direct acyclic graph where each node Property Graph vs RDF Triple Store: A Comparison on Glycan Substructure Search. PLoS ONE 10(12): represents a building block and each edge serves as a chemical linkage between two build- e0144578. doi:10.1371/journal.pone.0144578 ing blocks. In this context, Graph databases are possible software solutions for storing gly- Editor: Manuela Helmer-Citterich, University of can structures and Graph query languages, such as SPARQL and Cypher, can be used to Rome Tor Vergata, ITALY perform a substructure search. Glycan substructure searching is an important feature for Received: July 16, 2015 querying structure and experimental glycan databases and retrieving biologically meaning- ful data.
    [Show full text]
  • Preview Orientdb Tutorial
    OrientDB OrientDB About the Tutorial OrientDB is an Open Source NoSQL Database Management System, which contains the features of traditional DBMS along with the new features of both Document and Graph DBMS. It is written in Java and is amazingly fast. It can store 220,000 records per second on commodity hardware. In the following chapters of this tutorial, we will look closely at OrientDB, one of the best open-source, multi-model, next generation NoSQL product. Audience This tutorial is designed for software professionals who are willing to learn NoSQL Database in simple and easy steps. This tutorial will give a great understanding on OrientDB concepts. Prerequisites OrientDB is NoSQL Database technologies which deals with the Documents, Graphs and traditional database components, like Schema and relation. Thus it is better to have knowledge of SQL. Familiarity with NoSQL is an added advantage. Disclaimer & Copyright Copyright 2018 by Tutorials Point (I) Pvt. Ltd. All the content and graphics published in this e-book are the property of Tutorials Point (I) Pvt. Ltd. The user of this e-book is prohibited to reuse, retain, copy, distribute or republish any contents or a part of contents of this e-book in any manner without written consent of the publisher. We strive to update the contents of our website and tutorials as timely and as precisely as possible, however, the contents may contain inaccuracies or errors. Tutorials Point (I) Pvt. Ltd. provides no guarantee regarding the accuracy, timeliness or completeness of our website or its contents including this tutorial. If you discover any errors on our website or in this tutorial, please notify us at [email protected].
    [Show full text]
  • Application of Graph Databases for Static Code Analysis of Web-Applications
    Application of Graph Databases for Static Code Analysis of Web-Applications Daniil Sadyrin [0000-0001-5002-3639], Andrey Dergachev [0000-0002-1754-7120], Ivan Loginov [0000-0002-6254-6098], Iurii Korenkov [0000-0002-8948-2776], and Aglaya Ilina [0000-0003-1866-7914] ITMO University, Kronverkskiy prospekt, 49, St. Petersburg, 197101, Russia [email protected], [email protected], [email protected], [email protected], [email protected] Abstract. Graph databases offer a very flexible data model. We present the approach of static code analysis using graph databases. The main stage of the analysis algorithm is the construction of ASG (Abstract Source Graph), which represents relationships between AST (Abstract Syntax Tree) nodes. The ASG is saved to a graph database (like Neo4j) and queries to the database are made to get code properties for analysis. The approach is applied to detect and exploit Object Injection vulnerability in PHP web-applications. This vulnerability occurs when unsanitized user data enters PHP unserialize function. Successful exploitation of this vulnerability means building of “object chain”: a nested object, in the process of deserializing of it, a sequence of methods is being called leading to dangerous function call. In time of deserializing, some “magic” PHP methods (__wakeup or __destruct) are called on the object. To create the “object chain”, it’s necessary to analyze methods of classes declared in web-application, and find sequence of methods called from “magic” methods. The main idea of author’s approach is to save relationships between methods and functions in graph database and use queries to the database on Cypher language to find appropriate method calls.
    [Show full text]