Masaryk University Faculty of Informatics

Using GraphQL for Content Delivery in Kentico Cloud

Bachelor’s Thesis

David Čechák

Brno, Fall 2017

This is where a copy of the official signed thesis assignment and a copy of the Statement of an Author is located in the printed version of the document.

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

David Čechák

Advisor: Bruno Rossi, PhD

Acknowledgement

I would like to thank my advisor Bruno Rossi, PhD, for his helpful and proactive suggestions, and the Kentico company for the opportunity to work on this thesis with their guidance, especially Jiří Kusák for advice regarding the practical part of the thesis. I also owe my gratitude to my family and friends for their support.

Abstract

The goal of this thesis is to research GraphQL technology as a possible alternative to Kentico Cloud’s Delivery API. The current API is based on RESTful principles. The thesis is divided into four parts. The first part explains the current state of the Delivery API and how Kentico Cloud’s data are structured. The second part compares REST and GraphQL concepts and explores possibilities of incorporating a GraphQL API. It also considers approaches that could handle both. The next chapter describes the implementation of a proof-of-concept solution, which is based on the previous analysis. The last chapter breaks down the conducted measurement and presents its results.

Keywords

GraphQL, content delivery, JavaScript, Node.js, Kentico Cloud Delivery, cloud-first headless CMS

Contents

1 Introduction

2 How Kentico Cloud delivers content
  2.1 CMS introduction
    2.1.1 Managed cloud hosting
    2.1.2 Platform as a Service
    2.1.3 Software as a Service
    2.1.4 Headless CMS
  2.2 Kentico Cloud
  2.3 Data description
  2.4 Current storage
  2.5 Database data structure
    2.5.1 Content item
    2.5.2 Content type and taxonomy
  2.6 Delivery API

3 Analysis
  3.1 REST
    3.1.1 Six constraints
    3.1.2 Limitations
  3.2 GraphQL
    3.2.1 Schema
    3.2.2 Introspection
    3.2.3 Operations
    3.2.4 Adjustability
    3.2.5 GraphiQL
  3.3 GraphQL and REST comparison
    3.3.1 Get the entire entity
    3.3.2 Versioning
    3.3.3 Endpoints architecture
  3.4 Possible database models
    3.4.1 NoSQL
    3.4.2 Graph-based
    3.4.3 Multi-model database
  3.5 Azure Cosmos DB
    3.5.1 DocumentDB
    3.5.2 GraphDB
    3.5.3 DocumentDB API and to query GraphDB
  3.6 Selected solution

4 Implementation
  4.1 Description of used technologies
    4.1.1 Javascript with Node.js
    4.1.2 Package manager npm
  4.2 How to use the application
    4.2.1 Prerequisites
    4.2.2 Startup instructions
    4.2.3 Inline fragments
  4.3 Drawbacks

5 Measurement
  5.1 Selection of measurement metrics and test samples
  5.2 Results
    5.2.1 Batching queries

6 Conclusion
  6.1 Summarization of the work accomplished
  6.2 Possible further improvements and steps required for a successful integration

Bibliography

A Attached content

List of Tables

5.1 Delivery API dataset details.
5.2 GraphQL - whole item dataset details.
5.3 GraphQL - partial item dataset details.
5.4 Test results (columns in ms stand for response time).

List of Figures

2.1 Top level fields of a content item.
2.2 Fields of "elements" field part 1.
2.3 Fields of "elements" field part 2.
3.1 An example GraphQL query.
4.1 A query sent to GraphQL server using Postman application.
4.2 A query with inline fragments.
4.3 A response to the query with inline fragments.
5.1 GraphQL query in disjunctive form.

1 Introduction

Software development methodologies, as well as software architectures, are continually evolving. The modern software development environment calls for a more flexible and customer-oriented approach. Agile methodologies are taking over the software development world at the expense of older methods, for example the waterfall model. Fast pivoting of a product based on new discoveries made during development is necessary to keep up with the competition.

Another factor is that the Internet is expanding into a wide variety of devices such as smartphones, tablets, Internet of Things (IoT) devices and more. The data flows through diverse network infrastructures. Mobile networks are common, and with them comes a need to minimize the users’ consumption of data, for financial and performance reasons.

In light of these facts the idea for GraphQL emerged. Mobile applications in particular suffer from decreasing performance with increasing complexity. GraphQL development started at Facebook as a solution to the increasing complexity of their mobile application and the other issues mentioned above [1].

Kentico Cloud is a CMS. It provides users with its RESTful Delivery API. This thesis aims to examine GraphQL as an alternative to it. Its goal is to find out what adjustments would be necessary for a successful integration of GraphQL. Chapter 3 contains an analysis of possible solutions and approaches to this endeavor, with a decision on how a proof-of-concept solution should be implemented. Chapter 4 breaks down the concepts used in the implementation and puts together a basic guide explaining how to use it. This solution is tested in Chapter 5, which lists the metrics used for the experiment and explains why these metrics were selected.

Kentico Cloud is used by other applications as a source of content. It is a layer upon which these applications depend, and their performance is highly influenced by it. Therefore it is important to optimize it as much as possible.

2 How Kentico Cloud delivers content

2.1 CMS introduction

A content management system (CMS) is a computer application for developers, marketers, content creators and other people involved in the production of websites, online stores, intranets or community sites. It makes it easier for them to develop the site, manage the workflow of content publication, and manage user roles and their access levels during the process [2, 3]. As this thesis revolves around a cloud CMS, the following sections itemize the ways to move CMS functionality into the cloud.

2.1.1 Managed cloud hosting

A CMS installed in the cloud has the same maintenance needs as any other CMS installation. To make the product more appealing, the vendor can take care of all the upgrades, hotfixes, security, backups and other issues, mostly manually. This is called managed cloud hosting. It lacks flexibility, as the user typically has to contact the vendor and ask for new changes to be deployed to their production environment [4].

2.1.2 Platform as a Service

These problems can be solved using highly automated environments with self-service configuration and deployment. This model is called Platform as a Service (PaaS). Unfortunately, this approach also has its weaknesses. Users do not have full control over the hosting environment. In addition, they still have to test whether their website is broken after every upgrade [4].

2.1.3 Software as a Service

Kentico Cloud CMS is an application built as Software as a Service (SaaS). As such it has some noticeable benefits over the models mentioned before. The application runs on a server and is completely maintained by the providing company. Users access it through a web browser. They do not have to worry about updating their CMS application, as they would with a classic CMS installed on their computers. It is also called cloud-first¹. Chiefly, it runs as a multi-tenant SaaS service, meaning all the users run the same version of the application and the provider manages only one standardized environment. This makes the provider’s job a lot easier [4].

2.1.4 Headless CMS

With a traditional, rigid CMS, users often have to rewrite their code into a format defined by the CMS. Modifying the code requires users to invest additional time, and the application template they have to follow restricts them. This can be solved by using a headless CMS (API-first CMS). This architecture cuts off the presentation layer, leaving the front-end part to the user, and only provides the content for it through its application programming interface (API). Basically, it is an API to retrieve, work with and display data to populate a website or mobile application. The API makes the content available on any of the various devices that are used nowadays (smartphones, virtual reality, IoT). Users can write their website or mobile application in any programming language and they can use their own development process. They only have to replace their static code with the dynamic data obtained using the API. Thus using the CMS does not require them to change the structure of their code, as happens in non-headless models. Other advantages are security, because users have to take care of the security of their own code only, scalability, and the fact that the application lifecycle is not influenced by the CMS. Also, the website can handle traffic smoothly because the cloud service does most of the work [4].

2.2 Kentico Cloud

Kentico Cloud is a headless CMS delivered as Software as a Service. It is aimed mainly at digital agencies and their clients. As such it allows a team to collaborate on the production of structured content in the cloud. For certain content items there is a content type defined, which gives structure to the items. A workflow is used to move items through workflow steps, which makes it easier to manage the tasks that have to be done before the items are published. It is possible to comment on items during content creation for better communication with others. It also supports versioning [5].

As a headless CMS it provides the Delivery API for developers to expose the created content on websites, mobile applications or other devices. There are many supported languages and SDKs [6].

Kentico Cloud also offers a way to track customers on clients’ websites. It makes it possible to collect information such as which pages the users visited or which forms they filled in. Based on this, it puts together a complete history of activities for every individual user. This knowledge can then be used to personalize the website experience. It can also be exported to other systems [5].

1. The software was built for the cloud from the start.

2.3 Data description

There are many components that users can utilize to manage their content in Kentico Cloud, but only some of them are retrievable through the Delivery API. A project is a structure containing all the content elements that belong to the same venture. It is divided into Content inventory, Content models and Workflow. In both Content inventory and Content models, users influence what the content and its structure look like.

Content models comprise content types, taxonomy groups and terms, and sitemap locations. The structure of a content type is defined by smaller units called content type elements. These can be self-describing with a single value (Text, Number, Date and time, URL slug) or more complex (Asset, Modular content, Multiple choice, Rich text, Taxonomy).

The value of a URL slug is automatically derived from the name of a content item and can be changed manually. There can be only one URL slug in a content type. If the content item’s type does not use a URL slug element, the value is an empty string. An Asset serves as a placeholder for uploading files, for example images or documents, and holds the necessary information about the files. Modular content elements are references to modular items that are a part of this item or connected with it. They are stored as an array of the items’ codenames. Taxonomy elements are tags that help the user label items and classify the content by similarities such as themes or target audience. Rich text is appropriate for longer text with user-defined formatting. It is possible to use headings, paragraphs, lists, tables, strong text and text with emphasis. The text can contain images, links and modular content. Multiple choice is a list of options to choose from. It is represented either by radio buttons, in which case only a single option can be selected, or by check boxes, in which case more options can be selected [7, 8].

Content inventory is the place to create and modify content items, whose form depends on the models defined in content types. The shape of every group of content items is determined by the content type they belong to [7].

The Workflow section serves for creating and managing workflow steps. By labeling content items with these steps, the user can refine the list of content items in the Content inventory [9].

2.4 Current storage

For the purpose of storing data Kentico Cloud uses Azure Cosmos DB, more precisely a subpart of it, DocumentDB. DocumentDB is a document-based database. In document-oriented storage a stored unit is an independent object with an ID and everything related encapsulated together with it. This provides better performance, as the data of a record is read from the disk continuously. On the other hand, to retrieve or update a single value of a record it is necessary to get the whole record, which negatively affects performance. What is known as a table in a relational database is here referred to as a collection, and records within a single collection can have different structures.

It depends on the concrete implementation of the database, but generally an item is semi-structured data, mostly in JSON or XML format. There are only a few restrictions on the stored data, which means basically any structure can form a document, and that makes it a very flexible choice. A document may consist of any variety of nested types and substructures (e.g. a document inside a document). Migrations are facilitated by the absence of a database hierarchy or schema. This schema-less approach brings a lot of freedom for a developer. It can ease the job, as a document maps directly to objects in modern object-oriented languages, but it also creates space for more mistakes, as the developer can accidentally (or intentionally) insert non-matching data [10, 11, 12].

2.5 Database data structure

2.5.1 Content item

A content item as a JSON object consists of a group of attributes. Every content item received from the Delivery API includes a system field with basic information about the retrieved content item. The user-defined properties, which are also the properties that need to be delivered to the user in response to an API request, are the content item data and modular content. Content item data are composed of two objects. The first is a system object. Its structure is demonstrated in an example of the top-level fields of a content item in Figure 2.1. The id is a unique identifier of the content item itself. The codename is generated from its display name. The attribute last_modified is formatted as an ISO 8601 date/time [13].

Figure 2.1: Top level fields of a content item.

The second is an elements object. It is a collection of elements which all share three general fields. Type defines the element’s category and can contain any of the types listed in Section 2.3, Data description. Name is self-explanatory. Value depends on the type of the element. An example showing everything the "elements" object can consist of is in Figure 2.2 and Figure 2.3. The split into two images does not represent any difference between the two groups. It only serves better orientation in the text.

Figure 2.2: Fields of "elements" field part 1.

Figure 2.3: Fields of "elements" field part 2.

Modular content carries a list of codenames that belong to the content items used in Rich text and Modular content elements [14].

Fields whose names start with an underscore are not user-defined data. They are system-generated properties important for the inner functions of DocumentDB. Nevertheless, some of them can be useful for the customers of the database. For example, the value of the property _self is a relative URI of the resource. It can be used for simpler future access to this resource [15].

Then there are other fields not included in the elements or system objects. The id, search_metadata and dependencies fields are important only for Kentico internal processes. The id of the record is composed of the project id, item id and language id connected with semicolons, where the item id itself is to be found in the system object.
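The composition of the record id can be sketched as follows. This is a minimal illustration in Python, not Kentico’s implementation. The function names and the sample values are hypothetical, and the sketch assumes that none of the three parts contains a semicolon.

```python
def compose_record_id(project_id, item_id, language_id):
    """Join the three parts with semicolons, in the order given in the text."""
    return ";".join([project_id, item_id, language_id])


def parse_record_id(record_id):
    """Split a record id back into its parts; assumes no part contains ';'."""
    project_id, item_id, language_id = record_id.split(";")
    return {
        "project_id": project_id,
        "item_id": item_id,
        "language_id": language_id,
    }


# Hypothetical sample values, for illustration only.
record_id = compose_record_id("project-1", "item-42", "en-US")
parts = parse_record_id(record_id)
```

Parsing is the inverse of composing only as long as the separator never appears inside a part, which is why the assumption is stated explicitly.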

2.5.2 Content type and taxonomy

Like a content item, every content type has a system object with the necessary information about it and an elements object. The system field is slightly different in a content type, though. Here it only has the id, name, codename and last_modified attributes. They behave the same as described for a content item. There are also differences in the elements object. Every unit of the elements object has only its name and type; there are no value fields or similar properties. However, there are two exceptions. The taxonomy type has its taxonomy_group, and the multiple_choice type has an array of options. Apart from this, there is a new field object_type with the value "ContentType".

A taxonomy group, like both models described before, contains a system object. This object has the same structure as in a content type. Unlike the other two models, a taxonomy has no elements field. Instead it has a collection of term objects. A term has a list of taxonomically descendant terms.

2.6 Delivery API

Through the Delivery API users can retrieve content from the projects they are managing in Kentico Cloud. It is a read-only API that returns data in JSON and is built with REST principles in mind. It is important to examine the API for the comparison with the GraphQL alternative in the following chapters [16].

To fetch any items of their project, users have to know the project ID. The contents of a project retrievable through the Delivery API are content items, content types and taxonomies. All of these are individually identifiable by their codenames. It is possible to filter multiple objects using more sophisticated queries. For example, one can specify codenames of a content type, sitemap or taxonomy. All of these identifiers can be found in the Kentico Cloud application [17].

There are various filtering options available. It is possible to define system attributes and content elements as filter parameters for a query. All of the system fields except language may be used. It is necessary to specify them as system.. The same method applies for the elements, thus the query argument is given as elements.. It is also possible to connect several query parameters with the ampersand character. This narrows the filtering even more, because the join merges the parameters with a logical conjunction. Therefore only records fulfilling all of the restrictions are returned [18, 19].

For comparing filter parameters the user can work with generally known operators. The system supports comparison operators: equal, lower than, greater than, lower than or equal, greater than or equal, and an inclusive range of values. It also supports the "in" operator, which takes an array of values as an argument and determines whether the value of the processed field is in the specified array. In addition there are array filters, which are available on the fields sitemap locations in system, and multiple choice and taxonomy in the modular content. The operator "contains" filters out every array that does not include the given value. The next two are similar, except that they accept an array as an argument. An object passes a filter with the "any" operator if at least one of the given arguments is inside its array field. The operator "all" works the same way, with the difference that all of the arguments have to be in the array field [18].

Ordering can be used on the same fields that can be targeted by filtering. That means the user can sort by attributes of the system and elements objects. The ordering takes as arguments a field to be used for sorting and a descending or ascending modifier [20].
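The semantics of the array operators and of joining several filters can be illustrated with a small in-memory sketch. It does not reproduce the URL syntax of the Delivery API. All function names are hypothetical, and the sample items are made up and simplified so that element values sit directly under their codename.

```python
def get_path(item, path):
    """Follow a dotted path such as 'system.type' into nested dictionaries."""
    value = item
    for key in path.split("."):
        value = value[key]
    return value


def equals(value):
    return lambda field: field == value


def contains(value):
    # Passes when the array field includes the single given value.
    return lambda field: value in field


def any_of(values):
    # Passes when at least one of the given values is in the array field.
    return lambda field: any(v in field for v in values)


def all_of(values):
    # Passes only when every given value is in the array field.
    return lambda field: all(v in field for v in values)


def apply_filters(items, filters):
    """Keep items satisfying every (path, predicate) pair: a logical conjunction."""
    return [
        item for item in items
        if all(predicate(get_path(item, path)) for path, predicate in filters)
    ]


# Made-up, simplified sample data.
ITEMS = [
    {"system": {"codename": "a", "type": "article"}, "elements": {"topics": ["rest", "graphql"]}},
    {"system": {"codename": "b", "type": "article"}, "elements": {"topics": ["graphql"]}},
    {"system": {"codename": "c", "type": "note"}, "elements": {"topics": ["rest", "graphql"]}},
]

result = apply_filters(ITEMS, [
    ("system.type", equals("article")),
    ("elements.topics", all_of(["rest", "graphql"])),
])
codenames = [item["system"]["codename"] for item in result]
```

Because the two filters are combined with a conjunction, only the item that is an article and carries both topics remains in the result.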

3 Analysis

3.1 REST

Representational State Transfer (REST) is an architectural style for distributed hypermedia systems which imposes certain constraints on web services and their properties [21]. It is based on and built around the HyperText Transfer Protocol (HTTP), Uniform Resource Identifiers (URI) and HyperText Markup Language (HTML). It started as a model of how web applications should work [22], introduced by Roy Thomas Fielding in his dissertation published in 2000.

If the restrictions of the RESTful design are applied together, benefits arise in the following areas. The design increases the scalability of component interactions and allows for independent deployment of components. It reduces interaction latency. Interfaces become more general. It enforces security. It encapsulates legacy systems. It is widely used in today’s world of web services. The next part summarizes its main principles.

Resources are identified by URIs, and multiple URIs can point to the same resource. In addition, a resource can have multiple representations, and different shapes of data can be returned based on the resource and the items in the request. Representations are exchanged between a client and a server. They are typically JSON or XML, but others are possible. When a client holds a representation, it has enough data to modify the resource. HTTP verbs are used to express the client’s intentions and to manipulate data. They are essentially GET, PUT, POST and DELETE [21].

3.1.1 Six constraints

The uniform interface constraint demands using URIs as resource names and HTTP verbs as actions to be taken on the resources, and receiving an HTTP response that informs whether the action was successful and/or returns data.

Statelessness means that every message has enough context to be self-descriptive and for the server to process it. Thus the meaning of each message does not depend on the state of the conversation between a client and a server. The server should not hold any of the client’s state. If there is any state, it is held on the client side.

Any information should be cacheable.

The architecture is client-server. A client has to assume that there will not always be a direct connection to the database or server. The uniform interface is the link between them. A client knows only the methods to communicate and the representation of the response.

The client has to be aware that when talking to a RESTful service, it is talking to a layered system. This means there could be intermediaries, in the form of software or hardware, involved in the communication between the client, the server and the database. Their purpose could be, for example, to relieve the database of workload by caching.

An optional constraint, Code on Demand, means that logic (executable code) can be transferred from the server to the client [21].

3.1.2 Limitations

REST only describes the software engineering principles of how a server should expose data to its clients. It is difficult to verify whether APIs actually obey these rules. Therefore, many APIs that use a resource-based approach but do not meet all REST suggestions could be falsely labeled as RESTful [23, 24, 25]. With the emergence of agile methodologies, RESTful APIs are not able to keep up with the flexibility clients require. Other REST limitations are evident from the comparison of REST and GraphQL in Section 3.3.

3.2 GraphQL

GraphQL is a data query language for clients to query application servers and a runtime to fulfill those queries with data. It is an open-source project developed by Facebook and publicly released in 2015. The initiative was started due to the insufficiency of the RESTful approach to delivering resources and a need to think about data in different representations. During the development of mobile applications, a demand for lighter and more flexible data querying emerged. GraphQL gives the client the ability to specify, in a declarative manner, only the data it needs, for example an array of article names and their publication dates. In response it receives the data it asked for and nothing more. The format used for the request is similar to JSON without values [26].

GraphQL is also useful for transferring workload from the client to the server. It supports nested arguments for every field and embedded sub-objects, which is beneficial for combining multiple API fetches. It is even possible to pass arguments to scalar fields. Through them the client can influence the representation in which it gets the data back, or ask the server to execute a certain transformation on the data, for example to use one of various date formats [27].

GraphQL is a specification of a server-side runtime environment for executing data queries. It is not an architectural style imposing constraints, like REST. It is not a programming language, but a declarative language to query application servers and a language for these servers to define their behavior in processing those queries. Accordingly, these application servers are not obliged to use any specific storage system, programming language or exposed data format, nor any particular technology at all. Therefore, it can be used with any library or on any platform [28].

As all the data GraphQL transfers in queries and responses are kept in a payload, it does not enforce the use of a particular transport protocol for the communication between client and server. Accordingly, it can work over the Hypertext Transfer Protocol (HTTP), which is the most common choice, but also over protocols like the User Datagram Protocol (UDP) or the Transmission Control Protocol (TCP) [29]. When operating over HTTP, GraphQL should handle POST and GET requests. Entities are not identified by URLs as in REST. Therefore, a server usually provides a single URL (endpoint) [28].

As only some of the operations GraphQL is capable of are important for this thesis, this section focuses on explaining only the relevant syntax.
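To make the idea of a query carried in a payload more concrete, the following sketch builds the body of an HTTP POST request in Python. It assumes the widely used convention of a JSON body with a "query" key and an optional "variables" key, which the text above does not prescribe. The field names articles, name and published are hypothetical and only mirror the example of article names and publication dates. No request is actually sent.

```python
import json

# The query lists only the wanted fields; it looks like JSON without values.
QUERY = """
{
  articles {
    name
    published
  }
}
"""


def build_post_body(query, variables=None):
    """Serialize a GraphQL query into a JSON request body (assumed convention)."""
    body = {"query": query}
    if variables is not None:
        body["variables"] = variables
    return json.dumps(body)


body = build_post_body(QUERY)
```

The same body would be sent to the single endpoint regardless of which entities the query asks for, which is the main contrast with one URL per resource in REST.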

3.2.1 Schema

Every GraphQL API defines a schema. That is basically a set of types, where a type is an object with all its properties that are available to be queried. It has the shape of a JSON object without the field values. The schema can then be queried with a request whose payload contains an object consisting only of fields that are also specified in the schema. In other words, it can be the same object or a sub-object.
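The relation between a query and the schema can be sketched with nested dictionaries. This is a simplified model for illustration, not how a GraphQL library represents schemas. The type and field names are hypothetical, scalar fields are marked by a type name string, and a selection maps scalar fields to None and object fields to a nested selection.

```python
# Hypothetical schema: object fields are dictionaries, scalars are type names.
SCHEMA = {
    "article": {
        "title": "String",
        "published": "String",
        "author": {"name": "String", "email": "String"},
    }
}


def is_valid_selection(selection, schema):
    """Return True when the selection is a sub-object of the schema."""
    for field, sub_selection in selection.items():
        if field not in schema:
            return False  # the field is not defined in the schema
        expected = schema[field]
        if isinstance(expected, dict):
            # Object fields need a non-empty nested selection of their own.
            if not sub_selection or not is_valid_selection(sub_selection, expected):
                return False
        elif sub_selection is not None:
            return False  # scalar fields cannot be selected into
    return True


valid = is_valid_selection(
    {"article": {"title": None, "author": {"name": None}}}, SCHEMA
)
unknown_field = is_valid_selection({"article": {"rating": None}}, SCHEMA)
```

The recursion mirrors the statement above: a request is accepted when it is the same object as the schema or a sub-object of it.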

3.2.2 Introspection

A valuable feature of a schema is its introspection system. Introspecting a schema basically means asking the system what queries can be performed. It is an alternative to schema documentation. Through introspection a user can browse all the object types that are defined and accessible on the GraphQL API, their descriptions and how they are structured. The type description is documentation given in the API’s code. Enum values, type interfaces and more can be introspected as well. Introspection is accessible by querying the fields __schema and __type from the root of a query [28].

3.2.3 Operations

To understand GraphQL operations it is first necessary to go through some basic terms. An operation can be a query, a mutation or a subscription. As the Delivery API is read-only, queries are sufficient for the purpose of this thesis. Queries are used by a client to acquire data from a GraphQL server. A query follows the structure of a schema. It starts by specifying a type. The type has a selection set, which is a set of fields. These fields are basically a group of other types. The selection set is what makes GraphQL recursive, and it represents the data required from the server. A field may have a set of arguments attached to it. Arguments have the shape of key-value pairs. The structure that a selection set can branch into is defined by the schema. In other words, a query is composed as a form of sub-object of the schema. The fields defined in the query form a subset of the schema [30]. Figure 3.1 shows an example of a simple GraphQL query.

Figure 3.1: An example GraphQL query.

3.2.4 Adjustability

There are many possibilities for the API developer to implement additional, more complex constructions and in this way modify GraphQL as desired. As GraphQL is an open-source project, there are many discussions among developers about which features should and should not be included. Some community demands lead to updates of GraphQL projects [31, 32]. However, the administrators are conservative about changes. Therefore some supplementary features are provided by the users themselves in standalone npm packages [33].

3.2.5 GraphiQL

GraphiQL is an in-browser IDE for GraphQL. It is highly interactive and ideal for exploring GraphQL capabilities [34] and developing new features. It is provided from the express- module.

3.3 GraphQL and REST comparison

GraphQL and REST differ in how systems implemented according to their rules behave and in what features they emphasize. Beyond that, the terms are difficult to compare, because they describe concepts that are not only different in characteristics, but of completely distinct kinds. GraphQL is a specification of a technology, whereas REST is an architectural style. Therefore, this section looks at the comparison problem from different angles and presents various comparison approaches. Some of the differences are already explained in the description of the terms in Section 3.1 and Section 3.2.

3.3.1 Get the entire entity

The advantage of GraphQL is the possibility of directly specifying the required fields. However, from another point of view this can also bring some difficulties. If a client requires a whole entity from a REST API, it is considered a basic request and there is commonly a dedicated endpoint for it. If the client queries a GraphQL API and wants it to fulfill the same request, there is no other option than to specify all of the entity’s fields. This leads to tedious work in cases where the object has an enormous number of fields.

3.3.2 Versioning

In REST it is common to have groups of developers dependent on different API versions. For example, the version number can be specified in a URL or an HTTP header. Versioning is necessary for every change that could possibly be a breaking change. With every breaking change, a new version has to be introduced.

Both REST and GraphQL could be versioned in this way. However, for GraphQL this is unnecessary for new features, as the client specifies what data it receives. It is sufficient to only add a new queryable field and keep the old fields the same. Alternatively, a new format of a field can be selected by its argument. Therefore users who do not need the new capabilities may not even notice.

A field that would be deleted in a new version is marked as @deprecated. It is then possible to track the frequency of its use through the resolve function of the concrete type or field, and the field can eventually be deleted once no one is using it anymore. Accordingly, a GraphQL server will essentially represent the same version and remain at the same address. Although REST also offers the possibility of deprecating fields and introducing new ones, GraphQL makes it easier for developers [35, 36, 37].
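Tracking the use of a deprecated field through its resolve function can be sketched as a wrapper around the resolver. The sketch is library-independent and written in Python. The names track_deprecated, deprecated_usage and the two article fields are hypothetical, and a real server would more likely log the calls than keep a counter in memory.

```python
from collections import Counter

# Counts how often each deprecated field has been resolved.
deprecated_usage = Counter()


def track_deprecated(field_name, resolver):
    """Wrap a resolve function so that every call is counted."""
    def wrapped(source, *args, **kwargs):
        deprecated_usage[field_name] += 1
        return resolver(source, *args, **kwargs)
    return wrapped


def resolve_published(article):
    return article["published"]


# The old field name stays queryable but is tracked; the new one is not.
resolvers = {
    "published": resolve_published,
    "publish_date": track_deprecated("Article.publish_date", resolve_published),
}

article = {"published": "2017-11-01"}
old_value = resolvers["publish_date"](article)
new_value = resolvers["published"](article)
```

Once the counter for a deprecated field stops growing over a long enough period, the field is a candidate for removal without introducing a new API version.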

3.3.3 Endpoints architecture

This subsection first defines the terms over-fetching and under-fetching. Both are problems of receiving a different amount of information than needed, and they are among the most common complications with REST. They occur because API endpoints serve a fixed data structure that does not always match the client's request.

Over-fetching happens when the server's response contains additional data which the client did not ask for and will not use. It consumes more bandwidth than the requested data would need and adds unnecessary load to the user's network. Under-fetching happens when one API call is not sufficient to get the required information about an entity, which leads the client to make more calls to receive all the data. Thus, more potentially avoidable HTTP requests have to be sent.

Both REST and GraphQL commonly work over HTTP. REST APIs expose an arrangement of URLs, one for every resource. GraphQL exposes its data differently, with a single endpoint for all of the resources and server capabilities.

To illustrate, suppose an application needs to display the titles of a few of the most recent articles written by a specific user, followed by the top comment for each and his or her name. A typical REST solution would be to query the user endpoint to get the article IDs, then fetch the articles with a query to another endpoint, and finally get the comments for every article. For every query, the server response would typically contain all the information about a particular entity, because the responses have a fixed data structure. At this point, however, the client needs only one field from each entity. Redundant information would therefore be delivered to the client application for it to process; it would over-fetch. A possible solution would be to adjust the API to expose the data in a more convenient way for this particular case. But the form of data the front-end needs may change quickly during the iteration cycles of application evolution, especially in today's environment of agile software development or while experimenting with various features. That would require the back-end to adapt every time this happens.

A GraphQL API usually has only a single endpoint used for every data query. All the required data are specified in the body of a single HTTP request. When the server gets this request, it processes it and queries a database, another server, or a combination of these to get the necessary data. The data then pass through the resolve functions attached to every queryable field to be shaped into their final form. After that, they are served to the user in the body of a response.

GraphQL is efficient for an agile approach to application development. There is no demand on the backend to adjust endpoints or create new ones whenever it would be beneficial for the frontend. This allows swift iterations of the frontend, and therefore faster development cycles and less wasteful work. Backend developers do not have to build specialized endpoints or compromise on performance by generalizing and merging more individual endpoints into fewer comprehensive ones. Frontend developers have more flexibility. Less coordination is needed between both groups, and they also save time otherwise spent discussing these compromises [23, 35, 37].
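The difference between a fixed-structure response and a response limited to the requested fields can be illustrated with a small sketch. The entity, its field names and the helper selectFields are hypothetical; the helper only imitates in plain JavaScript what a GraphQL selection set expresses declaratively.

```javascript
// Hypothetical sketch of field selection; entity and field names are made up.
// A fixed-structure REST response returns the whole entity.
const articleFromRest = {
  id: 42,
  title: 'GraphQL in practice',
  body: 'A long article body that the list view never displays.',
  authorId: 7,
  tags: ['api', 'graphql'],
};

// Returns a copy of entity that contains only the requested fields.
function selectFields(entity, requestedFields) {
  const result = {};
  for (const field of requestedFields) {
    if (field in entity) {
      result[field] = entity[field];
    }
  }
  return result;
}

const selected = selectFields(articleFromRest, ['title']);

// Compare the serialized payload sizes in bytes.
const fullSize = Buffer.byteLength(JSON.stringify(articleFromRest));
const selectedSize = Buffer.byteLength(JSON.stringify(selected));
console.log(selected, fullSize, selectedSize);
```

The smaller payload is what the client in the article-title example actually needs; everything else in the full response is over-fetched.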

3.4 Possible database models

This section describes database models that were considered as possible alternatives for the solution. It looks at models that could be used and would be convenient when combined with GraphQL. The section's content is narrowed to only those models which, before examination, appeared to be in line with the company's interests and are well usable with GraphQL.

3.4.1 NoSQL

One of the definitions of NoSQL is a non-SQL database: a storage where data is modeled differently than in classical relational databases and where means other than the SQL language are used for querying the data. In some cases, though, the term NoSQL is read as "not only SQL" and denotes a database which also supports SQL. These alternative options offer simple replication, high availability, horizontal scaling, and new query methods [38, 39, 40, 12, 41]. "Not only SQL" is the more appropriate definition for this thesis, as Azure DocumentDB supports querying by SQL although it is document-based and thus classified as NoSQL.


There are various types of NoSQL databases, but for the purpose of this thesis only two are interesting. The first is document-based, because it is the storage type currently in use for the Delivery API of Kentico. It is described in Section 2.4. The second model is graph-based, a possible alternative explained in Subsection 3.4.2.

3.4.2 Graph-based

The units of a graph are vertices (nodes) connected with edges. Both are retrievable by their IDs. They also carry an arbitrary number of properties holding additional information. Vertices represent objects such as a user, an article or a user role. Edges form a network of relationships between these objects. For example, a user with the editor role wrote an article and another user, a supervisor, approves its publication. The true efficiency of graph storages comes to the forefront when modeling and traversing large and complex connections between entities is the core of the application [12].
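The editor and supervisor example can be modeled as a small in-memory graph. This is only a sketch of the data model; the identifiers, labels and the helper functions out and incoming are assumptions and do not correspond to any particular graph database API.

```javascript
// Hypothetical in-memory graph; ids, labels and properties are illustrative.
const vertices = new Map([
  ['u1', { id: 'u1', label: 'user', properties: { role: 'editor' } }],
  ['u2', { id: 'u2', label: 'user', properties: { role: 'supervisor' } }],
  ['a1', { id: 'a1', label: 'article', properties: { title: 'Hello' } }],
]);

const edges = [
  { id: 'e1', label: 'wrote', from: 'u1', to: 'a1', properties: {} },
  { id: 'e2', label: 'approved', from: 'u2', to: 'a1', properties: {} },
];

// Follows outgoing edges with the given label and returns the target vertices.
function out(vertexId, edgeLabel) {
  return edges
    .filter((edge) => edge.from === vertexId && edge.label === edgeLabel)
    .map((edge) => vertices.get(edge.to));
}

// Follows incoming edges with the given label and returns the source vertices.
function incoming(vertexId, edgeLabel) {
  return edges
    .filter((edge) => edge.to === vertexId && edge.label === edgeLabel)
    .map((edge) => vertices.get(edge.from));
}

const written = out('u1', 'wrote').map((vertex) => vertex.properties.title);
const approvers = incoming('a1', 'approved').map((vertex) => vertex.properties.role);
console.log(written, approvers);
```

A real graph storage indexes edges so that such traversals stay cheap even for large networks; the linear filter here is kept only for brevity.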

3.4.3 Multi-model database

Various companies describe their product as multi-model, each meaning something slightly different (e.g. Microsoft Cosmos DB [42], ArangoDB [43], OrientDB [44]). It could therefore be said that there are several definitions of a multi-model database. They vary in what kind of querying is supported and how the data is modeled in the storage. In this work, the term is used for a combination of several data stores in a single solution. "A native multi-model database has one core, one query language, but multiple data models." [43] For example, Cosmos DB lets the user choose an API for the use case during database creation, which determines in what model the data will be persisted. If, for instance, table storage is selected, no document can be inserted into it. Therefore, in the end, it is not multi-model by the definition above.

Multi-model architecture is useful for large software projects which need the benefits of more types of storage, as using several separate databases leads to data inconsistency, duplication issues, more complicated deployment, complex storage management and more frequent upgrades [43, 42, 44].


ArangoDB and OrientDB are multi-model databases. Neither of them is conceived as a globally distributed database service like Cosmos DB; instead, each is a storage solution to be installed and run by the user. With ArangoDB or OrientDB it is possible to have the database hosted in a cloud [45]; nevertheless, it is not Software as a Service. Cosmos DB, on the other hand, is a SaaS, and this characteristic is highly valuable to Kentico. The GraphQL API is only to be tried as an alternative, so that the company can decide what follow-up steps to take in this domain. Therefore, switching to such a different service would be unreasonable.

3.5 Azure Cosmos DB

Cosmos DB is a product that combines different storage models and APIs to query them. It is not a multi-model database by the definition mentioned in Subsection 3.4.3; one reason is that it is not possible to include graph structures once the document model is chosen, simply because the SQL API provided for DocumentDB does not support them. This section further considers Cosmos DB's suitability for the proof of concept solution and mainly for the future of incorporating GraphQL into Kentico.

3.5.1 DocumentDB

Kentico Cloud uses DocumentDB as its current data storage; more about it is in Section 2.3. Therefore, the most straightforward solution is to keep this database and create an alternative API using GraphQL, while querying the database through the DocumentDB API in the same way as the current Delivery API does.

An important decision factor is that the company has inclined to Microsoft technologies from its start. Almost all data is stored in Azure. Therefore, choosing Azure as the managed database service provider ensures a certain ease of transition, future usage, and management.


3.5.2 GraphDB

GraphDB is a graph-based (Subsection 3.4.2) data storage model from the Cosmos DB family. It supports querying in the Gremlin language. Gremlin is a graph traversal machine and language developed by the Apache Software Foundation. The Gremlin language is a functional language implemented in the user's native programming language. The Gremlin machine consists of a graph, a traversal and a set of traversers. The Gremlin language is used to define the traversal. A traversal is composed of instructions defining the passage of traversers through the graph. The result is the final destination of the traversers on the graph [46].

As Kentico Cloud's data are stored in DocumentDB, it would be necessary to migrate them to GraphDB. A Cosmos DB data migration tool exists, but it does not support import to GraphDB yet [47]. Migration from another graph database would be more straightforward [48]. Importing data from document to graph storage needs a considerable amount of adjustments; consequently, a modification of the data model would be necessary. As mentioned in the article "Hacking: accessing a graph in Cosmos DB with SQL/DocumentDB API" [49], it could be useful to use the DocumentDB API to migrate the data.

The thesis goal emphasizes GraphQL as an alternative and suggests finding out if the current storage is sufficient. Therefore, this is rather a backup option to be used if the current storage turned out not to be sufficient.

3.5.3 DocumentDB API and Gremlin to query GraphDB

Although officially there is one API for each storage model in Cosmos DB, there is a way to query GraphDB using not only the Gremlin language but also the DocumentDB (SQL) API. It is important to note that this method is not officially supported by Cosmos DB. The knowledge of this behavior comes from experimental observation.

Vertices in GraphDB are similar to documents in DocumentDB. They share the same naming conventions for the database metadata (fields starting with an underscore). Each custom field of a document that is mapped into a field of a vertex has to be shaped into an array of objects, where each object contains _value and id. The reason is that each field of a vertex or an edge can have multiple values.


Edges are identified by the _isEdge property. The additional properties _sink, _sinkLabel and _vertexId, _vertexLabel identify the source and target vertices of an edge. As the last major difference, custom properties of edges are simple key-value fields, unlike the properties of vertices. By working with the mentioned fields, records can be retrieved or created using the DocumentDB API as well as the GraphDB API [49, 50].

However, as already mentioned, this behavior is not documented by Azure; thus, unexpected problems could occur now or in the future. It is therefore not an ideal approach for production use, as there is no certainty about future behavior.
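The mapping between a plain document and the vertex and edge shapes described in this subsection can be sketched as follows. The field names _value, id, _isEdge, _sink, _sinkLabel, _vertexId and _vertexLabel come from the text; everything else is an assumption, namely the label field, the generated property ids, and the reading that _vertexId and _vertexLabel describe the source vertex while _sink and _sinkLabel describe the target. Since the behavior is undocumented, the sketch should not be taken as a supported format.

```javascript
// Sketch under the assumptions stated above; not an officially supported format.
let nextPropertyId = 0;

// Fields starting with an underscore are database metadata and are left as-is,
// together with the document id.
function isMetadata(key) {
  return key === 'id' || key.startsWith('_');
}

// Converts a flat document into a vertex-shaped document: every custom field
// becomes an array of { _value, id } objects, because a vertex property
// can hold multiple values.
function toVertexDocument(document, label) {
  const vertex = { label };
  for (const [key, value] of Object.entries(document)) {
    if (isMetadata(key)) {
      vertex[key] = value;
    } else {
      nextPropertyId += 1;
      vertex[key] = [{ _value: value, id: `prop-${nextPropertyId}` }];
    }
  }
  return vertex;
}

// Builds an edge-shaped document. Custom properties stay simple key-value pairs.
// Assumption: _vertexId/_vertexLabel describe the source, _sink/_sinkLabel the target.
function toEdgeDocument(id, label, source, target, properties = {}) {
  return {
    id,
    label,
    _isEdge: true,
    _vertexId: source.id,
    _vertexLabel: source.label,
    _sink: target.id,
    _sinkLabel: target.label,
    ...properties,
  };
}

const author = toVertexDocument({ id: 'u1', name: 'Jane' }, 'user');
const post = toVertexDocument({ id: 'a1', title: 'Hello' }, 'article');
const wrote = toEdgeDocument('e1', 'wrote', author, post, { year: 2017 });
console.log(JSON.stringify(author), JSON.stringify(wrote));
```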

3.6 Selected solution

It is possible to implement a GraphQL server utilizing DocumentDB in its existing condition. Considering the assignment specification and a discussion with a consultant from Kentico, the current storage is sufficient. Migrating to GraphDB would be unnecessarily complex for a proof of concept solution.

4 Implementation

An important part of the thesis was to implement a proof of concept solution. It is an exemplary GraphQL server application based on the analysis in Chapter 3. As the application is an alternative to the Delivery API, it likewise supports only basic read-only API calls. In GraphQL terms, it implements GraphQL queries. This chapter consists of a description of the technologies used in the solution and a guide on how to use it.

4.1 Description of used technologies

4.1.1 JavaScript with Node.js

JavaScript (JS) with Node.js was used for the implementation. The reason is that JS is one of two languages widely used in Kentico, thus it will be easily understood by most developers. Kentico also uses Node.js for developing the Kentico Cloud application, both front-end and back-end. Node.js is an asynchronous event-driven JS runtime environment for running server-side applications.

4.1.2 Package manager npm

A JS package manager called npm is used to install, share and manage dependencies [51]. A package contains all files needed to use a module. A module is a library that can be included in a JS project. Several external modules are used in the solution. All of the packages with their descriptions can be found on the npm website [51].

The first is called express. It is a framework for Node.js providing basic features for the development of web and mobile applications. Next is graphql. It contains all the necessary type definitions available in GraphQL, which are used to build a schema. There are also functions to define custom object and scalar types. Additionally, it provides features for serving queries against the schema. The third is express-graphql, which is a GraphQL HTTP server that supports the connection of middlewares. It is used for setting up GraphiQL.


The fourth package is a Node.js software development kit (SDK) for Azure's DocumentDB API called documentdb. The fifth is memoizee, which is a simple memoization library to help with data caching.

The solution also uses Babel for transpiling ECMAScript 2015 (ES6) syntax into ES5. Transpiling means compiling the source code into a language with a similar level of abstraction.
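The idea of memoization used for data caching can be illustrated with a hand-written helper. The sketch below is not the memoizee API; the functions memoize and loadItem and the choice of the serialized arguments as a cache key are assumptions made for illustration.

```javascript
// Hand-written memoization sketch; it only illustrates the idea and is not
// the memoizee API.
function memoize(fn) {
  const cache = new Map();
  return (...args) => {
    // The serialized argument list serves as the cache key.
    const key = JSON.stringify(args);
    if (!cache.has(key)) {
      cache.set(key, fn(...args));
    }
    return cache.get(key);
  };
}

// Stand-in for an expensive database read; counts how often it really runs.
let databaseReads = 0;
function loadItem(codename) {
  databaseReads += 1;
  return { codename, loadedAt: databaseReads };
}

const cachedLoadItem = memoize(loadItem);
const first = cachedLoadItem('about_us');
const second = cachedLoadItem('about_us'); // served from the cache
const other = cachedLoadItem('home'); // different key, new read
console.log(first === second, other.loadedAt, databaseReads);
```

A cache of this kind never expires its entries, so a production setup would also need an invalidation or expiry policy.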

4.2 How to use the application

This section lists the necessary prerequisites and describes how to set up and start the solution. It also shows an exemplary usage. The solution itself can be found in the GitHub repository "graphql-content-delivery" [52] or in the attachments.

The connection string to DocumentDB is confidential information. Therefore, the code is structured so that the uri and primaryKey, which are necessary to connect to the database, are extracted into a file named configSettings.js. This file should be placed into the src folder, which is located in the graphql-content-delivery folder. It is not possible to run the application without this file. Therefore, configSettings.js was shared with the advisor and the opponent. It can be shared with others on request.

4.2.1 Prerequisites

Before running the solution, a user has to install Node.js version 8.9.0 with npm version 5.5.1. Both can be downloaded at once from the Node.js website.

4.2.2 Startup instructions

After a successful installation of Node.js with npm, it is possible to use the package manager to download all the libraries used in the solution, then use Babel to transpile the source code, and finally start the GraphQL application.

Instructions to run:


1. open a command line in the root folder (the one with the file "package.json")
2. run the command npm install
3. run the command npm run start

Now the solution is running on localhost, port 4000. It can be tested either using GraphiQL or an application for sending HTTP requests. GraphiQL can be opened in a browser by accessing the URL http://localhost:4000/graphql. To query the application, a user has to specify an operation as described in Subsection 3.2.3. The shortcut Ctrl + Space is useful to show possible fields.

When using an application to send HTTP requests, the query has to be defined in the body of a POST request. Its header also has to contain Content-Type: application/graphql. For example, the Postman [53] application can be used as shown in Figure 4.1.

Figure 4.1: A query sent to GraphQL server using Postman application.
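The same kind of request can also be assembled programmatically. The following sketch only builds the request description; the host, port, path and Content-Type header follow the text, while the helper buildGraphqlRequest and the query text with its field names are hypothetical.

```javascript
// Sketch of the request described in this subsection. Host, port and path come
// from the text; the query itself is a hypothetical example.
function buildGraphqlRequest(query) {
  return {
    options: {
      hostname: 'localhost',
      port: 4000,
      path: '/graphql',
      method: 'POST',
      headers: {
        // The body is the raw query text, not JSON.
        'Content-Type': 'application/graphql',
        'Content-Length': Buffer.byteLength(query),
      },
    },
    body: query,
  };
}

const request = buildGraphqlRequest('{ items { system { codename } } }');
console.log(request.options.method, request.options.headers['Content-Type']);

// With the server running, the request could be sent with Node's http module:
//   const http = require('http');
//   const req = http.request(request.options, (res) => res.pipe(process.stdout));
//   req.end(request.body);
```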

4.2.3 Inline fragments

In the implementation, unions are used to distinguish among element types. While querying a union type, a user has to use inline fragments.


For example, Figure 4.2 shows a query for articles published after a date specified in the arguments, ordered from the newest to the oldest, asking for names and publish dates. The inline fragment in Figure 4.2 is on DateTimeElement, which is one of the Element types. The inline fragment ensures that the user is able to ask for a specific field type and its subfields, and that the recommendation function is able to suggest fields of the type [54].
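How an inline fragment on a union behaves can be sketched in plain JavaScript. Only the type name DateTimeElement is taken from the text; the query text, the TextElement type, the discriminator values and both helper functions are assumptions, and the sketch does not use the graphql package.

```javascript
// Hypothetical query text; only DateTimeElement is named in the thesis,
// the remaining field and argument names are assumptions.
const query = `
{
  items(type: "article") {
    elements {
      ... on DateTimeElement {
        key
        value
      }
    }
  }
}`;

// A union needs a way to tell which concrete type a value has. This
// hand-written function mirrors that decision; TextElement and the
// discriminator values are assumed.
function resolveElementType(element) {
  switch (element.type) {
    case 'date_time':
      return 'DateTimeElement';
    case 'text':
      return 'TextElement';
    default:
      return null;
  }
}

// Imitates the effect of an inline fragment: the listed fields are selected
// only for elements of the matching type, other elements yield an empty object.
function applyInlineFragment(elements, typeName, fields) {
  return elements.map((element) => {
    const selected = {};
    if (resolveElementType(element) === typeName) {
      for (const field of fields) {
        selected[field] = element[field];
      }
    }
    return selected;
  });
}

const elements = [
  { key: 'title', type: 'text', value: 'Hello' },
  { key: 'post_date', type: 'date_time', value: '2017-11-05T00:00:00Z' },
];
const result = applyInlineFragment(elements, 'DateTimeElement', ['key', 'value']);
console.log(query.trim());
console.log(result);
```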

4.3 Drawbacks

The structure of a group of content items is described by their content type. This approach to data modeling has a consequence. Unless the knowledge about the type is obtained, the structure of the items' elements field is unknown. The field's format is JSON with keys that are not known in advance. Therefore, the developer of the client application is not able to use introspection to explore and ask for a concrete subfield. This problem had to be solved by transforming the JSON into an array of objects. For completeness, each object contains its key.
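The transformation of the elements object into an array of objects could be sketched as follows. The element shape (type and value) and the function names are simplified assumptions and are not taken from the proof of concept source code; the reverse function shows that the Delivery API representation can be restored.

```javascript
// Sketch of the transformation; the element shape is a simplified assumption.
// Turns { key: element } into [{ key, ...element }].
function elementsToArray(elements) {
  return Object.entries(elements).map(([key, element]) => ({ key, ...element }));
}

// Reverse direction: restores the object keyed by element key.
function arrayToElements(list) {
  const elements = {};
  for (const { key, ...element } of list) {
    elements[key] = element;
  }
  return elements;
}

const elementsByKey = {
  title: { type: 'text', value: 'Hello' },
  post_date: { type: 'date_time', value: '2017-11-05T00:00:00Z' },
};

const elementList = elementsToArray(elementsByKey);
console.log(elementList);
console.log(arrayToElements(elementList));
```

Because every array item has the same known shape, a schema can describe it and introspection can suggest its subfields.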


Figure 4.2: A query with inline fragments.


Figure 4.3: A response to the query with inline fragments.

5 Measurement

The Delivery API and the GraphQL proof of concept solution completed during this thesis are measured in terms of performance and the amount of data transferred. Caching is disabled for the purpose of the tests.

The amount of data transferred is measured with the Postman [53] application, which is useful for facilitating API development and testing. The amount of data stays the same for a particular sample across repeated requests. Therefore, it is sufficient to measure the data amount of just one request. It is one of the principles of GraphQL to transfer only the data that were requested by the user. This is an area where REST and GraphQL could differ greatly, and it can bring interesting results.

Performance is measured with the Apache JMeter [55] application, which is an application for load testing and measuring performance.

5.1 Selection of measurement metrics and test samples

Five sample queries often sent to the Delivery API were chosen for the testing. They target the content item endpoint of the API, which returns the whole entity or entities and, if "depth" is specified, also the whole entities referenced in "modular_content". Only content items were selected. They are the most varied and allow for a demonstration of all capabilities. Running additional tests with content types or taxonomies would be just a repetition without any significant value. All of the requests ask for the type "article".

To compare the differences in data amount, there are two test approaches for GraphQL. In the first test, the whole item is requested. The second considers a case where the user would like to display only a title and a publish date and needs a codename for further processing. The user therefore requests exclusively these fields from the GraphQL server. However, because of the way the proof of concept is implemented, all text elements have to be requested in order to get the title.

5.2 Results

Three test sample groups were used during the tests. For each group, the 5 test samples described in Section 5.1 were used. Each sample query was sent 100 times, composing an iteration of 500 samples in total for each group. The first and second groups involved requests common for Delivery API users: one group for the Delivery API tests and one for GraphQL. The third group also targeted the GraphQL API and included the same arguments. Nonetheless, it requested only the codename, text elements and the date element, which, as the experiment showed, is on average 14.13% of the payload returned from the Delivery API.

Query sample                            Response data amount (KB)
codename                                8.05
type, date greater or equal             20.10
type, taxonomy tag, custom sort order   34.84
type, url slug, depth 2                 15.84
url slug, sitemap                       14.08
average                                 18.58

Table 5.1: Delivery API dataset details.

It may seem that GraphQL is considerably better because of its better results in the partial item tests. But it is important to remember a few things. The measurement of the response time is rather tentative, because the Delivery API code is more complex than the proof of concept solution. GraphQL brings new ways in which clients can query an API. Even though these aspects are powerful and interesting, it is important to consider whether they would actually bring value to the clients of Kentico Cloud. As with a lot of new features, not everything ends up with sufficient customer acceptance. A lot of innovations may seem valuable at first glance but end up not being used by the customers. Therefore, I would recommend further examination of GraphQL and mainly an evaluation of customer interest.


Query sample                            Response data amount (KB)   Percentage of Delivery API response (%)
codename                                5.92                        73.54
type, date greater or equal             16.46                       81.89
type, taxonomy tag, custom sort order   29.19                       83.78
type, url slug, depth 2                 12.43                       78.47
url slug, sitemap                       12.01                       85.30
average                                 15.20                       80.60

Table 5.2: GraphQL - whole item dataset details.
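The percentage column of Table 5.2 can be recomputed from the response sizes in Tables 5.1 and 5.2. The sketch below is only a worked check of the arithmetic; the variable names are illustrative, and the average percentage is taken as the mean of the per-query percentages.

```javascript
// Response sizes in KB, copied from Tables 5.1 and 5.2.
const deliveryApiKb = [8.05, 20.10, 34.84, 15.84, 14.08];
const graphqlWholeKb = [5.92, 16.46, 29.19, 12.43, 12.01];

// Arithmetic mean of a list of numbers.
const average = (values) => values.reduce((sum, value) => sum + value, 0) / values.length;

// Size of each GraphQL response as a percentage of the Delivery API response.
const percentages = graphqlWholeKb.map((kb, index) => (kb / deliveryApiKb[index]) * 100);

console.log(percentages.map((percentage) => percentage.toFixed(2)));
console.log(average(deliveryApiKb).toFixed(2), average(graphqlWholeKb).toFixed(2));
console.log(average(percentages).toFixed(2));
```

For the first query, for instance, 5.92 / 8.05 gives 73.54%, which matches the table.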

5.2.1 Batching queries

An issue with the five queries used in this measurement is that they in a sense favor REST. The Delivery API supports queries with a conjunction of arguments (using AND), but there is nothing similar for specifying query arguments in a disjunctive manner (using OR). An example of an OR query is in Figure 5.1. This is in fact an important feature of GraphQL: batching queries into one API call. Such queries may ask for various types of resources or for the same resource multiple times, each time with different arguments. The reason this is not included among the most used Delivery API queries is that users do not have the possibility to use it with the current variant of the API. However, it is not necessary to perform an experiment to measure this, as this fact is already known. It is only important to mention this advantage and describe what it means.

For example, a user requests three items, each specified by a codename. To get this content from the Delivery API, it would be necessary to send 3 API calls. Using GraphQL, they can be composed into one API call. In summary, whenever a user wants to send various queries to the API while specifying different arguments for each of them, the way it is done differs between REST and GraphQL. It depends on the structure of the REST endpoints, but usually a ratio of one query to one API call is inevitable. By all means, the server can be modified to handle batched queries. However, as mentioned, REST is not flexible enough to cover every variant that emerges during development.

Query sample                            Response data amount (KB)   Percentage of Delivery API response (%)
codename                                1.54                        19.13
type, date greater or equal             2.92                        14.53
type, taxonomy tag, custom sort order   4.40                        12.63
type, url slug, depth 2                 2.04                        12.88
url slug, sitemap                       1.62                        11.51
average                                 2.50                        14.13

Table 5.3: GraphQL - partial item dataset details.

Test group          min (ms)   max (ms)   average (ms)   median (ms)   total data (KB)
Delivery API        129        303        167            147           9291
GraphQL - whole     88         426        132            107           7601
GraphQL - partial   86         395        126            104           1252

Table 5.4: Test results (columns in ms stand for response time).


Figure 5.1: GraphQL query in disjunctive form.
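One way GraphQL allows several queries in a single API call is through aliases, which let the same field appear repeatedly with different arguments. The sketch below assembles such a query for three codenames. Whether the query in Figure 5.1 uses aliases is not stated here, and the field name items, the argument codename, the selected fields and the helper buildBatchedQuery are all assumptions.

```javascript
// Hypothetical sketch: the field name "items", the argument "codename" and the
// selected fields are assumptions, since the actual schema is not shown here.
function buildBatchedQuery(codenames) {
  const selections = codenames.map(
    // Each alias (item0, item1, ...) lets the same field appear several times
    // in one query with different arguments.
    (codename, index) =>
      `  item${index}: items(codename: ${JSON.stringify(codename)}) { system { codename } }`
  );
  return `{\n${selections.join('\n')}\n}`;
}

const batchedQuery = buildBatchedQuery(['about_us', 'home', 'contact']);
console.log(batchedQuery);
```

The resulting text is sent as one request body, so the three lookups cost a single round trip instead of three.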


6 Conclusion

During this thesis, a proof of concept solution of a GraphQL server application was developed. It demonstrated the possibility of using GraphQL technology as an alternative to the current Delivery API, which is based on the RESTful architectural style.

Prior to this achievement, an analysis of the Delivery API and of the possible usage of GraphQL was conducted. Various databases were considered. Based on the acquired knowledge, a decision was made to use DocumentDB.

6.1 Summarization of the work accomplished

The same data storage (DocumentDB) as is used for the Delivery API was used for GraphQL. No adjustments to the database were made. However, the data representation was modified. This concerns the field elements. In the Delivery API, it is returned as an object where each element is accessible under its key. The GraphQL API returns it as an array of objects, where each object has an additional field key. The reason is that this solution is more straightforward in GraphQL. However, this is only a matter of implementation; it can be adjusted to adopt the Delivery API representation.

I managed to implement the basic functionality that was assigned. Moreover, the proof of concept solution is also able to satisfy requests which involve arguments connected with OR (disjunction). That is not possible with the Delivery API.

The measurement confirmed that GraphQL is able to significantly reduce the amount of data transferred over the network. It was also discovered that both APIs should be able to perform similarly.

6.2 Possible further improvements and steps required for a successful integration

I managed to fulfill the assignment of this thesis. However, its objective was to test whether a GraphQL alternative is possible. Therefore, there are more steps to be taken towards a potential full acceptance of GraphQL. A follow-up work could extend the proof of concept solution into a full-fledged server application: an application that would handle the same requests as the Delivery API and also build on the benefits of GraphQL.

Then a database supporting the Gremlin language could be considered to help GraphQL reach its full potential.

Bibliography

1. GraphQL: A data query language [online]. Lee Byron, 2015 [visited on 2017-11-05]. Available from: https://code.facebook.com/posts/1691455094417024/graphql-a-data-query-language/.
2. What is a Content Management System (CMS)?: Content Management System (CMS) and other spin-off terms definition(s) [online]. Bernard Kohan, 2010 [visited on 2017-05-02]. Available from: http://www.comentum.com/what-is-cms-content-management-system.html.
3. Content Management System (CMS) [online]. Techopedia Inc. [visited on 2017-05-02]. Available from: https://www.techopedia.com/definition/24075/content-management-system-cms.
4. What Is a Cloud-first Headless CMS? [online]. Petr Palas, 2016 [visited on 2017-05-02]. Available from: https://kenticocloud.com/blog/what-is-headless-cms.
5. Kentico Cloud—the cloud-first CMS for digital agencies and their clients [online]. Petr Palas, 2016 [visited on 2017-11-08]. Available from: https://kenticocloud.com/blog/kentico-cloud-the-cloud-first-cms.
6. Delivering content [online]. Kentico software s.r.o. [visited on 2017-11-08]. Available from: https://developer.kenticocloud.com/v1/docs/delivering-content.
7. Content structure [online]. Kentico software s.r.o. [visited on 2017-11-17]. Available from: https://developer.kenticocloud.com/v1/docs/content-structure.
8. Content type elements reference [online]. Juraj Uhlar [visited on 2017-12-09]. Available from: https://help.kenticocloud.com/define-content-structure/content-elements/content-type-elements-reference.
9. Customize your content workflow [online]. Juraj Uhlar [visited on 2017-12-09]. Available from: https://help.kenticocloud.com/manage-projects-and-teams/workflow/customize-your-content-workflow.
10. Document Databases [online]. MongoDB, Inc. [visited on 2017-11-22]. Available from: https://www.mongodb.com/document-databases.


11. NoSQL Document Storage Benefits and Drawbacks [online]. Nick Kolakowski, 2017 [visited on 2017-11-25]. Available from: https://insights.dice.com/2012/06/04/-document-storage-benefits-drawbacks/.
12. REDMOND, E.; WILSON, J. Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement. 1st ed. Raleigh, NC: The Pragmatic , LLC., 2012. ISBN 978-1934356920.
13. System object (content item) [online]. Kentico software s.r.o. [visited on 2017-11-25]. Available from: https://developer.kenticocloud.com/reference#content-item-object.
14. Content item model [online]. Kentico software s.r.o. [visited on 2017-11-25]. Available from: https://developer.kenticocloud.com/reference#content-item-object.
15. Azure Cosmos DB hierarchical resource model and core concepts [online]. Rafat Sarosh; Mimi Gentz, 2017 [visited on 2017-11-25]. Available from: https://docs.microsoft.com/en-us/azure/cosmos-db/documentdb-resources.
16. Content item model [online]. Kentico software s.r.o. [visited on 2017-11-25]. Available from: https://developer.kenticocloud.com/reference#-introduction.
17. Using the Delivery API [online]. Kentico software s.r.o. [visited on 2017-11-25]. Available from: https://developer.kenticocloud.com/v1/docs/using-delivery-api.
18. Filtering [online]. Kentico software s.r.o. [visited on 2017-11-25]. Available from: https://developer.kenticocloud.com/reference.
19. List content items [online]. Kentico software s.r.o. [visited on 2017-11-30]. Available from: https://developer.kenticocloud.com/reference#list-content-items.
20. Ordering [online]. Kentico software s.r.o. [visited on 2017-11-30]. Available from: https://developer.kenticocloud.com/reference#content-ordering.


21. FIELDING, Roy T. Architectural Styles and the Design of Network-based Software Architectures [online]. Irvine, 2000 [visited on 2017-12-01]. Available from: http://jpkc.fudan.edu.cn/picture/article/216/35/4b/22598d594e3d93239700ce79bce1/7ed3ec2a-03c2-49cb-8bf8-5a90ea42f523.pdf. PhD thesis. University of California.
22. SEVERANCE, Charles. Roy T. Fielding: Understanding the REST Style. Computer. 2015, vol. 48, no. 6, pp. 7–9.
23. GraphQL is the better REST [online]. Ben Murden [visited on 2017-11-21]. Available from: https://www.howtographql.com/basics/1-graphql-is-the-better-rest/.
24. RESTful APIs, the big lie [online]. Michael S. Mikowski, 2015 [visited on 2017-12-00]. Available from: https://mmikowski.github.io/the_lie/.
25. What Are The Drawbacks Of REST? [online]. Mark Little, 2013 [visited on 2017-11-21]. Available from: https://www.infoq.com/news/2013/05/rest-drawbacks.
26. Introduction to GraphQL [online]. Facebook Inc. [visited on 2017-11-02]. Available from: http://graphql.org/learn/.
27. Arguments [online]. Facebook Inc. [visited on 2017-11-02]. Available from: http://graphql.org/learn/queries/#arguments.
28. GraphQL [online]. Facebook Inc., 2016 [visited on 2017-11-02]. Available from: http://facebook.github.io/graphql/October2016/.
29. EIZINGER, Thomas. API Design in Distributed Systems: A Comparison between GraphQL and REST [online]. Wien, 2017 [visited on 2017-12-01]. Available from: http://eizinger.io/assets/Master-Thesis.pdf. Master's thesis. University of Applied Sciences Technikum.
30. The Anatomy of a GraphQL Query [online]. Sashko Stubailo, 2017 [visited on 2017-11-02]. Available from: https://dev-blog.apollodata.com/the-anatomy-of-a-graphql-query-6dffa9e9e747.
31. [RFC] Ignore undefined input object fields [online]. Lee Byron, 2016 [visited on 2017-11-02]. Available from: https://github.com/facebook/graphql/issues/235.


32. Less strict validation of input objects [online]. Glan Thomas, 2016 [visited on 2017-11-02]. Available from: https://github.com/graphql/graphql-js/issues/303.
33. GraphQL Union Input Type [online]. Sergei Petrov, 2016 [visited on 2017-11-03]. Available from: https://github.com/Cardinal90/graphql-union-input-type.
34. GraphiQL [online]. Facebook Inc. [visited on 2017-11-03]. Available from: https://github.com/graphql/graphiql.
35. GraphQL vs REST: Overview [online]. Phil Sturgeon, 2017 [visited on 2017-11-09]. Available from: https://philsturgeon.uk/api/2017/01/24/graphql-vs-rest-overview/.
36. Versioning an API in GraphQL vs. REST [online]. Jani Tarvainen, 2016 [visited on 2017-11-09]. Available from: https://symfony.fi/entry/versioning-an-api-in-graphql-vs-rest.
37. GraphQL Best Practices [online]. Facebook Inc. [visited on 2017-11-09]. Available from: http://graphql.org/learn/best-practices/.
38. A Comparison Of NoSQL Database Management Systems And Models [online]. O.S. Tezer, 2014 [visited on 2017-11-09]. Available from: https://www.digitalocean.com/community/tutorials/a-comparison-of-nosql-database-management-systems-and-models.
39. NoSQL Databases Explained [online]. MongoDB, Inc. [visited on 2017-11-09]. Available from: https://www.mongodb.com/nosql-explained.
40. Top 5 Considerations When Evaluating NoSQL Databases [online]. MongoDB, Inc., 2016 [visited on 2017-11-09]. Available from: https://webassets.mongodb.com/_com_assets/collateral/10gen_Top_5_NoSQL_Considerations.pdf.
41. NosqlDefinition [online]. Martin Fowler, 2012 [visited on 2017-11-09]. Available from: https://martinfowler.com/bliki/NosqlDefinition.html.
42. Welcome to Azure Cosmos DB [online]. Microsoft Corporation, 2017 [visited on 2017-11-09]. Available from: https://docs.microsoft.com/en-us/azure/cosmos-db/introduction.
43. ArangoDB [online]. ArangoDB [visited on 2017-11-09]. Available from: https://www.arangodb.com/.


44. Multi-Model Database [online]. OrientDB LTD [visited on 2017-11-09]. Available from: http://orientdb.com/multi-model_database/.
45. Deployment [online]. ArangoDB [visited on 2017-11-09]. Available from: https://docs.arangodb.com/3.2/Manual/Deployment/.
46. RODRIGUEZ, Marko A. The Gremlin Graph Traversal Machine and Language. ACM Proceedings of the 15th Symposium on Database Programming Languages. 2015, pp. 1–10.
47. Azure Cosmos DB: Data migration tool [online]. Microsoft Corporation, 2017 [visited on 2017-10-17]. Available from: https://docs.microsoft.com/en-us/azure/cosmos-db/import-data.
48. Introduction to Azure Cosmos DB: Graph API [online]. Microsoft Corporation, 2017 [visited on 2017-10-17]. Available from: https://docs.microsoft.com/en-us/azure/cosmos-db/graph-introduction.
49. Hacking: accessing a graph in Cosmos DB with SQL / DocumentDB API [online]. Vincent-Philippe Lauzon, 2017 [visited on 2017-10-17]. Available from: https://vincentlauzon.com/2017/09/05/hacking-accessing-a-graph-in-cosmos-db-with--documentdb-api/.
50. Hacking: changing Cosmos DB Portal experience from Graph to SQL [online]. Vincent-Philippe Lauzon, 2017 [visited on 2017-10-17]. Available from: https://vincentlauzon.com/2017/09/10/hacking-changing-cosmos-db-portal-experience-from-graph-to-sql/.
51. What is npm? [online]. npm, Inc. [visited on 2017-12-16]. Available from: https://www.npmjs.com/.
52. [online]. David Čechák, 2017 [visited on 2017-12-17]. Available from: https://github.com/davidcechak/graphql-content-delivery.
53. Developing APIs is hard, Postman makes it easy [online]. Postdot Technologies, Inc. [visited on 2017-12-09]. Available from: https://www.getpostman.com/.
54. Inline Fragments [online]. Facebook Inc. [visited on 2017-11-09]. Available from: http://graphql.org/learn/queries/#inline-fragments.
55. Apache JMeter [online]. Apache JMeter [visited on 2017-12-09]. Available from: http://jmeter.apache.org/.


A Attached content

The proof of concept solution is published as an attachment to this thesis in the zipped file graphql-content-delivery. A guide to running the solution is in Section 4.2.
