Leveraging Linked Data to Build Hypermedia-Driven Web Apis
Total Page:16
File Type:pdf, Size:1020Kb
Chapter 7 Leveraging Linked Data to Build Hypermedia-Driven Web APIs Markus Lanthaler 7.1 Introduction The fact that the REST architectural style forms the fundament of the Web, the most successful distributed system of all time, should be evidence enough of its benefits in terms of scalability, maintainability, and evolvability. One of the fundamental principles of REST is the use of hypermedia to convey valid state transitions at run- time instead of agreeing on static contracts at design time. When building traditional Web sites, developers intuitively use hypermedia to guide visitors through their sites. They understand that no visitor is interested in reading documentation that tells them how to handcraft the URLs necessary to access the desired pages. Developers spend considerable time to ensure that the site is fully interlinked so that visitors are able to reach every single page in just a few clicks. To achieve that, links have to be labeled so that users are able to select the link bringing them one step closer to their goal. Often that means that multiple links with different labels but the same target are presented to make sure that a visitor finds the right path. This is most evident when looking at the checkout process of e-commerce sites which usually consists of a single path leading straight to the order confirmation page (plus a typically de- emphasized link back to the homepage or shopping cart). On this path, the user has to fill in a number of forms asking for order details such as the shipping address or the payment details. It is not a coincidence that these forms tend to use exactly the same language on completely different e-commerce sites. It is also not a coincidence that the same names for the form fields are chosen to allow the user’s browser to fill the fields automatically in or, at least, offer auto-completion. HTML5 tries to push that even further by introducing an autocomplete attribute along with a set of tokens in order to standardize the auto-completion support across browsers1. All this ∗ The bibliography that accompanies this chapter appears at the end of this volume and is also available as a free download as Back Matter on SpringerLink, with online reference linking. 1 http://www.whatwg.org/specs/web-apps/current-work/#autofilling-form-controls:-the-autocomp- lete-attribute M. Lanthaler () Institute for Information Systems and Computer Media, Graz University of Technology, Rechbauerstraße 12 8010 Graz, Graz, Austria e-mail: [email protected] C. Pautasso et al. (eds.), REST: Advanced Research Topics and Practical Applications, 107 DOI 10.1007/978-1-4614-9299-3_7, © Springer Science+Business Media New York 2014 108 M. Lanthaler is part of purposeful optimization with the clear goal to increase conversion rates, i.e., to ensure that visitors achieve their goal. It is therefore surprising to see that most of the time developers completely ignore hypermedia when using the Web for machine-to-machine interaction, or, more commonly speaking, Web APIs. Instead of using dynamic contracts that are retrieved and analyzed at runtime, which would, just as on the human Web, allow clients to adapt to ad-hoc changes, developers chose to use static contracts. All the knowledge about the API a server exposes is typically directly embedded into the clients. This leads to tightly cou- pled systems which impede the independent evolution of its components. When a service’s domain application protocol [179], which defines the set of legal inter- actions necessary to achieve a specific, application-dependent goal, is defined in a static, non-machine-readable document served out-of-band, it becomes impossible to dynamically communicate changes to clients. Even though such approaches might work in the short term, they are condemned to break in the long term as assumptions about server resources will break as resources evolve over time. Another problematic aspect in the development of WebAPIs is the proliferation of custom data formats instead of building systems on top of standardized media types. This makes it impossible to write generic clients and inhibits serendipitous reuse. As Steve Vinoski argues in his excellent article [243], platforms, although efficient for their target use case, often inhibit reuse and adaptation by creating highly specialized interfaces; even if they stick to industry standards. He argues “the more specific a service interface [is], the less likely it is to be reused, serendipitously or otherwise, because the likelihood that an interface will fit what a client application requires shrinks as the interface’s specificity increases.” This observation surely applies to most current WebAPIs which are, due to their specializations, rarely flexible enough to be used in unanticipated ways. This chapter is an attempt to address the issues outlined above. We will start off by exploring why hypermedia is not used for machine-to-machine communication and what the current best practices for developing Web APIs are. After introducing Linked Data and JSON-LD, we will present a lightweight vocabulary able to express concepts needed in most Web APIs. Along with a short list of design guidelines, we will finally show how easily all those techniques can be integrated in current Web frameworks. Based on a simple prototype we will then demonstrate how truly RESTful services that are accessible by a generic client can be build in considerably less time by using these technologies. Last but not least, we will give an outlook of how the proposed approach opens the door to other Semantic Web technologies that have been built in more than ten years of research. 7.2 Hypermedia-Driven Web APIs: Challenges and Best Practices As outlined in the introduction, developers instinctively use hypermedia when build- ing traditional Web sites but seem to ignore it completely when building Web APIs. One of the reasons behind this might be the different level of tooling support. Since, 7 Leveraging Linked Data to Build Hypermedia-Driven Web APIs 109 in contrast to the human Web, no generic clients exist, developers usually not only need to develop the server side part, but also a client (library), which is then used by other developers to access the WebAPI. It is not surprising that these two components are often tightly coupled, given that they are usually developed in lockstep by the same team or at least the same company. One of the reasons for this is certainly that forWeb APIs no accepted, standardized media type with hypermedia support exists. JSON, which is much easier to parse and has a direct in-memory representation in most programming languages, is typically favored instead of using HTML as on the human Web. Unfortunately, this often leads to the exposure of internals resulting in a tight coupling between the server and its clients. A common example for this is the inclusion of local, internal identifiers in representations instead of including links to other entities. This requires out-of-band knowledge of IRI (Internationalized Resource Identifier) templates to reconstruct the URLs to retrieve representations of entities referenced in such a way. Since, in most cases, the documentation about those IRI templates is not machine-readable, they are hardcoded into clients which consequently means that clients break whenever the server implementation changes. The current best practice for developing truly RESTful JSON-based Web APIs is to define a custom media type which extends JSON to support labeled hyperlinks. Effectively this means that HTML’s anchor or link tags with their relation attributes (rel) are imitated by some JSON structure. Since there is a common need for such functionality, there have already been some efforts to standardize such extensions to JSON such as, e.g., Collection+JSON [13], but so far their adoption is very limited. Far more often these proprietary extensions are documented out-of-band on the API publisher’s homepage. Apart from the description of how hyperlinks are expressed, these documentations generally also include a list of resource types (such as products and orders) describing their semantics, properties, and their serializations. Last but not least, a number of link relations along with the supported HTTP operations, the expected inputs and outputs, and the consequences of invoking those operations are documented. This allows developers to get an overview of the API’s service surface and to implement specialized clients. Clients for such APIs are usually small libraries to parse retrieved representations, serialize data to be transmitted to the Web API, and to invoke HTTP operations. Not uncommonly, clients for object-oriented programming languages also include classes to represent the various resource types as native objects. While simple libraries are usually statically bound to the server’s URL space, more sophisticated clients embrace hypermedia to eliminate this tight coupling. This allows them to browse through the server’s information space and, to a certain degree, to dynamically adapt to changes in interaction flows. Typically such client libraries support a static set of resource types and are able to recognize a set of link relations which allows them to find the paths to achieve specific goals. In a sense, it could be said that those libraries understand the representations they retrieve from the API, but effectively this understanding is very limited. These clients do nothing more than interpreting representations by looking for specific tokens that are used to trigger certain operations. Since the semantics of those tokens are very weak, no further algorithmic automation is facilitated. 110 M. Lanthaler The problem with the practices outlined above is that they result in specialized im- plementations targeting specific use cases and not generalizations that can be reused across application domains. Therefore, every API created with such an approach is unique and needs to be documented.