A Framework for Composing Web Resources

Hua Xiao Bipin Upadhyaya, Ran Tang, Ying Zou Joanna Ng, Alex Lau School of Computing Dept. of Electrical and Computer Engineering IBM Toronto Lab Queen’s University Queen’s University Markham, Ontario, Canada Kingston, Ontario, Canada Kingston, Ontario, Canada {jwng, alexlau} [email protected] {9bu, ran.tang, [email protected]} @ca.ibm.com

To support the end-user’s social, professional,

Abstract recreational and other activities, it is essential to Large amount of heterogeneous Web resources, create a Web-based service composition frame- such as SOAP-based and RESTful work that can 1) discover all available Web re- service exist on the . It is labor-intensive sources to fulfill the end-user’s goal, despite of and inefficient for an end-user to search and com- their types; 2) represent the relation between dif- pose different Web resources in order to fulfill ferent resources to allow them to be used collabo- his/her requirement. To support the end-user’s ratively; 3) automatically compose required Web various activities, we propose a Web resources resources to fulfill the goal specified by the end- composition framework. This framework can user. It is challenging to realize such a service help end-user: 1) discover available Web re- composition framework. In particular, The Web sources to fulfill the end-user’s goal, despite of resources are described in heterogeneous formats. their types; 2) represent the relation between dif- For example, Web Service Description Language ferent resources to allow them to be used collabo- (WSDL) [2] is used to describe SOAP (Simple ratively; 3) automatically compose required Web Object Access Protocol) based Web Services that resources to fulfill the goal specified by the end- makes remote procedure calls. HTTP-based user. are simpler Web resources, implemented as a set of standard HTTP requests. Examples of HTTP- 1 Introduction based APIs are twitter, Flickr and various Yahoo Various types of Web resources, such as SOAP- APIs. The HTTP-based APIs can be described based Web Service and RESTful service, exist on using Web pages or WSDL 2.0. Informational the Web to provide various functionalities, such Websites are implemented by various technolo- as information access and online banking. There gies, such as , HTML and XML. Some of the are large amount of Web resources available on descriptions, such as HTTP-based APIs are not the Internet. Not all Web resources are highly machine readable. This hinders the ability to dis- relevant to a user’s requirement. Only a subset of cover the most suitable resource to satisfy the goal Web resources can fulfill an end-user’s require- of a particular end-user. ment. It is difficult for the Web end-users to sift In this paper, we propose a framework to help through the sheer volume of Web resources to end-users compose various Web resources. Our fulfill their goals. Without the aid from a service framework uses a unified description schema to composition tooling, an end-user has to manually describe the heterogeneous Web resources and a discover and compose different Web resources. resource graph model to represent the relations For example, a person planning a conference trip among different Web resources. Representati needs to locate the Web resources for transporta- tional State Transfer (REST) [9] is an architectur- tion, accommodation and other activities separate- al style for network-based systems. REST was not ly and integrate the results from these Web introduced as an approach to designing Web ser- resources. This is a time-consuming and tedious vices, yet the non-corporate Web Service com- process and may not produce the optimal outcome. munity as alternative to SOAP/WSDL has For example, the end-user may not be able to dis- adopted it. Although not always adhering to the cover the Web resource that provides the most all of REST’s constraints [10], RESTful Web economical air ticket. Services are gaining popularity and are adopted

1

Figure 1: Steps for composing Web resource by major service providers like Google, Amazon sources to work together toward some complex and Yahoo. RESTful service provides uniform goal. End-users should be able to compose and interface which is immutable (no problem of define the flow between the Web resources. RDF breaking clients). HTTP/POX is ubiquitous (goes is designed specifically for exchanging and inte- through firewalls). Since it adheres to the prin- grating Web data. In our framework, we wrap the ciple of Web, it naturally has proven scalability unified Web resources into RESTful services then with caching, clustered server farms for Quality of adopt Description Framework (RDF) [1] to de- Service (QoS). End user just need browser to get scribe RESTful services and their relations. While started, no need to buy WS-* middleware. More- we construct the resource graph, the unified re- over, we provide an approach to compose re- source descriptions are used to help us identify sources for end-users using the resource graph. resources and their relations. The composite resources are represented as ad- When an end-user wants to fulfill a goal, the hoc processes. An ad-hoc process contains a set of end-user simply describes the desired goal using tasks without strict execution order. keywords. We map the keywords into resources The remainder of this paper is presented as described by the resource graph. and infer an ad- follows. Section 2 gives an overview of the pro- hoc process from the resource graph to help the posed framework. The details of the unified de- end-user fulfill the goal. In the following sub- scription schema, the resource graph to model the sections, we discuss the details in unifying the relation among resources, and the technique to description of various types of resources, con- construct ad-hoc process using the resource graph structing a resource graph, and inferring ad-hoc are presented in section 2. Section 3 concludes processes from the resource graph. this paper. 2.1 Unifying Resource Descrip- 2 Overview of Framework tion Figure 1 provides an overview of our framework. To assist the automatic discovery of various Web To describe the heterogeneous Web resources, we resources, we propose a schema to uniformly de- collect heterogeneous Web resources from the scribe different types of Web resources. The uni- Internet and represent them in a unified resource fied representation provides a better chance to description schema. We identify the relations discover Web resources than limiting the service among different resources and construct a re- discovery within the Web resources of a single source graph. HTTP-based APIs are generally type. described using plain, unstructured HTML docu- As shown in Figure 2, we define two parts in ments which are only useful to human developers. the unified schema: the general description part Nowadays, using Web resources, such as finding and the operation description part. suitable services, composing services, mediating The general description of a Web resource pro- between different data formats, are mainly manual vides a bibliographic description about the Web tasks. To maximize the interoperability among the resource: the type, the name, the provider and the resources, we need a common data model to de- URI of the Web resource. Such descriptions are scribe the resources and their relations. There are common to all types of Web resources. There are two requirements for interoperability: (1) the suitable for representing the general resources themselves must be able to program- description. For example, using 15 text fields (e.g., matically interoperate. For example, they must be title, type and publisher), the metada- able to invoke one another and pass data among ta schema can describe various resources, e.g., themselves; (2) there must be a user interface me- books and Web pages [3]. We adopt the Dublin chanism for the user to orchestrate the Web re- Core format to represent the general description.

2

Among the four fields of general description, formal interface depend on the type of Web re- the type, the name, the provider fields are self- sources. To invoke an HTTP-based API operation descriptive. The URI field of the general descrip- and a Web form, the URI, the HTTP verb, and the tion refers to the URI that identifies the Web re- input parameters of the operation are required. source on the Internet. The URI of a SOAP-based The formal interface is described in an XML for- Web Service points to its WSDL file. For an mat which can be interpreted by a machine to HTTP-based API, the URI field is filled by the automatically invoke the operation. URI of its description Web page. To invoke a An excerpt is taken from the existing description particular functionality of the Web resource, of an operation. It provides more readable and another URI may be required because each opera- detailed information, such as examples and dem- tion of the Web resource may have a different onstrations. It also offers a shortcut for a SOA URI, which is described in operation description professional to understand the operation without part. having to search for the operation in the entire The operation description describes the document. The excerpt is available only for functionalities offered by a Web resource. A Web SOAP-based Web Services and HTTP-based resource can deliver one or more functionalities. APIs. An operation represents a primitive unit of func- tionality used to compose a service-oriented ap- plication. Different types of Web resources contain varied number of operations. Most SOAP- based Web Services and HTTP-based APIs pro- vide complex functionalities and contain multiple operations. To describe each operation, we use the tag- based description, the formal interface and the excerpt of existing description. The tag-based description uses a set of descriptive tags (i.e., Figure 2: A unified resource description scheme keywords) to informally represent an operation, including the functionality description, the input 2.2 Constructing Resource description and the output description. The in- Graph put/output descriptions help describe different A resource graph represents all resources using operations with the same name or similar functio- RDF model. It is a semantic network model, nality description. For example, two operations which consists of entities and relationships. Enti- are named as “displayOrders”. One operation with ties are identified globally with URIs. We want to the parameter “productID” is different from the other one with the parameter “custormerID”. The tag-based description can concisely convey the functionality of the operation to the resource con- sumers. In addition, it facilitates automatically compare the functionalities of operations in order to discover similar Web resources. The formal interface provides information to support the invocation of an operation. The formal interface is intended for machine consumption in Figure 3:Conceptual model for RESTful Services order to facilitate automatic invocation. The represent all the service in REST style. RESTful SOAP-based Web Service is originally described Service and RDF both represent “Resource,” with a formal interface. Hence, a client program which is the main motivation behind using RDF. can be automatically generated to invoke opera- RDF provides a common framework for express- tions in a SOAP-based Web Service. In contrast, ing information so it can be exchanged between other Web resources (e.g., the HTTP-based APIs) applications without loss of meaning. do not have a formal interface and the SOA pro- Figure 3 shows the conceptual model of the fessionals need to manually write the request to RESTful services. We model all SOAP based invoke the operation. The fields contained in the Web Services, HTTP-based API in terms of

3

RESTful services. As shown in figure 3, each initial node. In Figure 4, hotel is represented in service has one or more resource and service itself different color and it denotes the initial node. can be treated as resource. Each resource is uni- When a user requests a service this node is re- quely identifiable and can have one or more than turned and from that node, user can start using the one representation. The information about the service. service is stored in the meta-data and resources have links to other resources. The input and out- put message has one or more parameter which is defined by the schema. Resource may link to oth- er resources. We used link specification [7] to indicate the relationship between the resources. The “rel” attribute in the link specification give the information about the semantics of the link. In addition to types of “rel” defined by IANA [8], Figure 4: Resource and their relations we introduced few other types that helps to represent resource in RESTful services more easi- For example, if we want to book a room we ly. need information regarding the resource room, x see-also recommends another service Thus using REST approach it can be used as x same–as provides the similar services shown in following example: x is-a defines is-a relation between the resources As shown in Figure 5, when the resource room is x contains defines different service as in the case requested, the output contains information about of composite service. the next resource. The next state in this example x is-container-of defines the resource-to- can be one of the room types which are single container relationship. bedroom and double bedroom. The semantic rela- These relations help recommend services, iden- tion between these resources is give by “rel” tify similar services, and define the relationship attribute .After knowing the next resource user between the resources. Semantic relationship agent also needs to know which method to use. It helps to abstract the resource representation. Fig- is provided in method attribute in the link. ure 4 shows the different resource and the seman- GET /hotel tic and data-link relations between the different HOST: foo.org resources. Since double bedroom and single bed- Output room has is-a relationship with room all the data link and semantic relation from room is carried over to those two different categories of the room. In addition to that, we added “method” attribute in the link. Thus, the user agent can know next poss- ible resources to visit and along with method used to visit that resource. The information provided by the method attributes helps to define the flow. It Figure 5: Example of RESTful Service also tells about the data required by another re- WSDL/SOAP based services are lifted to re- source that end-user wants to visit. For example in source level by identifying the resource and the Figure 4, “review” resource requires the informa- HTTP method for each operation and then de- tion regarding the resource hotel. Hence, this type scribed in RESTful approach. When the end-user of relationship is called data-link relations. The uses these resources the corresponding WSDL small circle represents which method to can be operations have to be invoked which can be iden- invoked on resources from the current state. In tified using the mapping information for operation Figure 4 the user can invoke only GET method in and resources. Converting every service in REST the resource review from the hotel resource. This style helps the end users to see everything as re- substantially increases the user agents’ capability source and each resource is associated with the of discoverability of resource. In our resource corresponding four methods. End-users are ex- graph, the resource from where end-users can start posed to only the resource model of the service consuming service (a starting point) is defined as described in RDF. It is easier to deal with one

4 model than dealing with different kinds of hetero- former group contains the work item “buy flight geneous specification. Once the service is tickets” and the latter does not have. represented in the RDF form, it adds a lot of ex- pressiveness with query languages (e.g., SPARQL [4]), transformation languages (e.g., GRDDL [5]), and rule languages (e.g., RIF [6]).

2.3 Inferring Ad-hoc Processes from Resource Graphs Figure 6 illustrates the definition of ad-hoc processes. An ad-hoc process is characterized by a set of work items and sub ad-hoc processes per- formed by end-users to fulfill a goal. A work item Figure 6: Definition of ad-hoc process in the ad-hoc process is a set of tasks which are collaborated together to accomplish a transaction. 2.4 Inferring Work Item Rela- The work items in an ad-hoc process are con- tions from Resource Graphs nected through the relation defined by users or the The work items in an ad-hoc process can be con- semantic relations defined in the resource graph. nected together using different relations. The rela- A task is the combination of resource and opera- tions of work items in ad-hoc processes are shown tions. The resource in a task is defined in the re- as follows. source graph. The operation in a task processes x Group indicates that a set of work items should the resource. In RESTFul services, we can use be performed together. The group relation can operations Get, Post, Put and Delete. For example, be further described as “And”, “Or”, “Se- the task “search for flight ticket” could be de- quence” or “Parallel” relations. scribed as the resource “flight ticket” and the as- x And means that all the work items need to be sociated operation “get”. Eventually, a task can be performed. “And” relation does not provide the performed by one or several concrete Web servic- detailed information about the execution order. es. Web services can be represented as resources Thus the work items can be executed in any or- in the resource graph. Therefore, we can trace the der by default. When we have more information resource graph to find the associated Web servic- about the execution order of work items, we can es. use “Sequence” and “Parallel” relations to de- In our definition, work item is different from ad- scribe “And” relations with the execution order hoc process although they both contain tasks. The information of the work items. resources related to the tasks in a work item are x Sequence means that the work items need to be connected by data-link relations in the resource executed following an order. graph. The relations of tasks in a work item are very closely coupled. To fulfill a transaction, the x Parallel means that the work items can be ex- user needs to execute all the tasks linked by the ecuted at the same time. data. For example, the work item “buy flight tick- x Or indicates that the user only needs to execute ets” includes tasks “search ticket”, “choose ticket” one of the work items in the group. and “pay the bill”. In order to accomplish the goal x Ungroup releases the existing “group” relation. of “buy flight tickets”, the user has to perform all When a user does not like an existing group of the three tasks in the work item. On the contrary, work items, the user can use this relation to the relations of work items in an ad-hoc are loose- show that he/she does not agree to group these ly coupled. Different users can have different ad- work items together. hoc processes for the same goal. For instance, Our framework analyzes the relations in resources when planning a trip, some users may prefer tak- graph and infers the work item relations from the ing flight than driving car, but other users may resource graph. Table 1 shows the mapping from prefer driving instead of taking flight. The ad-hoc the resource graph to work item relations. process of planning a trip for these two groups of In Table 1, the “See_also” relation in resource users is different since the ad-hoc process for the graph can be use to recommend work items to users. The user has the option to perform it or

5 ignore it. The “Same_as” relation in the resource unified description schema and can be wrapped to graph indicates that two resources are equal. RESTful services. The resources in our frame- Therefore, we convert the “Same_as” relation into work have semantic relationship and data link “Or” relation of work items. The siblings of relations between them and can be described in “Is_a” relation in the resource graph are converted RDF. RDF also adds a lot of expressiveness with to “Or” relation since “Is_a” relation shows that query languages, transformation languages, and one resource is an instance of another and these rule languages bringing more participation from instances have the same features. The elements in end-users side. Thus we provide a framework a “Contains” relation in the resource graph is con- whereby Internet end-users can compose their verted to “And” relation in the ad-hoc process. own Internet space, which is defined as a collec- Figure 7 gives an example to illustrate the main tion of resources that can be used to feed, filter, idea of inferring work item relations from the compose, disseminate, and reference information, resource graph. In Figure 7, the resources “Hil- data, and services to end-users according their ton”, “Holiday_Inn”, “Flight”, and “Restaurant” profile, context, and mode of operation. By ana- are converted to work items in the ad-hoc process. lyzing the relations among services, we can infer In this example, if we trace entire resource graph, the ad-hoc processes to compose resources. The these work items can be implemented by several way of describing everything in terms of re- tasks which are associated with detailed resources. sources solves the integrating issues and drives the innovation in the end user side. Table 1: Infer work item relations from resource graphs References Relation in resource graph Work item relation [1] D. Beckett, B. McBride (editors), “RDF/XML Syntax Specification (Revised),” W3C Recommendation (2004) [2] R. Chinnici, J. Moreau, A. Ryman, S. Wee- rawarana, “Web Service Description Lan- guage,” W3C Recommendation (2007) [3] Dublin Core Initiative, http://dublincore.org/, last accessed on Octo- ber 12, 2010 [4] SPARQL Query Language for RDF, http://www.w3.org/TR/rdf-sparql-query/, last accessed on October 12, 2010 [5] Gleaning Resource Descriptions from Di- alects of Languages (GRDDL),

http://www.w3.org/2004/01/rdxh/spec, lat ac- cessed on October 12, 2010 [6] RIF In RDF, http://www.w3.org/TR/rif-in- rdf/, last accessed on October 12, 2010 [7] Web Linking, http://tools.ietf.org/html/draft- nottingham-http-link-header-10#section-4, last accessed on October 12, 2010 [8] Link Relations, http://www.iana.org/assignm ents/link-relations/link-relations.xhtml, last accessed on October 12, 2010 [9] R. Fielding. “Architectural Styles and The Design of Network-based Software Architec- Figure 7: An example of relation inference tures,” PhD thesis, University of California, 3 Conclusion Irvine (2000) [10] Vinoski, S.: RESTful Web Services Devel- This paper presents a framework to compose hete- opment Checklist. Internet Computing,IEEE rogeneous resources on the Web. In our frame- 12, 96–95 (2008) work, Web resources can be described by the

6