Entity-based Message Bus for SaaS Integration
Yu-Jen John Sun
Supervisor: Prof. Boualem Benatallah
School of Computer Science and Engineering
University of New South Wales
Sydney, Australia

A thesis submitted in fulfillment of the requirements for the degree of Masters of Science
Sep 2015

Acknowledgments

First of all, I would like to thank my supervisor Prof. Boualem Benatallah for his guidance and encouragement during my research. Without his support this work would not have been possible.

I thank my co-authors, Dr. Moshe Chai Barukh and Dr. Amin Beheshti, for their enjoyable collaborations. I would like to thank Moshe especially for his help in reviewing and editing our publications. In addition, I would like to thank my fellow research colleagues for their kind encouragement, and Denis Weerasiri for giving me a great deal of advice throughout the years.

My final thanks go to my family for supporting me in pursuing an advanced degree. I am especially grateful to my sister for her encouragement during my study at UNSW.

Abstract

The rising popularity of SaaS allows individuals and enterprises to leverage various services (e.g. Dropbox, Github, GDrive and Yammer) for everyday processes. Consequently, the demand for cloud services has generated an enormous number of Application Programming Interfaces (APIs), allowing third-party developers to integrate these services into their processes. However, the explosion of APIs and their heterogeneous interfaces make the discovery and integration of Web services a complex technical issue. Moreover, these disparate services do not in general communicate with each other; rather, they are used in an ad-hoc manner with little or no customizable process support. This inevitably leads to “shadow processes”, often only informally managed by e-mail or the like. We propose a framework to simplify the integration of disparate services and effectively build customized processes.
We propose a platform for managing API-related knowledge and a declarative language and model for composing APIs. The implementation of the proposed framework includes a Knowledge Graph for APIs called APIBase and an agile services integration platform called CaseWalls. We provide a knowledge-based event-bus for unified interactions between disparate services, while allowing process participants to interact and collaborate on relevant cases.

Publications

Sun, Y.J., Barukh, M.C., Benatallah, B., Beheshti, S.M.R.: Scalable SaaS-based Process Customization with Case Walls. 13th International Conference on Service Oriented Computing (ICSOC 2015).

Contents

1 Introduction
  1.1 Background
  1.2 Motivation and Problems
  1.3 Contributions
    1.3.1 Knowledge Graph for APIs
    1.3.2 Declarative Language for Composing Integrated Process over APIs
    1.3.3 Event-based Process Management Platform
  1.4 Thesis Organization
2 State of the Art
  2.1 Introduction
  2.2 Interactions with an API
    2.2.1 API Design Methodology
    2.2.2 API Documentation
    2.2.3 API Testing
    2.2.4 API Programming Knowledge Re-Use
  2.3 Interactions between different API
    2.3.1 SOA & Microservices
    2.3.2 Process Automation
    2.3.3 Social BPM
3 APIBase
  3.1 Introduction
  3.2 API Knowledge Graph
  3.3 Architecture and Implementation
    3.3.1 Architecture Overview
    3.3.2 Graph Database Service
    3.3.3 APIBase Service
    3.3.4 Example
  3.4 Evaluation
    3.4.1 Experiment Setup
    3.4.2 Experiment Session
    3.4.3 Questionnaire
    3.4.4 Participant Groups
    3.4.5 Results
  3.5 Related Work
    3.5.1 API Documentation
    3.5.2 API Management
    3.5.3 Web-Services Repositories
  3.6 Conclusion
4 Declarative Language for Composing Integrated Process over APIs
  4.1 Introduction
  4.2 Knowledge-Reuse-driven and Declarative Case Definition Language
    4.2.1 Knowledge-Reuse Language
    4.2.2 Declarative Case Definition Language
    4.2.3 Declarative Case Manipulation Language
    4.2.4 Illustrative Example
  4.3 Implementation
    4.3.1 Architecture
    4.3.2 Event Bus
    4.3.3 Case Orchestration Rules
  4.4 Evaluation
  4.5 Related Work
  4.6 Conclusion
5 Conclusion and Future Work
  5.1 Concluding Remarks
  5.2 Future Directions
Bibliography

Chapter 1
Introduction

This chapter is organized as follows. In Section 1.1, we introduce the basic background. In Section 1.2, we outline the problem that we are addressing and discuss the motivation. In Section 1.3, we summarize our contributions. Section 1.4 provides the organization of this thesis.

1.1 Background

Traditional structured process-based systems increasingly prove too rigid amidst today’s fast-paced and knowledge-intensive environments. A large portion of processes, commonly described as “unstructured” or “semi-structured” processes, cannot be pre-planned and are likely to depend upon the interpretation of human workers during process execution. On the other hand, there has been a plethora of tools and services (e.g., Web/mobile apps) to support workers with specific everyday tasks and enhanced collaboration. Software-as-a-Service (SaaS) is at the forefront of this technology. Examples of such tools (henceforth referred to as “services”) include: (i) Dropbox, to store and share files online; (ii) Pivotal Tracker, to manage tasks and projects; and (iii) Google Drive, to edit and collaborate on spreadsheets. Workers often need to access, analyze, as well as integrate data from various such cloud data services.

At the same time, most services expose APIs (Application Programming Interfaces). APIs serve as the glue of online services and their interactions, with far-reaching ramifications. Social media already depend heavily on APIs, as do cloud services and open data sets. The spread of the Internet to ordinary devices (i.e.,
the Internet of Things) will be facilitated through APIs. Much of the information we receive about the world will therefore be API-regulated.

More specifically, the need to integrate user productivity services (i.e., SaaS applications, CRM tools, document and task management tools, together with social media frameworks) is vital: there are numerous pressing use-cases both in the enterprise and the consumer arena. However, while advances in APIs, Service Oriented Architectures (SOA) and Business Process Management (BPM) enable tremendous automation opportunities, new productivity challenges have also emerged: most organizations still do not have the knowledge, skills, or expertise-at-hand to craft successful SaaS-enabled process customization strategies to take full advantage of automation opportunities. Such integration is still mostly done through manual development, and even when leveraging existing APIs, it still requires considerable technical and programming skills.

1.2 Motivation and Problems

To understand the challenges related to building integrated applications using APIs, we examine the following case-study:

Code Review & Development Cycle. Version Control Systems (VCS) are very common in software engineering; they help avoid collisions and improve traceability. While it is important to find where a bug was introduced and revert it, peer review also helps to discover such bugs earlier. Github is one of the most popular online open-source repositories for code. Likewise, Pivotal Tracker (PT) offers a good story-tracking system to help the team keep track of their progress. To integrate these tools, Github provides a built-in integration in which commit messages containing specific keywords track and update the state of a story on PT. A typical workflow implementing such an integrated process over the Github and Pivotal Tracker APIs might look as follows:

1. The Project Manager (PM) creates a Story and assigns it to an Engineer
2. The Engineer starts working on the Story
3. The Engineer completes the programming task and pushes it to Github
4. The Engineer finishes and delivers the Story
5. The PM accepts or rejects the delivery

Effectively, the PT integration provided by Github parses the commit messages looking for syntax in the form of “#(story number)”, such as: [Starts #12345, #23456] ... [Finishes #12345] ... [Delivers #12345]. If any such messages are detected, the corresponding action will be performed in PT. For example, if the engineer's commit message contains [Finishes #12345], when Github receives this commit, it will automatically mark the story as finished in PT. This helps simplify the workflow by eliminating the otherwise manual work done within PT. Such a typical integration method may lead to the following limitations:

Fixed workflow: While this inbuilt integration might work nicely to eliminate the manual creation, start, finish and delivery of a PT “story”, it provides little flexibility to adapt to different development cycles. For example, the notion of “continuous integration (CI)” is prominent in software engineering today. This calls for “continuous” testing whenever new changes are made, and in some cases the CI system is even responsible for building and deploying the changes in some configurations. Therefore, if a particular development team decides to adopt CI, this would significantly change the semantics of the deliver action of a story in PT. This means that, at Step 4, we may want to introduce additional steps, such as testing and deployment, before closing this change. Unfortunately, however, the current Github/PT integration would not apply to such development environments [110].

Shadow Processes: To achieve the above result, it is thus likely that the development team would integrate other non-related tools or even conduct the “code review” process manually. In fact, after a code commit on a feature branch, the review should then be initiated.
After the review is completed, the change should then be merged into the main repository (or master branch).
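
To make the commit-message convention from the case study concrete, the following is a minimal sketch of how keywords such as [Starts #12345, #23456] or [Finishes #12345] could be mapped to story state changes. It is not the actual Github or Pivotal Tracker implementation: the function name parse_commit_message, the KEYWORD_TO_STATE table and the state labels are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mapping from commit keywords to story states (illustrative only).
KEYWORD_TO_STATE = {
    "starts": "started",
    "finishes": "finished",
    "delivers": "delivered",
}

# Matches a bracketed tag such as "[Starts #12345, #23456]".
BRACKET = re.compile(r"\[(starts|finishes|delivers)\s+([^\]]*)\]", re.IGNORECASE)
# Matches a single story reference such as "#12345".
STORY_ID = re.compile(r"#(\d+)")


def parse_commit_message(message):
    """Return (story_id, new_state) pairs for every tag found in the message."""
    updates = []
    for match in BRACKET.finditer(message):
        state = KEYWORD_TO_STATE[match.group(1).lower()]
        for story_id in STORY_ID.findall(match.group(2)):
            updates.append((int(story_id), state))
    return updates


if __name__ == "__main__":
    for update in parse_commit_message("[Finishes #12345] fix login redirect"):
        print(update)
```

In this sketch the keyword-to-state table is fixed in code, which mirrors the "fixed workflow" limitation: a team adopting CI could not add testing or deployment steps between finishing and delivering a story without changing the integration itself.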