Masaryk University, Faculty of Informatics

Ph.D. Thesis

System Integration in Web 2.0 Environment

Pavel Drášil

Supervisor: doc. RNDr. Tomáš Pitner, Ph.D.

January 2011
Brno, Czech Republic

Except where indicated otherwise, this thesis is my own original work.

Pavel Drášil
Brno, January 2011

Acknowledgements

There are many people who have influenced my life and the path I have chosen in it. They have all, in their own way, made this thesis and the work described herein possible.

First of all, I have to thank my supervisor, assoc. prof. Tomáš Pitner. Whenever we met, I received not only much valuable advice, but also a great deal of enthusiasm, understanding, appreciation, support and trust. Further, I have to thank my colleague Tomáš Ludík for many fruitful discussions and much constructive criticism during the last year.

Without any doubt, this thesis would not have come into existence without the love, unceasing support and patience of the people I hold deepest in my heart – my parents and my beloved girlfriend Míša. I am grateful to them more than I can express.

Abstract

During the last decade, Service-Oriented Architecture (SOA) concepts have matured into a prominent architectural style for enterprise application development and integration. At the same time, Web 2.0 became the predominant paradigm in the web environment. Even though the ideological, technological and business bases of the two are quite different, significant similarities can still be found, especially in their view of services as basic building blocks and of service composition as a way of creating complex applications.

Inspired by this finding, this thesis aims to contribute to the state of the art in designing service-based systems by exploring the possibilities and potential of bridging the SOA and Web 2.0 worlds. Specifically, it examines the feasibility and potential benefits of applying enterprise-level SOA-related technologies such as SOAP, ESB and BPEL to Web 2.0 service integration. With this approach, Web 2.0 services could be incorporated into existing enterprise systems easily; as a result, they could become more appealing to companies and institutions, where they could be used as a specific type of outsourcing. Although the thesis is concerned primarily with the technological issues arising from the combination of the two different worlds, legal and other non-functional aspects of the approach in question are also taken into account.

The “Web 2.0 Platform”, introduced in this thesis, is a generic system integration framework built using SOA-related technologies with extended support for Web 2.0 service integration. It facilitates applying enterprise-oriented technologies to Web 2.0 service integration or – from the opposite point of view – employing Web 2.0 services in enterprise-oriented systems. The proposed framework allows creating enterprise-strength, industry-standard-based applications backed by the functionality of Web 2.0 services as well as other software systems. The design of the framework is based on an in-depth understanding of the rationale behind both the SOA and Web 2.0 concepts. For SOA, this is not an issue, because plenty of resources are available and the technological bases of current enterprise-targeted SOA-based systems are quite uniform. In contrast, Web 2.0 is a less predetermined and potentially more heterogeneous environment. To obtain a realistic view of the various aspects of the current Web 2.0 environment, an in-depth analysis was performed, covering a total of 21 representative Web 2.0 services. The results of the analysis constituted a sound basis for designing the integration framework.

To evaluate the feasibility and applicability of the proposed concept under near-realistic conditions, a case study was conducted. An integrated learning environment was built using the framework and several Web 2.0 services, providing support for several learning patterns of the Inclusive Universal Access educational philosophy.

Keywords

Software service, Web 2.0, service-oriented architecture, service integration, service composition.

Table of Contents

1. Introduction
   1.1 Motivation
   1.2 Research question
   1.3 Objectives
   1.4 Contribution to State of the art
   1.5 Thesis overview
   1.6 Conventions
2. State of the art
   2.1 Software architecture
      2.1.1 Documenting software architectures
      2.1.2 Software architectural styles
   2.2 Service-orientation
      2.2.1 Services
      2.2.2 Specific issues
      2.2.3 The way towards service-orientation
   2.3 Service-oriented architectures
      2.3.1 Principles
      2.3.2 Technologies
      2.3.3 Development style
      2.3.4 Deployment
      2.3.5 Evaluation
   2.4 Web 2.0 architectures
      2.4.1 Principles
      2.4.2 Technologies
      2.4.3 Development style
      2.4.4 Deployment
      2.4.5 Evaluation
   2.5 Relation of Web 2.0 and SOA
   2.6 Web 2.0 in the enterprise
   2.7 Summary
3. Reality of Web 2.0 services
   3.1 Selection of representative services
   3.2 Technological aspects
      3.2.1 Communication protocols
      3.2.2 Messaging models
      3.2.3 Data formats
      3.2.4 Authentication
   3.3 Developer support
   3.4 Legal aspects
      3.4.1 Terms of usage
      3.4.2 Content licensing
4. Proposed integration framework
   4.1 Requirements
   4.2 Analysis and Design
      4.2.1 Platform core
      4.2.2 Connectors
      4.2.3 Design
   4.3 Implementation
      4.3.1 Security
   4.4 Deployment
   4.5 Choosing services for integration
      4.5.1 Functional criteria
      4.5.2 Non-functional criteria
5. Framework in action – Case study
   5.1 Motivation
   5.2 Use-cases
   5.3 Design and implementation
      5.3.1 Team building
      5.3.2 Reaction sheets
      5.3.3 Diary
      5.3.4 Team workspaces
      5.3.5 Market
   5.4 Overall schema
   5.5 Client application
6. Assessment of the framework
   6.1 Benchmarking
7. Conclusions
Bibliography
Glossary

List of Figures

Figure 2.1: 4+1 views
Figure 2.2: Siemens Four Views
Figure 2.3: Conceptual framework of IEEE-1471 (selected concepts only)
Figure 2.4: Software service-related concepts
Figure 2.5: Typical scenario of interactions between service provider, consumer, supplier and user
Figure 2.6: In-only and In-Out message exchange patterns
Figure 2.7: General architecture for distributed object systems
Figure 2.8: Service hierarchy
Figure 2.9: Java architecture for XML binding
Figure 2.10: Service Data Objects and Data Access Service
Figure 2.11: Web services architecture stack
Figure 2.12: Point-to-point and broker-based architecture
Figure 2.13: Basic ESB architecture
Figure 2.14: Top-level view of the JBI architecture
Figure 2.15: SCA elements and their relations
Figure 2.16: A WSRP-enabled portal, aggregating markup from remote portlets
Figure 2.17: Genealogy of standards in business process modelling and execution
Figure 2.18: Activities involved in business-driven development
Figure 2.19: Perpetual beta product cycle
Figure 2.20: Mash-tree, an allegory of mashup principles
Figure 2.21: Client-side mashing
Figure 2.22: Server-side mashing
Figure 2.23: Convergence of Web 2.0 and SOA
Figure 2.24: WOA technological stack
Figure 2.25: Enterprise mashup
Figure 2.26: Enterprise mashup development layers
Figure 4.1: Web 2.0 Platform use-case diagram
Figure 4.2: Microkernel architectural pattern
Figure 4.3: Example Web 2.0 Platform architecture (logical component diagram)
Figure 4.4: Web 2.0 Platform core use-case diagram
Figure 4.5: Logical ERD for user account management and management of users’ credentials
Figure 4.6: External user authentication (using OpenID)
Figure 4.7: Processing a connector request with notifications
Figure 4.8: Web 2.0 Platform primary connector use-case diagram
Figure 4.9: Processing an operation in a primary connector
Figure 4.10: ESB-based design of the Web 2.0 platform
Figure 4.11: Web 2.0 platform core services
Figure 4.12: Retrieving the description of a connector interface
Figure 4.13: Processing a connector request inside the platform core
Figure 4.14: Basic Web 2.0 platform deployment
Figure 5.1: Sequence of the “Team building” learning pattern
Figure 5.2: Interface of the TeamBuilding connector
Figure 5.3: Sequence of the “Reaction sheets” learning pattern
Figure 5.4: Interface of the ReactionSheets connector
Figure 5.5: Sequence of the “Diary” learning pattern
Figure 5.6: Interface of the TeamDiaries connector
Figure 5.7: Sequence of the “Team workspaces” learning pattern
Figure 5.8: Interface of the TeamWorkspaces connector
Figure 5.9: Sequence of the “Market” learning pattern
Figure 5.10: Interface of the Market connector
Figure 5.11: Implementation dependencies among connectors

List of Tables

Table 2.1: Overall comparison of SOA and Web 2.0
Table 3.1: Online Web 2.0 directories
Table 3.2: Representative Web 2.0 services
Table 3.3: Messaging models used by the reviewed Web 2.0 services
Table 3.4: Data formats used by the reviewed Web 2.0 services
Table 3.5: Authentication methods used by the reviewed Web 2.0 services
Table 3.6: Examples of programming languages for which one can find libraries for accessing APIs of the reviewed Web 2.0 services
Table 6.1: Attainable throughput of the framework on various hardware configurations
Table 6.2: Performance of selected Web 2.0 service APIs

List of Abbreviations

ADL  Architecture Description Language
AJAX  Asynchronous JavaScript and XML
API  Application Programming Interface
B2B  Business to Business
BC  Binding Component
BPEL  Business Process Execution Language
BPMI  Business Process Management Initiative
BPMN  Business Process Modelling Notation
CORBA  Common Object Request Broker Architecture
CSCW  Computer-supported Cooperative Work
CSV  Comma-separated Values
DAS  Data Access Service
EAI  Enterprise Application Integration
EIP  Enterprise Integration Pattern
EJB  Enterprise Java Beans
ESB  Enterprise Service Bus
FTP  File Transfer Protocol
GOF  Gang of Four
GUI  Graphical User Interface
HTML  Hypertext Markup Language
HTTP  Hypertext Transfer Protocol
HTTPS  Hypertext Transfer Protocol Secure
IaaS  Infrastructure as a Service
IDE  Integrated Development Environment
IMAP  Internet Message Access Protocol
IoS  Internet of Services
IT  Information Technology
IUA  Inclusive Universal Access
JAAS  Java Authentication and Authorization Service
JAXB  Java Architecture for XML Binding
JBI  Java Business Integration
JCA  Java EE Connector Architecture
JCP  Java Community Process
JDBC  Java Database Connectivity
JMS  Java Message Service
JSON  JavaScript Object Notation
JSR  Java Specification Request
JVM  Java Virtual Machine
KML  Keyhole Markup Language
LDAP  Lightweight Directory Access Protocol
MDA  Model-Driven Architecture
MEP  Message Exchange Pattern
MSOAM  Mainstream SOA Methodology
OMG  Object Management Group
OO  Object-Oriented
OOAD  Object-Oriented Analysis and Design
PaaS  Platform as a Service
PC  Personal Computer
PNG  Portable Network Graphics
POJO  Plain Old Java Object
POP3  Post Office Protocol version 3
QoS  Quality of Service
REST  Representational State Transfer
RFC  Request For Comments
RIA  Rich Internet Application
RMI  Remote Method Invocation
RPC  Remote Procedure Call
RSS  Really Simple Syndication or Rich Site Summary
RUP  Rational Unified Process
SaaS  Software as a Service
SCA  Service Component Architecture
SDO  Service Data Objects
SE  Service Engine
SLA  Service Level Agreement
SMTP  Simple Mail Transfer Protocol
SOA  Service-Oriented Architecture
SOAP  Simple Object Access Protocol
SOMA  Service-Oriented Modelling and Architecture
SOMF  Service-Oriented Modelling Framework
SSL  Secure Sockets Layer
SU  Service Unit
ToS  Terms of Service
UDDI  Universal Description, Discovery and Integration
UML  Unified Modelling Language
UP  Unified Process
W3C  World Wide Web Consortium
WADL  Web Application Description Language
WCF  Windows Communication Foundation
WOA  Web-Oriented Architecture
WS  Web Services
WSDL  Web Services Description Language
WS-I  Web Services Interoperability Organization
WSRP  Web Services for Remote Portlets
XML  Extensible Markup Language
XMPP  Extensible Messaging and Presence Protocol
XSLT  Extensible Stylesheet Language Transformations

1. Introduction

1.1 Motivation

The World Wide Web was developed by Sir Tim Berners-Lee (1990). Now, almost 20 years later, its basic ideas and technologies are still the same – server-provided HTML code is transported to the client over the HTTP protocol and viewed there in a web browser. However, something clearly changed in the early 2000s after the “dot-com bubble” burst. Since those days, the web has gradually ceased to be a matter of skilled enthusiasts and started its way towards being a living environment for the masses.

In reaction to this change, the “Web 2.0 Conference” was held in 2004, effectively establishing the name for the phenomenon. One year later, Tim O’Reilly (2005) made the first serious attempt to define the term Web 2.0. The definition itself is not straightforward and we will elaborate on it later in this thesis. For now, we will consider just a few characteristic excerpts from O’Reilly’s ground-breaking article:

• “one of the defining characteristics of Internet era software is that it is delivered as a service, not as a product”;
• “support lightweight programming models that allow for loosely coupled systems”;
• “design for hackability and remixability”;
• “innovation in assembly”.

In other words – service is the core concept of Web 2.0 and composition of individual services is the main source of innovation in the Web 2.0 environment.

However, the idea of building complex systems by composing loosely coupled standalone services is not unique to Web 2.0. The principles of service orientation are also inherent in Service-Oriented Architecture (SOA), one of the most prominent software architectural styles of the present day (Rosen et al., 2008; Gartner, 2008; Gartner, 2009). There are obvious differences between the two – they are usually implemented using different technologies, they usually target different types of users, and they usually use different business models. But there are also similarities, leading some authorities to consider Web 2.0 a global instance of SOA (Hinchcliffe, 2005) or a universal SOA (Geelan, 2007).

The similarities between Web 2.0 and SOA have naturally led to attempts to combine the two on various levels, most notably in the form of enterprise mashups (Hoyer et al., 2008b), i.e. web applications that use and combine data, presentation or functionality from multiple sources, possibly including existing SOA-based enterprise applications. This thesis supplements these efforts by elaborating on the opposite approach – the utilization of modern Web 2.0 applications through their APIs in traditional SOAs. This is not because there is anything wrong with the idea of enterprise mashups – the two approaches are completely independent and can be either combined or used separately. The main motivation for exploring the opposite approach is that we consider it more natural and easier to adopt for established enterprise application landscapes.

Web 2.0 service APIs can definitely provide value even when used on their own. But they are often also gateways to much richer functionality, delivered through multiple means and/or channels. One can benefit from services’ web interfaces, RSS/Atom feeds, web browser plugins, desktop integrations and possibly other features (e.g. SMS notifications in calendar services). In addition, the number of applications provided online in the SaaS manner with appealing pricing models and low total cost of ownership is growing steadily, creating a general demand for supplementing or even replacing parts of existing enterprise application portfolios with these types of applications (Bughin, 2010; McAfee, 2010).

The thesis is concerned primarily with the technological issues arising from the combination of the two different worlds. However, legal and other non-functional aspects of the approach in question are considered too.

1.2 Research question

The research question posed in this thesis is whether it is feasible to apply enterprise-level SOA-related technologies such as SOAP, ESB and BPEL to Web 2.0 service integration and, if so, what the potential benefits and limitations of this approach are.

1.3 Objectives

In answering the research question posed, the main objective of this thesis is to propose and evaluate a service-integration framework that should satisfy the following basic criteria:

• it allows integration of existing Web 2.0 services so that they can be used as data or functionality providers;
• it allows orchestration of the integrated services so that they can be used in cooperation for realizing complex use-cases;
• it allows integration with legacy systems and services.

In order to fulfil the objective of this thesis the following activities have to be undertaken:

• detailed analysis of service-enabling, service integration and service composition technologies and techniques used in current SOA environments;
• detailed analysis of service-enabling, service integration and service composition technologies and techniques used in the Web 2.0 environment;
• detailed analysis of non-functional characteristics of a representative set of current Web 2.0 services;
• design of a general service-integration framework satisfying the above-mentioned criteria;
• prototype implementation of the proposed framework;
• evaluation of the framework in a case study.

1.4 Contribution to State of the art

This thesis elaborates on a novel approach to Web 2.0 service integration. Its novelty does not lie in introducing new technologies, principles or paradigms. Rather, it explores the potential of putting together a set of well-known technologies in an uncommon way. As far as I know, no similar study is available at the moment.

This thesis contributes to the state of the art in three respects. First, it provides an exhaustive overview of non-functional aspects of current Web 2.0 services, based on the examination of real Web 2.0 services. Similar surveys have already been conducted, e.g. by Novak and Voigt (2007) or by Drášil et al. (2008), but this thesis provides much more detail and updated information. The second and most important contribution of this thesis lies in answering the research question posed in Section 1.2. Knowing the answer, one can consider all the predictable consequences of integrating Web 2.0 services into an existing enterprise application landscape in this particular way and make the right decision. This could help make Web 2.0 services more appealing to companies and institutions, which could employ them as a specific type of outsourcing where appropriate, and prevent avoidable failures. The third contribution is the proposed integration framework itself. It is ready to be used for integrating and composing a number of Web 2.0 services, either on their own or together with enterprise applications. And even when it is not used directly, its requirements and design specifications can serve as a good source of information for other realizations.

1.5 Thesis overview

The structure of this thesis stems to a large extent from the activities identified in Section 1.3. After a short introduction in Chapter 1 it effectively starts with Chapter 2, reviewing the state of the art extensively. To provide a general introduction, software architectures and their general aspects are mentioned first. Subsequently, the scope of the review is narrowed to service-oriented software architectures by introducing and elaborating on the general principles of service-orientation. The two relevant approaches to designing and developing service-oriented software, SOA and Web 2.0, are then described in detail, including their distinguishing principles, typical technologies, development styles and deployment environments. The chapter is concluded with some thoughts about the relation of the two approaches and about the growing significance of Web 2.0 ideas and techniques for enterprise environments, traditional preserves of SOAs.

Chapter 3 is devoted to an in-depth analysis of various non-functional aspects of real Web 2.0 services. A representative set of specific Web 2.0 services is selected and examined from several viewpoints, both technological and non-technological. The analysis is performed for two reasons. Firstly, it should either confirm or refute the theoretical assertions made in the previous chapter. Secondly, its findings will be used for designing the integration framework and for choosing the right services for applications built upon the proposed framework.

Chapter 4 introduces the “Web 2.0 Platform”, a service integration framework designed specifically for integrating Web 2.0 services and making it possible to incorporate them into common SOAs seamlessly. The sections of this chapter correspond to the disciplines of the Unified Process software development methodology and illustrate the genesis of the framework from requirements specification through analysis, design and implementation up to deployment. The last section is not related to the framework itself, but provides general guidelines for choosing the Web 2.0 services to be integrated using it.

In Chapter 5, the proposed integration framework is evaluated by conducting a case study. It was employed for developing a simple learning environment making use of several Web 2.0 services. Based on experience from the case study, the framework is assessed and benchmarked in Chapter 6. In conclusion, Chapter 7 provides the answer to the research question by abstracting from the features specific to the proposed framework and taking a generalized position on the Web 2.0 service integration approach in question.

1.6 Conventions

Throughout the text, all headings at level 1 (top-level headings) are referred to as “Chapters”, while all lower heading levels within Chapters are referred to as “Sections”.

Names of specific companies, software products and Web 2.0 services are often provided as examples confirming the assertions stated throughout the thesis. To keep the text coherent and easy to read, web links or formal citations are not provided for each of them. When necessary, a simple “I’m feeling lucky” Google search will suffice to find them.

2. State of the art

This chapter provides the technological background for answering the research question. It starts with a general introduction to the field of software architectures and their documentation. Then it concentrates on the concepts of traditional service-oriented architectural styles. SOA, the latest successful evolutionary step in this field, is described in detail. In the same way, novel service-related approaches recently introduced in the Web 2.0 environment are presented and thus contrasted with SOA solutions.

2.1 Software architecture

Unfortunately, as with many other terms mentioned in this thesis, there is a lack of consensus about the definition of the term itself. There are dozens of definitions of the term software architecture available in scientific papers, books, encyclopaedias, specifications, whitepapers etc. A list containing several hundred definitions, mostly provided by the community, is maintained by the Software Engineering Institute of Carnegie Mellon University and is available online1.

I have decided not to take a strict position on this issue. I will neither advocate one of the existing definitions nor formulate a new one. Instead, I will provide the reader with a few noteworthy examples illustrating the diversity and evolution in viewing the meaning and scope of the term software architecture:

“Software Architecture = {Elements, Form, Rationale} That is, a software architecture is a set of architectural (or, if you will, design) elements that have a particular form. We distinguish three different classes of architectural elements: processing elements; data elements; and connecting elements. The processing elements are those components that supply the transformation on the data elements; the data elements are those that contain the information that is used and transformed; the connecting elements (which at times may be either processing or data elements, or both) are the glue that holds the different pieces of the architecture together.” (Perry and Wolf, 1992, p.44)

The very first recorded use of the phrase “software architecture” presumably occurred in 1969 at a conference on software engineering techniques organized by NATO (Buxton, 1970). However, the above-mentioned definition, and especially the formula it starts with, is often cited as the first serious attempt to define software architecture formally. Software architecture is defined as a set of architectural elements that have a particular form, explicated by a set of rationale. This definition has been subject to many subsequent modifications, e.g. by Kogut (1994), who replaced elements and form with components, connections and constraints, or by Fielding (2000), who excluded rationale. However, its basic idea can be found in almost any other definition.

“As the size and complexity of software systems increases, the design problem goes beyond the algorithms and data structures of the computation; designing and specifying the overall system structure emerges as a new kind of problem. Structural issues include gross organization and global control structure; protocols for communication, synchronization, and data access; assignment of functionality to design elements; physical distribution; composition of design elements; scaling and performance; and selection among design alternatives. This is the software architecture level of design.” (Garlan and Shaw, 1993, p. 1)

1 http://www.sei.cmu.edu/architecture/start/definitions.cfm

This early influential definition comes with the fundamental finding that software development is no longer just a question of algorithms and data structures, and that the overall system structure is another important factor that has to be taken into consideration. It is not as formal as the previous definition, but on the other hand it provides an exhaustive list of problems or problem areas that should be addressed during system design.

[Architecture is]“The fundamental organization of a system embodied in its components, their relationships to each other, and to the environment, and the principles guiding its design and evolution.” (IEEE, 2000, p. 9)

This widely-accepted definition published in IEEE standard 1471-2000 also takes a largely structural perspective. Software is considered a set of interacting components, structured with respect to some system-wide design principles (in this context the term component has a very loose sense meaning just an identifiable “chunk” of software). And as a new aspect, evolution of the system is taken into account.

“Architecture is about the important stuff. Whatever that is.” (Fowler, 2003, p. 3)

“Software architecture is the set of design decisions which, if made incorrectly, may cause your project to be cancelled.” (Eoin Woods)

The last two definitions propose quite a different view of software architecture. They are arguably the least specific ones: they do not limit software architecture to internal structure or any other single aspect of the system. They simply define software architecture as a set of design decisions that are vital for the project, no matter what they refer to. Such a conception can be considered somewhat vague and controversial, but it clearly shows the importance of software architecture in a software development project.

2.1.1 Documenting software architectures

Software documentation takes time to develop and costs money, but especially large projects and projects with dislocated developers can hardly do without it. When it comes to documenting software architectures, there are many possible ways to go. When reading the definitions of software architecture in the previous section, especially the early ones, one may easily get the impression that a single box-and-line diagram is enough to capture the structure of particular elements of the system and the connections among them. And, naturally, this is how the documenting of software architectures started.

2.1.1.1 Views and viewpoints

Complex things can hardly be described using a single model. Shortly after a consensus was reached on what software architecture actually is (even if, as stated in the previous section, there is still some disagreement on this topic), it became clear that there are many possible ways of looking at and understanding software architectures. Therefore, one aspect common to all mature methods for documenting software architectures is the principle of separation of concerns. The description is decomposed into several separate but interrelated views, each capturing specific aspects of the architecture in question.

Kruchten (1995) of Rational Software Corporation developed the 4+1 view model of software architecture. His approach was later embraced as a foundation of the famous Rational Unified Process (RUP) software development methodology:
• logical view, capturing structure, i.e. architecturally significant elements and their relations;
• process view, describing run-time characteristics, i.e. processes, concurrency and synchronization;
• development view, focusing on the internal organization of source codes;
• physical view, dealing with the mapping of processes and components onto hardware;
• a few architecturally significant scenarios/use-cases illustrating the coherence of all the views.

Figure 2.1: 4+1 views (Kruchten, 1995)

At about the same time, Soni et al. (1995) developed a similar set of views for Siemens Corporate Research when analysing the software architectures of several existing industrial systems. Their approach has become known as the Siemens Four View model. They used:
• conceptual view, describing major design elements and the relationships among them;
• module interconnection view, encompassing functional decomposition and layers;
• execution view, dealing with the dynamic run-time structure of a system;
• code view, covering the organization of source codes, binaries and libraries.

Figure 2.2: Siemens Four Views (Clements et al., 2002)

Since these early approaches, there has been much experience and development. Most notably, the IEEE Std. 1471 (2000) generalized the idea of a fixed set of views by introducing the concept of a viewpoint. A viewpoint (some authors use the term viewtype instead) defines the perspective from which a view is taken. More specifically, it defines how to construct and use a view, the information that should appear in the view, the modelling techniques for expressing and analysing the information, and a rationale for these choices. An architectural description should select the viewpoints for use therein, and appropriate views should be developed.

Figure 2.3: Conceptual framework of IEEE-1471 (selected concepts only)

The distinction between views and viewpoints proved to be very useful. However, the IEEE Std. 1471 does not prescribe any specific viewpoints, leaving a free space for viewpoint catalogues and individual methodologies built upon the standard. For example, Clements et al. (2002) recommend capturing a software architecture using three general viewtypes:
• module viewtype, covering decomposition of the system into code modules;
• component and connector viewtype, covering run-time behaviour of the system;
• allocation viewtype, covering mapping of software units onto elements of the environment.

Another approach is presented in (Rozanski, 2005) and includes six core viewpoints for information systems architecture:
• functional viewpoint, covering functional elements, their responsibilities, interfaces, and interactions;
• information viewpoint, covering storage, manipulation, management, and distribution of information;
• concurrency viewpoint, covering the concurrency structure of the system and mapping of functional elements to concurrency units;
• development viewpoint, covering the architecture that supports the software development process;
• deployment viewpoint, covering the environment into which the system will be deployed;
• operational viewpoint, covering how the system will be operated, administered, and supported.

2.1.1.2 Notations

Having a good notation for describing various aspects of software architectures is another key issue. It has to be powerful enough to capture all the important features, and at the same time it should be as lucid as possible. The first box-and-line notations gave priority to lucidity but suffered from low expressiveness

and semantic ambiguity. In reaction, many attempts to create a formal notation for describing software architectures were made, resulting in a plethora of architecture description languages (ADLs), including C2, Rapide, Darwin, Wright, ACME/ADML, UniCon, Koala, xADL and AADL. Some of them are designed specifically for a single architectural style, others attempt to be as generic as possible. Some are implementation-focused, others serve just for analytic purposes.

“We now have many rich ADLs. But few are in practical use except perhaps Koala and perhaps UML if you consider it an ADL (many ADL purists do not)” (Kruchten et al., 2006, p.25)

The ADL community has been influential in the science of software architecture, but has had limited impact on mainstream information technology. As a result, the Unified Modelling Language (UML) has become an industry-standard ADL (Shaw, 2006). Although UML was not originally developed for modelling software architectures, it has undergone several major revisions and constructs have been added that are useful for documenting software architectures (Clements et al., 2002; Gorton, 2006).

2.1.2 Software architectural styles

Similar to the architecture of buildings, software architecture is a continuously evolving discipline. The evolution of both is driven by similar factors – personal creativity, past experience and technological innovations. New ideas, new experiences and new technologies give rise to new architectural styles.

“An architectural style is a coordinated set of architectural constraints that restricts the roles/features of architectural elements and the allowed relationships among those elements within any architecture that conforms to that style” (Fielding, 2000, p.13).

The definition of an architectural style is always dependent on the chosen definition of software architecture. The above-mentioned definition by Fielding is clearly based on classical, i.e. structurally-oriented, definitions of software architecture. If we adopt any of the modern, less restrictive definitions of software architecture, the definition of architectural style also has to be less specific, e.g.:

“Architectural styles are named collections of architectural design decisions that (1) are applicable in a given development context, (2) constrain architectural design decisions that are specific to a particular system within that context, and (3) elicit beneficial qualities in each resulting system” (Taylor et al., 2009) or

“An architectural style is a family of architectures related by common principles and attributes” (Hubert, 2001)

In any case, architectural styles are a mechanism for categorizing architectures and for defining their common characteristics. They offer well-established solutions to architectural problems, help to document architectural design decisions and facilitate communication between stakeholders through a common vocabulary. Architectural styles consolidate and integrate structural, procedural, descriptive and business aspects that would otherwise be addressed separately. Many well-established architectural styles have been developed over time, for example service-oriented, component-based or agent-based architectures.

Unfortunately, the terminology is not fixed and some authors use the term “architectural pattern” instead of style. The word “pattern” was popularized in computer science by the “Gang of Four” (GOF) and used to be applied mainly in object-oriented design, i.e. in a rather low-level, implementation-oriented scope. However, it turned out that some object-level design principles can also be successfully applied to higher-level design elements and vice versa. To allow a distinction among the various levels of patterns, this dissertation will adopt the terminology and classification used in (Buschmann et al., 1996):
• architectural pattern, a fundamental structural organization or schema for software (sub)systems (e.g. the layers, broker and model-view-controller patterns);
• design pattern, a commonly recurring structure of communicating components that solves a general design problem within a particular context (most GOF patterns belong to this category, e.g. the singleton, decorator and facade patterns);
• idiom, a low-level pattern specific to a programming language (e.g. iterating over a collection in Java).

These three pattern categories fit nicely with the concept of architectural style as described above – a higher-level concept covering the entire software life cycle and going beyond the scope of software development.
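To make the middle category concrete, the decorator pattern mentioned above can be sketched in a few lines of Python (chosen here for brevity; the class names are illustrative only, not taken from any particular catalogue):

```python
class Text:
    """A plain component with a simple operation."""
    def render(self):
        return "hello"

class Bold:
    """Decorator: wraps a component and extends its behaviour
    without modifying or subclassing the wrapped class."""
    def __init__(self, inner):
        self.inner = inner
    def render(self):
        return "<b>" + self.inner.render() + "</b>"

# Decorators can be composed at run time, unlike static inheritance.
print(Bold(Text()).render())  # <b>hello</b>
```

Note how the pattern operates at the level of a few communicating classes, not at the level of overall system organization (as an architectural pattern would) and not at the level of language syntax (as an idiom would).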

2.2 Service-orientation

“… the fundamentals of engineering like good abstractions, good separation of concerns never go out of style.” (Booch, 2004)

Separation of concerns is a natural approach to solving any kind of non-trivial problem. In computer science, it means dividing an application into distinct parts that overlap as little as possible. Such a separation can be, and usually is, carried out at multiple levels, e.g. functional, technological, personal and organisational. Service-orientation is a design paradigm addressing the issue of functional decomposition by separating application business logic into individual well-defined services performing discrete units of work.

This section presents basic concepts of service-orientation, refers to specific issues inherent to this design paradigm and briefly sketches the evolution leading to the establishment of service-orientation in information technologies during the last decade.

2.2.1 Services

Various kinds of services, and the provision thereof, have been a natural part of human culture since time immemorial. Because of their intangible nature, services have several specific properties differentiating them from other types of goods or products: they cannot be stored for future use, their production and consumption have to occur simultaneously, and they are often inseparable from their providers. In the most general sense, a service can be defined as follows:

[A service is]“any act or performance that one party can offer to another that is essentially intangible and does not result in the ownership of anything” (Kotler, 1988)

Services are offered by a service provider to service consumer(s). However, more parties can be involved in service provision in real settings:
• service supplier, actually implementing the service;
• service provider, offering the service;
• service customer, making a contract with the service provider and providing funds;
• service user, making use of the service.

The service provider and the service customer enter into an agreement, generally called a Service Level Agreement (SLA). The SLA determines which services are to be offered by the provider and used by the consumer. It must specify the measurable levels of those services that the provider must achieve, and the terms of use that the customer must comply with (Allen, 2006).

The nuances in understanding the term “service” change significantly across domains. For example, in business and economics, a service is considered the non-material equivalent of a good. In computer science, services are perceived more technically. Roughly speaking, services are considered logical groupings of operations, which in turn are logical units of work invoked by service consumers at run‑time.

Figure 2.4: Software service-related concepts (Allen, 2006)

For a deeper understanding, one should differentiate several distinctive aspects when considering a software service (OASIS, 2006). A service itself is determined by a set of static features:
• service description;
• execution context of the service;
• policies applied to the service;
• contracts between the service and its consumers.
And from the dynamic perspective, the concepts involved in interacting with a software service are:
• visibility between the service and its consumers;
• interaction between the service and its consumers;
• the real-world effect of interacting with the service.

Unless explicitly stated otherwise, the term “service” is used according to the following formal definition in the remainder of this thesis:

“A service is a mechanism to enable access to one or more capabilities, where the access is provided using a prescribed interface and is exercised consistent with constraints and policies as specified by the service description.” (OASIS, 2006, p.12)

2.2.2 Specific issues

Service-orientation brings specific issues for both service providers and service consumers that have to be addressed during the design of a service-based application. A typical scenario of activities involving the service provider, customer, supplier and user in an open environment is shown in Figure 2.5. In real settings, individual activities can be more or less explicit and the scenario can also deviate from the depicted one to some extent. Typically, the selection and possibly also the discovery phases can recur if contracting fails.

Figure 2.5: Typical scenario of interactions between service provider, consumer, supplier and user

Under specific circumstances, some activities can be omitted, for example when a software system is built from a stable set of services that know their communication partners. Such systems are called software confederations (Král, 2005), and in them there is usually no need for service publication, discovery or selection. The opposite case, when a system contains no fixed set of services and service discovery is accomplished at runtime, is called a software alliance.

2.2.3 The way towards service-orientation

Service-orientation has not emerged from nothing. It is just the next step in the continuous evolution of software system design paradigms. As such, it naturally builds upon its predecessors, keeping the proven and successful aspects, trying to improve the weak points, exploiting recent technological innovations and wrestling with new requirements.

To understand the rationale leading to the rise of service-orientation in the last decade, it is important to know the foregoing paradigms. This section presents a very brief overview of recent approaches to designing large and possibly distributed software systems, which can be considered direct predecessors of service-orientation.

2.2.3.1 Structured design and Remote procedure calling

In the 1970s, software systems were getting larger and more complex, but there was little guidance on good design and programming techniques, and there were no standard techniques for documenting

requirements and designs. Structured methods, developed by pioneers of software engineering such as Dijkstra, Constantine, Yourdon, Stevens, DeMarco and Parnas, were an answer to this situation. Structured programming gave rise to structured design, which in turn gave rise to structured system analysis. Everlasting principles of good design such as coupling, cohesion, encapsulation and modularity were introduced in this era, as were the first widely-accepted graphical notations for documenting system design, such as the data flow diagram and the entity relationship diagram.

Structured design separates the system in question into a hierarchy of modules with procedures, possibly sharing data with other (sub)modules. The structuring of modules is based on a top-down functional decomposition, a predominant abstraction mechanism used in structured analysis (DeMarco, 1979). Well-designed modules can possibly be reused in multiple scenarios or even in multiple systems.

At approximately the same time, the ARPANET network, a direct predecessor of the Internet, enabled reliable communication among geographically distributed computers. The very first ARPANET link was established in 1969. In 1976, the first protocol for remote procedure calling (RPC) was defined as part of RFC 707 (1976), aiming to facilitate network-based resource sharing in ARPANET and effectively starting the era of large-scale distributed applications.

RPC is neither a specific technology nor a communication protocol; it is just a principle. Generally, RPC allows a computer program to cause a procedure to execute in another address space, thus facilitating both local and network-wide inter-process communication. Technically, RPC is accomplished by means of message passing. Therefore, any RPC implementation has to define a communication protocol for sending messages between the client (procedure invoker) and the server (procedure executor), as well as a message format, including the marshalling of parameters of various data types. From a programmer’s point of view, the location of a procedure execution should be completely transparent, i.e. there should be no difference between invoking a local intra-process, a local inter-process and a remote execution of the procedure. An RPC implementation should therefore also include a run-time environment providing execution location transparency and both synchronous and asynchronous communication between client and server. It is important to note that remote execution can be very useful, but it also has some inherent issues not occurring in local procedure calls: communication over the network causes a significant performance overhead, introduces additional security issues and represents an additional potential point of failure.

Messages are passed between client and server in accordance with a specified message exchange pattern (MEP). A MEP describes the sequence of messages sent between client and server during a single procedure call. Most RPC implementations support the In-only and In-Out MEPs (see Figure 2.6).

Figure 2.6: In-only and In-Out message exchange patterns

The RPC principle has been implemented in many different and often mutually incompatible ways over the years (e.g. in ONC RPC, NCS, DCE/RPC and MSRPC). To achieve better interoperability, specifications such as XML-RPC (1999), sending XML-encoded messages over the HTTP protocol, and JSON-RPC (2005), sending JSON-encoded messages over the HTTP protocol, were developed. Even though these specifications are not approved by any official standardization body, they are widely recognised and implemented in almost every common programming language.
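The message-based nature of RPC can be illustrated with Python’s standard `xmlrpc.client` module, which can encode a call and its In-Out reply in the XML-RPC wire format without any network involved (the method name `add` and its parameters are hypothetical examples, not part of any real service):

```python
import xmlrpc.client

# Encode a request message: call "add" with two integer parameters.
# This is the XML document a client would POST to the server over HTTP.
request = xmlrpc.client.dumps((2, 3), methodname="add")
print(request)

# Decode it on the "server" side: method name and marshalled
# parameters are recovered from the message.
params, method = xmlrpc.client.loads(request)
print(method, params)  # add (2, 3)

# Encode the reply of an In-Out exchange: a methodResponse message
# carrying the single return value back to the client.
response = xmlrpc.client.dumps((5,), methodresponse=True)
result, _ = xmlrpc.client.loads(response)
print(result)  # (5,)
```

The exchange of the `request` and `response` documents over HTTP is exactly the In-Out MEP of Figure 2.6; an In-only call would simply omit the response message.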

2.2.3.2 Object-oriented design and Distributed objects

The object-oriented (OO) programming paradigm was introduced in the Simula programming language in 1967. However, it took almost 20 years before a sound theory and methodologies for OO development were formulated in the 1980s by Booch, Beck, Coad, Firesmith, Jacobson, Mellor, Meyer, Rumbaugh and others.

The central point of object-orientation is the concept of an object, a cohesive unit of encapsulated data and the methods that act on that data. Such encapsulation makes it easier to accommodate local changes, which turned out to be a weakness of structured design. One of the many existing definitions of the term “object” states that:

“An object is a discrete entity with a well-defined boundary that encapsulates state and behaviour; an instance of a class.” (Rumbaugh et al., 2004, p.482)

Object-oriented design views a software system as a collection of objects being created and destroyed during the system runtime. System behaviour is generated by mutual interactions between the objects, accomplished by means of message passing. The main conceptual distinction in design from the structured approach described earlier is in putting the data and methods together. Functional decomposition of the structured analysis is replaced by OO decomposition, deriving the proper structure of classes and objects from real-world entities, operations and responsibilities from the problem domain. Objects and classes can thus possibly be reused when there is an intersection in problem domains of multiple systems.

Clearly defined interfaces and communication by means of message passing make objects good candidates for being accessed remotely. In OO context, this is usually called a remote method invocation (RMI). The basic intent of RMI is analogous to that of RPC. It allows applications to interact with remote objects, residing anywhere on the network, exactly the same way as it does with local objects. However, some distributed object schemes offer additional features, such as the ability for an agent to create a new object on another host or even to construct an object on one host and transmit it to another one.

Middleware systems allowing the distribution of objects were developed to a much more elaborate state than their RPC-enabling counterparts. They are not limited to specifying a communication protocol and a message format. A general architecture of such a system is shown in Figure 2.7. It includes a means of specifying a remote object interface, a compiler for generating both server-side (“skeleton”) and client-side (“stub”) proxy objects out of the specified interface, and several run-time components, including:
• a registration service, which registers a skeleton implementation with the naming service and the object manager and then stores it in the object skeletons storage;
• an object skeletons storage, which stores registered skeleton implementations;
• an object storage, which stores existing objects (instances of skeleton implementations);
• an object manager, a central component responsible for creating and destroying objects, routing method invocations to the proper objects and managing the object storage;
• a naming service, a centralised service responsible for routing client requests to the proper object manager.

Figure 2.7: General architecture for distributed object systems (Farley, 1998)
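The stub/skeleton mechanism just described can be sketched in a few lines of Python. This is a local simulation only: the “network” between stub and skeleton is a plain function call, and all class names are hypothetical, introduced purely for illustration:

```python
class Skeleton:
    """Server-side proxy: unpacks an incoming message and
    invokes the corresponding method on the real object."""
    def __init__(self, target):
        self.target = target
    def handle(self, message):
        method, args = message
        return getattr(self.target, method)(*args)

class Stub:
    """Client-side proxy: packs each method call into a message
    and 'sends' it over the transport channel."""
    def __init__(self, transport):
        self.transport = transport  # stands in for the network channel
    def __getattr__(self, name):
        def remote_call(*args):
            return self.transport((name, args))
        return remote_call

class Calculator:
    """The actual implementation living on the 'server' side."""
    def add(self, a, b):
        return a + b

skeleton = Skeleton(Calculator())
calc = Stub(skeleton.handle)   # in reality, a socket would sit between the two
print(calc.add(2, 3))          # looks like a local call, routed via messages
```

In a real middleware system, the stub and skeleton classes would be generated by a compiler from the interface specification, and the object manager and naming service would route the message to the correct skeleton instance.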

In 1989, eleven companies founded the Object Management Group (OMG) consortium, aiming to set standards for distributed object-oriented systems. Two years later, the first version of the CORBA (Common Object Request Broker Architecture) specification was released, providing a language- and platform-neutral foundation for building distributed object-oriented applications. Unfortunately, the first versions of the specification were not specific enough to ensure the interoperability of its implementations. Subsequent versions of the specification fixed this issue to a large extent, but as the specification evolved – there were eleven major releases between 1991 and 2002 – it became extremely complex, hard to implement and often ambiguous. Despite this, many CORBA implementations are currently available and many distributed systems have been built upon it successfully.

In addition to CORBA, there are currently many simpler, language-specific and mutually incompatible implementations of the distributed objects principle available. One can find a specific implementation for almost every common object-oriented programming language (RMI for Java, DDObjects for Delphi, Pyro for Python, DRb for Ruby, DCOM and later .NET Remoting for the Microsoft platform,…).

The described model for distributing objects is the basic and most common one, but definitely not the only one. Other models for distributing objects include at least the following (Krakowiak, 2009):
• fragmented objects, where an object may be split into several parts residing on different hosts;
• replicated objects, where multiple copies of the same object may coexist;
• migratory (or mobile) objects, where an object may move from one host to another;
• live distributed objects, a generalization of the replicated objects principle, where object replicas may achieve just a weak consistency between their local states.

2.2.3.3 Component-oriented design

The idea of composing software systems out of prefabricated binary components is very old. Its origins can be traced back at least to 1968 (McIlroy, 1968). An early, simplified implementation of this principle, the “pipes and filters” mechanism, appeared in the Unix operating system as early as 1973 (LINFO, 2004). But as with object orientation, it took almost 30 years for the idea to come into common practice in the area of enterprise software systems.

The central point of component-orientation is the concept of a component, which has much in common with the concept of an object as described in the previous section. Components follow the fundamental OO principles of combining data and functionality in a single unit, encapsulation and identity (Cheesman, 2001). But there are also clear differences between the two, some of them following directly from the definition:

“A software component is a unit of composition with contractually specified interfaces and explicit context dependencies only. A software component can be deployed independently and is subject to composition by third parties.” (Szyperski, 1997)

One of the biggest differences between components and objects is that components have their interface and implementation explicitly separated. Components interact strictly via their well-defined interfaces, which results in an absolute, “black-box” encapsulation and allows interface-based system design. Only binary compatibility is required for components to work together. This allows component composition to be performed by a third party familiar with the component interface(s) but knowing nothing about its internals. As long as the component interface is not broken, components can also be easily substituted. The substitution can take place even at run-time, without recompilation or redeployment of the other components. This makes well-designed component-based systems more adaptable to changes and evolving requirements, something OO technologies were not able to cope with. Component-based systems are better suited for continuous software evolution (Szyperski, 1999).
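The substitution property can be illustrated with a small sketch. The names (`Logger`, `Application` and the two implementations) are invented for the example; the point is only that the client is coded against the interface and never against a concrete implementation, so implementations can be swapped at run-time.

```python
from abc import ABC, abstractmethod

class Logger(ABC):
    """The contractually specified interface."""
    @abstractmethod
    def log(self, message: str) -> str: ...

class ConsoleLogger(Logger):
    """One implementation ("component")."""
    def log(self, message):
        return f"console: {message}"

class BufferedLogger(Logger):
    """A drop-in substitute with different internals."""
    def __init__(self):
        self.buffer = []

    def log(self, message):
        self.buffer.append(message)
        return f"buffered: {message}"

class Application:
    """Knows only the Logger interface, never a concrete implementation."""
    def __init__(self, logger: Logger):
        self._logger = logger

    def run(self):
        return self._logger.log("started")

app = Application(ConsoleLogger())
print(app.run())                 # console: started
app._logger = BufferedLogger()   # run-time substitution, no recompilation
print(app.run())                 # buffered: started
```

As long as the `Logger` interface is honoured, `Application` never needs to be touched when the implementation changes.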

Although it is not a strict rule, component interfaces tend to be more coarse-grained than interfaces in OO design. There are good reasons for this. Firstly, because of the distributed nature of components, an invocation of a component method is typically much slower than an invocation of an object method, so there is a natural tendency to minimize the number of component method invocations. Secondly, as a general rule, coarse-grained interfaces promote loose coupling (Beiseigel et al., 2007), which fits nicely into the idea of component-orientation. Finally, it is a good idea to design components to be stateless where possible. Stateless components facilitate scalability, but force the client to (re)submit all relevant data with each method call, which again favours a lower number of component method invocations.

A component-based application comprises a collection of mutually interacting components: independently working binary modules. This is another difference from the OO paradigm, where an application is typically compiled into a single, monolithic binary. Components are units of independent deployment and versioning. They are usually deployed into a run-time environment providing infrastructural services such as pooling, transactions or security, which allows component developers to concentrate on the application logic of the component. In contrast with the previously described approaches, distribution of components and their remote deployment are no longer anything special. They have become inherent features of the concept of component-orientation.

Providing a complete component-enabling platform is a very demanding task. This is why there are currently only two mature solutions competing for the market, both backed by major software companies. The older one is the Enterprise JavaBeans (EJB) technology, having its roots in the specification developed by Sun Microsystems in 1997. Microsoft responded with the COM+ technology, a direct successor of DCOM, in 2000. Although the original idea of building systems out of prefabricated binary software components never came into real practice, both these competing technologies are still being developed and widely used to this day, although COM+ was eventually surpassed by the Windows Communication Foundation (WCF).

2.2.3.4 Service-oriented design

The notion of service-orientation and the idea of building complex systems out of loosely coupled standalone software services with well-defined interfaces have been popping up in computer science for decades. In the late 1990s, this principle resurfaced under the label “Service-oriented architecture”, or simply SOA (Schulte, 1996), and gained significant attention from both academic and commercial audiences. The introduction of the “Web Services” (WS) stack around the year 2000 increased this attention even further. Since then, services have become a mainstream philosophical approach, especially for designing large-scale, distributed enterprise applications.

Services are in many aspects similar to components as described in the previous section. Applications are built as sets of services; individual services have well-defined interfaces and communicate strictly through these interfaces. Services exist as single instances whose life cycle is not controlled by the caller, and a typical service invocation is stateless. However, services are usually built using platform-neutral and firewall-friendly technologies, which facilitates decoupling service consumers from service providers and makes services more language- and locality-agnostic. Another major difference between components and services is their binding mechanism. Components assume early binding, i.e. a caller knows exactly which component to contact before runtime. Unlike components, services are discoverable entities, and service-based systems can therefore adopt a more flexible approach where the binding is deferred to runtime. This makes it possible, for example, to take the consumer’s non-functional requirements into account when choosing the right service to contact.
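Deferred binding driven by non-functional requirements can be sketched as follows. The registry contents, the `discover` helper and the QoS fields (`latency_ms`, `cost`) are all hypothetical; a real SOA environment would use a registry standard rather than an in-memory list.

```python
# Hypothetical registry: several providers advertise the same logical service
# with different QoS metadata; the binding to a concrete endpoint is deferred
# until the moment of the call.

registry = [
    {"name": "convert", "endpoint": "fast-node",  "latency_ms": 20,  "cost": 5},
    {"name": "convert", "endpoint": "cheap-node", "latency_ms": 200, "cost": 1},
]

def discover(name, max_latency_ms):
    """Pick the cheapest provider satisfying the non-functional requirement."""
    candidates = [s for s in registry
                  if s["name"] == name and s["latency_ms"] <= max_latency_ms]
    return min(candidates, key=lambda s: s["cost"])

# An interactive client insists on low latency, a batch job on low cost:
assert discover("convert", max_latency_ms=50)["endpoint"] == "fast-node"
assert discover("convert", max_latency_ms=500)["endpoint"] == "cheap-node"
```

With early binding, both clients would be hard-wired to one endpoint; with runtime discovery, each one gets the provider that best matches its own requirements.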

But what makes services really different from components is the way a system is decomposed into individual services. Components are usually designed bottom-up to encapsulate the computational details of some concept. In contrast, service-oriented design aligns the methods provided by service interfaces with real-world business-driven needs, aiming to separate the need itself from the need-fulfilment mechanism (Elfatatry, 2007). The association of methods in a service interface is purely logical and their implementations can in fact be completely independent.

The proliferation of the Internet in the last decade has widened the scope of service-based applications. They are no longer just a matter of enterprise-scale systems. Nowadays, countless services are available, providing various functionality to virtually anybody through the Internet. And there still seems to be potential for further development of services and service-based solutions. Recently, the vision of the Internet of Services (Schroth, 2007) emerged as a possible future direction of service evolution, promoting services as tradable goods and the Internet as a medium for offering and selling these services.

2.2.3.5 Lessons learned

The last four sections briefly reviewed the history of designing complex software systems, starting with structured design, through object- and component-oriented design, up to the latest addition, service-oriented design. Looking at the background of this evolution, one can make several fundamental observations:
• Even if increased productivity due to code reuse is often mentioned as the motivation for introducing new approaches to software development, dealing with change seems to be in fact the most important driving force.
• There is a continuous shift towards decentralization, dynamic solutions, higher-level abstractions and coarser-grained constructs, allowing one to get the software-provided functionality closer to real business needs.

• New approaches to software design and development usually do not supersede the existing ones totally. Rather, they limit the usage of the older approaches to the areas and abstraction levels which suit them best and where they still remain the best solution. This leads to a layered approach to application design and development. Structured programming has its place in low-level tasks such as an operating system core or hardware drivers. Objects and classes are a good basis for application-level development and for creating components, which in turn are a suitable basis for services. However, the fundamental conceptual differences between the three still have to be kept in mind. “Not every good component transformed into a service makes a good service” (Brown et al., 2002).

2.3 Service-oriented architectures

Section 2.2 introduced the concepts and ideas of service-orientation in their most general, philosophical sense. Service orientation was also presented as the latest evolutionary step in designing large software systems and contrasted with the previous approaches. This section builds on the previous one and dives into the details of the concept of a service-oriented architecture (SOA), which brings the ideas of service-orientation into practice, especially in the context of enterprise-level information systems supporting the business of a particular organization.

A natural first step in introducing a new concept is presenting and analysing a formal definition thereof. For SOA, however, this is not an easy task. It is not hard to find a definition of SOA. The problem is that there are in fact dozens of definitions available, provided by both industry and academia, corporations and individuals. To illustrate the diversity of views on SOA, several often-cited definitions follow, sorted by time of publication:

“A set of components which can be invoked, and whose interface descriptions can be published and discovered.” (W3C, 2004b)

“In Service-Oriented Architecture autonomous, loosely-coupled and coarse-grained services with well-defined interfaces provide business functionality and can be discovered and accessed through a supportive infrastructure. This allows internal and external system integration as well as the flexible reuse of application logic through the composition of services to support an end-to-end business process.” (McKendrick, 2005)

“Service-Oriented Architecture (SOA) is an architectural style that supports service orientation.” (The Open Group, 2006)

“Service Oriented Architecture is a paradigm for organizing and utilizing distributed capabilities that may be under the control of different ownership domains. It provides a uniform means to offer, discover, interact with and use capabilities to produce desired effects consistent with measurable preconditions and expectations.” (OASIS, 2006, p.29)

“SOA is a business agility strategy” (Soley, 2008)

Although there is just a four-year difference between the oldest and the newest definition, one can see an obvious evolution in the perception of SOA during that time. The oldest definitions tend to be quite technology-oriented, which is not surprising since SOA was born in the field of information technologies. Later on, SOA started to be considered a high-level principle for designing software architectures. And lastly,

SOA was taken out of the scope of information technologies and presented as a business-level concept covering the whole enterprise strategy. This progress in defining SOA does not mean that SOA no longer has any technological connotations. The presented definitions do not supersede one another. They simply present SOA from various points of view, and to get a complete picture, one should consider their union, not their intersection or just the latest one.

SOA can still be beneficial when used just as a novel approach to software development. However, its potential scope is much wider. The idea is that software services and processes should be aligned with the real business services and processes of a SOA-adopting company. This way, SOA adoption has a strong impact not only on technological solutions, but also on business processes, roles, responsibilities, business models, management etc. However, this thesis is concerned primarily with the technological aspects of SOA, so the other dimensions will be only briefly sketched, without many details, to give the reader a complete overview of the SOA concept and its implications.

2.3.1 Principles

SOA, when taken as a software architectural style supporting service-orientation, is based on several rules or design principles which differentiate it from other approaches. Namely (Erl, 2007):
• standardized service contracts;
• service loose coupling;
• service abstraction;
• service reusability;
• service autonomy;
• service statelessness;
• service discoverability;
• service composability.

Service interoperability can be considered another key SOA principle, but it is intentionally not included in the list because it is fundamental to any kind of service-oriented computing. If services were not interoperable, all the mentioned principles would be either broken or useless.

These principles will be elaborated independently in the following sections, together with their implications. For now, just note that almost every one of the principles introduces new considerations and design requirements that eventually increase the cost and effort of delivering the solution logic. It is also important to point out that these principles are, from many points of view, antagonistic to some extent. For example, loose coupling and abstraction call for information hiding, whereas for reusability, composability and discoverability, the more information is available the better. Similarly, loose coupling promotes coarse-grained interfaces, but reusability is easier to achieve with fine-grained interfaces. Therefore, no single principle can be treated as dominant. The art of designing a SOA lies in achieving a proper balance among all these principles.

2.3.1.1 Standardized service contracts

Service contracts are all about metadata. Describing a service is a twofold issue. There should be a service-level agreement (SLA), a human-targeted document focused especially on the non-functional characteristics of the service in question; an SLA may or may not also contain some technical details. But this section is concerned with another type of service contract – a formal, purely technical and machine-consumable one, which can be processed by potential service consumers even at runtime.

Technical service contracts usually comprise several service description documents, each focused on a specific topic. Generally, they should provide a formal definition of the following (Erl, 2005):
• the service endpoint;
• each service operation;
• every input and output message supported by each operation, including the data representation model of each message’s contents;
• rules and characteristics of the service and its operations.
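In practice these elements are expressed in standards such as WSDL and XML Schema, but their essence can be sketched in plain Python. The contract structure, the endpoint URL, the `placeOrder` operation and the `validate_input` helper below are all invented for illustration; the point is only that a machine-consumable contract declares endpoint, operations, message models and rules, and can be checked mechanically.

```python
# Hypothetical machine-readable contract mirroring the four elements listed
# above: endpoint, operations, message models, and rules/characteristics.
contract = {
    "endpoint": "https://example.org/orders",   # illustrative URL
    "operations": {
        "placeOrder": {
            "input":  {"itemId": str, "quantity": int},
            "output": {"orderId": str},
            "rules":  {"idempotent": False},
        }
    },
}

def validate_input(contract, operation, message):
    """Check a message against the declared input model of an operation."""
    model = contract["operations"][operation]["input"]
    return (set(message) == set(model)
            and all(isinstance(message[k], t) for k, t in model.items()))

assert validate_input(contract, "placeOrder", {"itemId": "A1", "quantity": 2})
assert not validate_input(contract, "placeOrder", {"itemId": "A1"})  # incomplete
```

Because the contract is data rather than prose, a consumer can perform this kind of validation at runtime, which is exactly what makes the technical contract machine-consumable.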

Currently, widely recognised standards are available for ensuring the syntactic interoperability of services (for details see Section 2.3.2). Syntactic interoperability is necessary, but not sufficient. Ideally, service contracts should also provide some kind of semantic information about the service and the functionality it provides, to ensure semantic interoperability. However, the corresponding technologies are not yet mature enough to reach the SOA market (Bhiri et al., 2009). Limited semantic interoperability is therefore currently achieved only by means of naming conventions and shared vocabularies.

In any case, having a managed set of guidelines for service contract authors, such as naming conventions, shared vocabularies and best practices, is an advisable practice. It facilitates transparent service design and, consequently, service reuse. For example, the data representation of shared entities should be shared among operations and/or services to prevent inconsistency.

Generally, the “contract first” approach should be used when developing services to avoid interoperability issues. Although current IDEs are usually equipped with contract generation facilities, interoperability issues are unpleasantly common with generated contracts; it may be hard or impossible to meet the established conventions; and constructs tailored to a specific programming language may be hard to preserve when the service implementation is replaced.

A typical contract-related issue is versioning. Service consumers are tightly bound to the contract, and any backward-incompatible change can break existing applications. Fortunately, if the contract was designed properly, i.e. for performing a general business-driven task without a specific usage context in mind, changes should not be necessary too often.

2.3.1.2 Service loose coupling

Prior to advocating loose coupling, we need to explain the concept of coupling itself. Coupling can be defined as a measure of the relative interdependence among software modules (Pressman, 2001). The higher the dependency, the tighter the coupling. To some extent, the level of coupling can be quantified using formal metrics; see (Poshyvanyk, 2006) for an overview of such metrics. Coupling can be either unidirectional or bidirectional.

It is important to note that coupling is generally unavoidable. Any communication or cooperation between two software units involves some level of coupling. Recognizing and differentiating between tight and loose coupling is not a straightforward task, but several general guidelines exist. For example, message-based systems generally exhibit a lower level of coupling than API-based systems. Exchanging data via documents also promotes loose coupling between consumers and providers, because a document is a canonical, programming language-independent form of the data. On the other hand, protocol or platform dependencies increase coupling.

Dealing with coupling is a matter of both choice of technology and careful designing. Historically, tight coupling was very common, even if in some specific areas the opposite was true (e.g. hardware drivers or

accessing databases). But tight coupling affects the ability of both parties to evolve, and a change in one module often forces a ripple effect of changes in other modules. In order to solve these issues, loosely coupled system designs have gained more and more importance over time. Loose coupling enables agility, but it has its drawbacks too. Decoupling is often achieved by introducing an intermediary at a new level of abstraction, which makes system design more complicated. It also usually increases runtime overhead, because it dictates the use of general, i.e. non-optimized, solutions. In addition, some tasks, such as providing transactional integrity, are more complicated to achieve in loosely coupled systems.

In the context of software services, coupling expresses the impact of a change in a service on service consumers. Erl (2007) differentiates seven types of service-related coupling:
• logic-to-contract, i.e. the coupling of service logic to the service contract;
• contract-to-logic, i.e. the coupling of the service contract to its logic;
• contract-to-technology, i.e. the coupling of the service contract to its underlying technology;
• contract-to-functional, i.e. the coupling of the service contract to external logic;
• contract-to-implementation, i.e. the coupling of the service contract to its implementation environment;
• consumer-to-implementation, i.e. the coupling of the service consumer to any behaviour of the service not defined by the service contract;
• consumer-to-contract.

Of these types, all but the first and the last have to be considered negative and should be avoided whenever possible. Loose coupling is achieved through service contracts, and these should be designed very carefully so as not to introduce any unnecessary dependencies. Note that badly designed contracts can cause indirect coupling. For example, coupling the service contract to the service implementation can cause an indirect coupling of service consumers to the service implementation.

The level of coupling is also influenced by service granularity. Fine-grained operations increase coupling because a client needs to invoke multiple operations in a specific sequence to achieve a reasonable result. On the other hand, fine-grained operations limit unnecessary computations and are more amenable to reuse. As always, a proper balance has to be found, which is further complicated by the fact that granularity is a relative concept that can be precisely defined only in a specific context.
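The granularity trade-off can be made concrete with a small sketch. All names here (`FineGrainedAccount`, `transfer_out`, the bookkeeping fields) are hypothetical; the point is that fine-grained operations force the consumer to know and follow a call sequence, while a coarse-grained operation hides that sequence behind a single invocation.

```python
# Sketch (hypothetical names): fine-grained vs coarse-grained operations.

class FineGrainedAccount:
    def __init__(self):
        self._balance = 100
        self._log = []

    def withdraw(self, amount):
        self._balance -= amount

    def record(self, note):
        self._log.append(note)

class CoarseGrainedAccount(FineGrainedAccount):
    def transfer_out(self, amount, note):
        """One coarse-grained invocation replacing a client-driven sequence."""
        self.withdraw(amount)
        self.record(note)

# Fine-grained: the client must invoke two operations in the right order,
# which couples it to the service's internal workflow.
fine = FineGrainedAccount()
fine.withdraw(30)
fine.record("rent")

# Coarse-grained: a single call; the ordering knowledge stays in the service.
coarse = CoarseGrainedAccount()
coarse.transfer_out(30, "rent")

assert fine._balance == coarse._balance == 70
```

The coarse-grained variant also needs only one (potentially remote) invocation instead of two, which matches the performance argument made for components in Section 2.2.3.3.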

2.3.1.3 Service abstraction

Abstraction is all about hiding nonessential information. A proper level of abstraction allows us to deal with matters without being overloaded by all their subtleties. Being a design-time issue, abstraction is difficult to change once established. This is important especially when considering that an improper level of abstraction, whether too high or too low, can seriously affect the reusability of the designed software artefacts.

In service-oriented architectures, a higher level of abstraction is generally used than was customary in previous architectural styles. Service consumer developers need to have all the information necessary for using the service. On the other hand, the more information is available to them, the more consumer-to-implementation coupling can occur. And this does not include just the information available in service contracts; even informally available pieces of information about service implementation details can unintentionally cause such coupling. Service contracts designed with a proper level of abstraction in mind should provide all the information needed and at the same time stay fairly stable, allowing each side of the conversation to change and/or evolve without impacting the other.

The abstraction principle limits the tendency of other principles, namely reusability, composability and discoverability, to make service contracts very specific. When designing a service contract, one should carefully consider the appropriate level of abstraction in the following fields:
• technological details (language, platform, location, ...);
• functionality (what parts of the encapsulated functionality should be published);
• internal logic (algorithms, exception handling, ...);
• quality of service (what QoS indicators should be published).

2.3.1.4 Service reusability

Achieving a reasonable level of software reusability has always been a holy grail for software architects, designers and developers. Reusability reduces redundancy and thereby also development and maintenance efforts. But developing multipurpose software assets is a long-term investment. It requires extra effort, which runs counter to agility in the initial development phases.

Reusable software units are generally more complicated to design, run and evolve. Their design is complicated by the need to keep them context-agnostic and to predict possible usage scenarios, including various security settings, concurrency scenarios etc. Unclear ownership of reusable assets may even require changes in project governance, such as creating a specialised team responsible for their development and maintenance. Running reusable software assets requires a scalable environment. Evolution and versioning are complicated by a high number of dependencies. Reusable code also often suffers from additional computational overhead when compared with single-purpose assets. All these drawbacks can be outweighed by the positives only if reuse is actually achieved. Well-designed and well-developed reusable software modules can be of immense value. On the other hand, they constitute a potential single point of failure and performance bottleneck.

In the context of software services, reusability is the ability of a service to participate in multiple service compositions (Feuerlicht, 2007). Achieving service reusability is a matter of both the selection of technologies and careful design. Services built using widely adopted, standardized technologies have a better chance of finding consumers, as do services encapsulating a piece of functionality which can be reused across multiple business processes. As already mentioned, reusability is strongly related to granularity. Fine-grained services have a higher reuse potential than coarse-grained ones because of their narrower focus. Layered service design (see Figure 2.8) takes this fact into account and facilitates service reuse by separating out the low-level foundation and utility services, which naturally have the highest reuse potential. On the other hand, some very specific business services may not be reused at all.

2.3.1.5 Service autonomy

Autonomy, in relation to software, represents the ability of a program to self-govern and to act independently. Increased autonomy leads to higher reliability and behaviour predictability, due to a lower occurrence of unpredictable outside influences, and makes performance issues easier to deal with. Autonomy is both a design-time and a post-implementation, or run-time, issue. Unclear boundaries or overlapping responsibilities among individual modules affect their autonomy by design. Higher design-time autonomy allows a higher attainable level of runtime autonomy, which is determined by the level of separation of the individual execution environments. Full autonomy is not attainable in most real-world scenarios. Full physical isolation of execution environments requires extra investments in hardware, software and infrastructure, as well as extra effort in design and run-time administration. Legacy systems also often limit the attainable level of autonomy.

Reliability and predictability are important especially for software assets which are meant to be reused and composed, which is the case for services. Naturally, a high level of autonomy can be reached more easily for lower-level services (utility, entity). At the same time, the benefits of increased autonomy are highest for these types of services, because of their high potential for reuse and composition. The higher a service is in a service hierarchy, the less autonomy it tends to have, due to dependencies on other, potentially composed, services. The autonomy of a composed service is determined by the collective autonomy of all the services that participate in the composition.

For services to provide reliable, predictable performance, they need to have a significant degree of control over their underlying execution environment and resources. This is problematic especially for services encapsulating legacy systems. Their attainable level of autonomy is limited by the level of autonomy of the encapsulated system itself, which is usually out of the control of service designers and developers.

2.3.1.6 Service statelessness

State information is temporary data specific to a current activity. Performing any meaningful activity requires some kind of state information. Typically, this includes (Erl, 2007):
• session data, retained between continuous interactions (“conversation”) with a single client;
• context data, passed between parties involved in realizing the activity;
• business data, relevant to the currently executing business task.

State management can seriously limit the scalability and availability of a particular architecture. In attempts to solve this issue, past technologies and architectural styles have shifted the responsibility for state management among various client and server tiers. As a general rule, software assets such as services, which are meant to be accessed heavily and concurrently and to be reusable and composable, should minimize the amount of consumed resources by minimizing the amount of state information they manage and the duration for which they hold it.

Statelessness can be achieved at the cost of increased message payload, decreased runtime performance, or both. Client applications can include all necessary information in every input message, which makes message payloads larger and consequently makes message creation, transmission and parsing more demanding. Another option is to employ a state-deferral technique such as persistence or activation/passivation mechanisms, which can minimize resource locking but imposes additional overhead for both deferring and retrieving state information. Dependency on a specific state-deferral mechanism can affect the autonomy of a particular service.
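The first option, resubmitting all state with every call, can be sketched as follows. The `quote_total` operation and the cart format are hypothetical; the point is that because the full "conversation state" arrives as input, any replica of the service can handle any call, which is exactly what enables scalability.

```python
# Stateless sketch: every request carries all the data the service needs, so
# no session is pinned to a particular server instance.

def quote_total(items, discount_percent):
    """A stateless operation: the full conversation state arrives as input.

    items: list of (unit_price, quantity) pairs resubmitted by the client
    with every call.
    """
    subtotal = sum(price * qty for price, qty in items)
    return subtotal * (1 - discount_percent / 100)

# The client (re)submits its whole cart with each call ...
cart = [(10.0, 2), (5.0, 3)]
assert quote_total(cart, discount_percent=0) == 35.0
# ... which makes retries and load balancing trivial, at the cost of a larger
# payload on every request.
assert quote_total(cart + [(5.0, 1)], discount_percent=20) == 32.0
```

A stateful variant would store the cart between calls, making the service cheaper to call but binding the client to one instance and one session.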

Full statelessness is not always achievable, typically in long-running business processes that include interactions with other systems or even human intervention. Still, designing services as stateless is the preferable solution where possible.

2.3.1.7 Service discoverability

Discovery is a process of searching for and locating something. Discoverability, on the other hand, is a measure of the ability of a given item to be effectively discovered when needed. Generally, discoverability encompasses the following issues to be solved:
• choosing a metadata format and required/optional metadata items for describing the items;
• running a centralised searchable metadata storage and management environment;
• creating adequate metadata for individual discoverable items;

• making the created metadata available in the environment;
• ensuring the correct interpretation of metadata by potential clients, ideally both human and automated ones (typically by establishing a shared vocabulary).

Discovery of software services helps to avoid the accidental creation of redundant services or services that implement redundant logic. To be discoverable, services should be equipped with sufficient functional and QoS metadata to properly communicate their purpose, capabilities and limitations. The metadata need to describe not only the service as a whole, but also its individual operations. Standardized service contracts can be considered the most basic metadata items.

Discovery can take place at design-time, usually by humans, or at run-time, by accessing an automated registry without human intervention. Run-time discovery is still not commonly used on a large scale, due to interpretation issues caused by the lack of semantics, and due to trust issues. To facilitate service discovery, a SOA environment can provide an automated discovery mechanism, such as a service registry or directory. But even if this is not the case, e.g. when there are not enough services to justify it, making services discoverable still facilitates their reuse and is worth considering.

2.3.1.8 Service composability

A service can possibly represent any range of logic. But complex functionality can and should be provided by composing multiple services. And the other way round, services should be designed so that they can participate in service compositions even if there is no immediate composition requirement at the moment. This holds also for services which themselves are compositions of other services. This way, hierarchies of services are formed. Rosen et al. (2008) identified the general service hierarchy that should be present in any complex service-based enterprise environment (see Figure 2.8):

• business services, exposing high-level business functions to the enterprise;
• domain services, providing business-related services, specific to a single business domain (e.g. membership validation);
• utility services, providing common functionality across the enterprise (e.g. address book);
• integration services, exposing existing applications as services;
• external services, providing access to applications external to the enterprise (e.g. credit card validation);
• foundation services, providing fine-grained, technical services (e.g. logging).

Figure 2.8: Service hierarchy (Rosen et al., 2008)

A service composition consists of composition member(s) and a composition controller, i.e. a service orchestrating the invocations of the composition members. Note that a composition controller can at the same time play the role of a composition member in higher-level compositions. Client applications performing some specific logic by successive invocations of multiple services can also be seen as a kind of composition controller. However, taking parts of the application logic out of the SOA environment in this way is not an advisable practice. Client applications should play just the role of composition initiators, i.e. they should just initiate the logic encapsulated in a service composition.
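The controller/member relationship can be illustrated with a minimal sketch (in Python; all service names are hypothetical): a controller pipes a request through its members, and because it exposes the same callable interface, it can itself become a member of a higher-level composition.

```python
# Minimal sketch of hierarchical service composition (hypothetical services).
# A composition controller invokes its members in turn; because it exposes
# the same interface, it can itself act as a member at a higher level.

def validate_membership(data):   # a domain service
    return {**data, "member_ok": True}

def log_event(data):             # a foundation service
    return data                  # side effects omitted in this sketch

def compose(*members):
    """Return a controller that pipes a request through its members."""
    def controller(request):
        for member in members:
            request = member(request)
        return request
    return controller

enroll = compose(validate_membership, log_event)   # composition controller
business_process = compose(enroll, log_event)      # enroll is now a member

result = business_process({"user": "alice"})
```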

Composability can be seen as a form of reuse. Therefore, all reuse-enabling aspects, such as standardized service contracts or appropriate service granularity, are equally relevant for achieving good service composability. And as with any other reusable assets, service compositions impose high requirements on the performance and reliability of the encapsulated services. A single low-performing, malfunctioning or unavailable composition member can adversely affect the whole composition.

Because service compositions are internally distributed, they often utilize some transaction management mechanism, allowing them to expose atomic behaviour. However, as in any other distributed computing environment, fault handling is not a straightforward task in composite services. Traditionally, ensuring the atomicity, consistency, isolation and durability (ACID) properties in multiparty transactions is achieved by employing the two-phase commit protocol (Bernstein et al., 1987). But in service-oriented scenarios it is often not applicable. Firstly, it requires all the parties to support transactional behaviour, which is not always the case. Secondly, it assumes the existence of a single transactional coordinator, which is also not always feasible. And finally, achieving the isolation property prevents the participants from processing any other tasks until the transaction finishes, which can be very problematic in long-running processes. For all these reasons, fault handling in service-oriented scenarios is usually done using compensation, i.e. by invoking business logic which cancels out the effects of the previously failed transaction. Although compensation does not provide true transactionality – isolation is missing – it does not suffer from any of the listed issues and is sufficient in many cases.
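The compensation approach can be sketched as follows (a minimal Python illustration; the step names are hypothetical): each completed step registers a compensating action, and when a later step fails, the registered compensations are executed in reverse order.

```python
# Sketch of compensation-based fault handling in a service composition:
# each completed step registers a compensating action; when a later step
# fails, the compensations run in reverse order. Step names are hypothetical.

def run_with_compensation(steps):
    """steps: list of (action, compensation) pairs of zero-argument callables."""
    done = []
    try:
        for action, compensation in steps:
            action()
            done.append(compensation)
    except Exception:
        for compensation in reversed(done):  # undo effects of completed steps
            compensation()
        raise

log = []

def ship():
    raise RuntimeError("shipping service unavailable")

steps = [
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"),   lambda: log.append("refund card")),
    (ship,                                lambda: log.append("cancel shipment")),
]

try:
    run_with_compensation(steps)
except RuntimeError:
    pass
# log == ["reserve stock", "charge card", "refund card", "release stock"]
```

Note that, matching the discussion above, the effects of the first two steps were visible before being compensated; isolation is not provided.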

The importance of service composition as a technique for providing new functionality grows together with the number of services already built. Some advanced development environments, such as Triana (Majithia et al., 2004) or the CASA editor for the NetBeans IDE, allow creating service compositions just by using an intuitive graphical notation, without writing any code directly.

2.3.2 Technologies

In principle, SOA is not bound to any particular technology. This holds especially for the implementation details of both services and client applications. However, some common mechanisms still have to be established, enabling the basic service-related activities described in Section 2.2.2 (e.g. service discovery and service invocation). This section reviews the implementation-agnostic technologies commonly used for these purposes in traditional large-scale SOAs, together with some relevant Java-specific technologies.

2.3.2.1 Data formats and their handling

Data formats are crucial for ensuring service interoperability. By choosing an inappropriate data format, or one that potential service consumers are not familiar with, service providers can reduce the number of eventual service consumers.

Extensible Markup Language (XML)

XML (W3C, 1998) is a textual format for encoding electronic documents, developed and further maintained by the W3C Consortium. Technologies for both reading and writing XML-formatted documents are available for all the common platforms and programming languages.

XML is a universal data format and as such it inevitably has some weaknesses. Under specific circumstances, more suitable data representation formats can be found. The most often mentioned issues are verbosity, the absence of data types and semantics, difficult representation of non-hierarchical structures, unsuitability for large binary content and processing overhead (Bradley, 2006). However, the universality, standardization and ubiquitous support of XML have outweighed all these known issues and XML has become a de-facto standard for data representation and interchange whenever technology independence is a concern.

XML data binding

The ultimate aim of SOA is the provision of high-level, business-aligned software services. Such coarse-grained services naturally tend to exchange complex business-related data items with their clients. Low-level handling of such data items is certainly possible, but time-consuming, error-prone and nontransparent. This is where higher-level data-handling mechanisms usually come into play. XML data binding techniques can represent XML documents as language-native objects equipped with appropriate methods, allowing developers to deal with the data in a more comfortable way, even if XML is still used in the background.

Various language-specific techniques were developed to solve this issue, e.g. the Java Architecture for XML Binding (JAXB), allowing one to generate Java classes out of an XML Schema instance and then to easily convert an XML document into a corresponding object graph and vice versa (Ort, 2003). Similar technologies exist for other platforms too.
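The principle of XML data binding can be illustrated with a language-neutral sketch (here in Python; the element and class names are hypothetical) of the marshalling and unmarshalling that JAXB-generated classes perform in Java:

```python
# Conceptual analogue of XML data binding: an XML document is unmarshalled
# into a language-native object and marshalled back, so application code
# works with objects while XML is used in the background.
# The "customer" structure is hypothetical.
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Customer:
    name: str
    email: str

def unmarshal(xml_text):
    root = ET.fromstring(xml_text)
    return Customer(root.findtext("name"), root.findtext("email"))

def marshal(customer):
    root = ET.Element("customer")
    ET.SubElement(root, "name").text = customer.name
    ET.SubElement(root, "email").text = customer.email
    return ET.tostring(root, encoding="unicode")

c = unmarshal("<customer><name>Alice</name><email>a@example.org</email></customer>")
round_tripped = unmarshal(marshal(c))
```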

Figure 2.9: Java architecture for XML binding (Ort, 2003)

Service Data Objects (SDO)

In 2003, IBM and BEA Systems developed the first version of the Service Data Objects specification, which goes beyond the concept of XML data binding (IBM, 2003). SDO is a universal, language-neutral data abstraction layer. Data is organised as a graph of objects, composed of properties of either primitive

or object-level value types. Object graphs can be accessed and manipulated in both a static (strongly typed, developer-friendly) and a dynamic (loosely typed, metadata-driven) way.

Similar to JAXB, SDO allows developers to specify the object structure using XML Schema, to populate the object graph from XML and to serialize the object graph back to XML. But SDO can offer even more to SOA. It can abstract data sources. It comes with a runtime engine, a Data Access Service (DAS), which acts as an intermediary between applications and arbitrary types of data sources. DAS is responsible for retrieving data from the data source, for transforming the data to/from object graphs and for updating the data source when a client application asks it to. As a result, applications – services in the SOA context – work just with SDOs and do not access data sources directly. This takes away a possible source of negative coupling. What is more, SDO uses a disconnected data access pattern with an optimistic concurrency control model, which suits well the needs of service-based applications.

Figure 2.10: Service Data Objects and Data Access Service (Brodsky, 2006)

SDO seems to be the most promising approach to universal data handling and an emerging industry standard, supported by major software vendors (IBM, Oracle, SAP, Sybase, etc.). The only exception is Microsoft, promoting its own ADO.NET technology. Since 2007, the SDO specification has been maintained by the OASIS consortium, which further supports its significance. Implementations currently exist for Java, C, C++, PHP and COBOL, and support for other languages, including the Microsoft .NET platform, is planned.

2.3.2.2 Service foundations

The advent of service-orientation was connected with the introduction of a well-known and widely-supported set of specifications, usually referred to collectively as "Web services". The relationship between the two is important and they remain mutually influential to this day.

“Web services momentum will bring SOA to mainstream users, and the best-practice architecture of SOA will help make Web services initiatives successful.” (Natis, 2003)

Although SOA is an extremely broad concept, reaching beyond the sole design and development of IT services, the Web Services technological stack covers the fundamental technological aspects of SOA –

service invocation, service description and service discovery above all. As a result, Web Services are referred to as the most widespread application of the SOA concept (Papazoglou, 2008).

SOAP (Simple Object Access Protocol) defines an XML-based format for messages sent between a service provider and service consumer(s). Although it is not the only possibility, by far the most frequently used transport protocol for SOAP messages is HTTP(S). The specification was originally developed in 1998 by Microsoft. In 2001 it was submitted to the W3C and endorsed as a W3C recommendation two years later (W3C, 2003).
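The structure of a SOAP message can be illustrated with a minimal sketch (in Python; the getPrice operation and the example.org namespace are hypothetical, not part of the SOAP specification):

```python
# Minimal SOAP 1.1 envelope built with the standard library: a namespaced
# Envelope element containing a Body, which carries the operation payload.
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)

envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
request = ET.SubElement(body, "{http://example.org/stock}getPrice")  # hypothetical operation
ET.SubElement(request, "symbol").text = "ACME"

message = ET.tostring(envelope, encoding="unicode")
# message is the XML payload that would typically be POSTed over HTTP(S)
```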

WSDL (Web Services Description Language) provides a formal model and an XML-based notation for describing web services. A WSDL description is the only thing required to build a web service client. Although other schema languages can be used too, XML Schema is recommended and by far the most frequently used for specifying the types of parameters and return values of individual service operations. The specification was originally developed by IBM, Microsoft and Ariba in 2000. Later on, it was submitted to the W3C too. Version 2.0, approved as a W3C recommendation in 2007 (W3C, 2007), still lacks support in major runtime and development environments (Java, .NET, PHP, Ruby, …) at the time of writing.

UDDI (Universal Description, Discovery and Integration) is a specification for setting up an XML-based service registry, facilitating service publication and discovery. The original version of the specification was developed in 2000 by the UDDI consortium and since 2003 it has been managed and further developed by the OASIS consortium (OASIS, 2004). A network of publicly available UDDI registries was established by major software companies (Microsoft, IBM, SAP, …) in 2000 within the "Universal Business Registry" project. After five years of operation, the registries were shut down with mixed results. The technological feasibility of the idea was proven, but the project also uncovered many serious usability issues, such as offering irrelevant or outdated entries and – above all – the reluctance of businesses to establish business relationships without human participation (Krill, 2005). As a result, the latest UDDI specification versions came with features useful for building enterprise-wide service registries.

Figure 2.11: Web services architecture stack (W3C, 2004a)

These three specifications, together with the HTTP protocol, the XML language and the XML Schema specification, represent the most commonly used basis of the Web Services technology. Even though all the above-mentioned specifications were designed so that they could be used together, they exist in multiple versions and are vague at times or permit problematic constructs. This can affect the interoperability of a service. To solve these issues, the "Web Services Interoperability Organization" (WS-I) publishes so-called "profiles" – instructions for using specific versions of the specifications and imposing additional restrictions. By complying with a WS-I profile, one can achieve better service interoperability.

In addition to the core WS specifications, there are dozens of so-called WS-* specifications, in most cases maintained by working groups of the W3C or OASIS consortia. For a comprehensive list see (innoQ, 2007). They deal with specific issues not covered by the core WS specifications, making the Web Services technology more suitable to the needs of SOAs. However, these supplementary specifications are not widely supported and their usage can affect the interoperability of the resulting service.

2.3.2.3 Service integration

Service integration is the process of allowing individual services to cooperate by invoking each other's operations. It includes data format transformations, protocol bridges and service call routing. Enabling service integration is a crucial issue for all service-based platforms. This section presents techniques and technologies used for solving this issue in traditional SOAs, with a special focus on the Java environment.

Enterprise Application Integration

Many SOA-specific techniques for service interaction and composition have their roots in the techniques developed for Enterprise Application Integration (EAI) systems. EAI represents the general business-driven need of making multiple diverse enterprise applications work together, usually without the possibility to modify the applications themselves. This need emerged long before service-orientation gained attention, as a natural consequence of company acquisitions and mergers as well as of the gradual historical spread of specialised enterprise applications for managing customers, inventory, human resources, attendance, accounting, internal communications, etc. The typical motivations for employing EAI solutions include the need to integrate information from multiple information systems, the demand for vendor independence and the need to create a common facade for multiple heterogeneous software systems.

“EAI is the unrestricted sharing of data and business processes among any connected applications and data sources in the enterprise.” (Linthicum, 2000)

Originally, system integration used to be solved on an ad-hoc basis by developing sets of unique, single-purpose applications, allowing two or more specific systems to cooperate in the desired way. This inevitably led to hardly manageable, tangled structures with a high number of point-to-point connections, since the number of such connections can grow quadratically with the number of individual applications. Both industry and academia worked hard to tackle this issue in a more systematic and elaborated way. A theoretical background for application integration was established and new, well-founded system integration platforms started to appear around the year 2000.
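The scale of the problem can be illustrated with a small calculation (a Python sketch):

```python
# Channels needed to fully interconnect n applications: point-to-point
# integration requires n*(n-1)/2 channels, a central broker only n.

def point_to_point(n):
    return n * (n - 1) // 2   # every pair of applications gets a channel

def with_broker(n):
    return n                  # every application connects just to the broker

comparison = {n: (point_to_point(n), with_broker(n)) for n in (5, 10, 20)}
# comparison == {5: (10, 5), 10: (45, 10), 20: (190, 20)}
```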

Technically, EAI can be carried out on various levels, corresponding to the layers present in modern applications (Linthicum, 2000):

• data level, where data is extracted from one data store and used to update another data store;
• method level, where methods existing in one application are called by another application;
• application interface level, where applications are accessed using their published interfaces;
• user interface level, where the application user interface is accessed programmatically (also known as "screen scraping").

No general consensus on a single ideal approach, methodology or choice of technologies for EAI has been reached. Various approaches have been developed, suitable for various scenarios, based on various architectures, communication topologies and communication models, being either technology-dependent or technology-neutral, with various levels of scalability, adaptability or extensibility. See (Král, 2000), (Morgenthal, 2001), (Cummins, 2002) or (Hohpe, 2004) for both complex approaches and specific design patterns suitable for typical EAI scenarios. However, mature EAI solutions usually encompass the following:

• a centralized broker, taking care of security, access and communication;
• an independent data model and format, facilitating conversions between various application-specific data models and formats;
• a connector model, enabling each application to communicate with the centralized broker and translating between the application-specific data model and the independent data model;
• a system model, defining the APIs, data flows and rules individual applications have to comply with.

A centralized broker is a middleware component reducing the number of necessary point-to-point connections by eliminating direct communication between applications. Each application has to be able to communicate just with the broker. Brokers are typically built on top of some message-oriented middleware (MOM), which takes care of the message passing itself. Message brokers typically extend the functionality provided by the MOM with advanced features such as communication through various protocols, message transformation and enrichment, rule-based message routing, etc. (Gorton, 2006). The concept of a message broker can be further extended to a process broker, encapsulating the process logic for connecting individual applications and executing the pieces of integration logic using a built-in orchestration engine (Johannesson, 2000).
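The broker pattern described above can be sketched as follows (an illustrative Python sketch; the application names, data formats and routing rule are hypothetical): connectors translate between application-specific formats and an independent canonical model, while the broker performs rule-based routing.

```python
# Sketch of a message broker: connectors translate between application-specific
# data formats and an independent (canonical) data model; the broker routes
# canonical messages by simple rules. All names are hypothetical.

class Broker:
    def __init__(self):
        self.routes = []      # (predicate, target application) pairs
        self.connectors = {}  # app name -> (to_canonical, from_canonical)

    def register(self, app, to_canonical, from_canonical):
        self.connectors[app] = (to_canonical, from_canonical)

    def route(self, predicate, target):
        self.routes.append((predicate, target))

    def send(self, source, message):
        canonical = self.connectors[source][0](message)   # into canonical model
        delivered = []
        for predicate, target in self.routes:
            if predicate(canonical):                       # rule-based routing
                delivered.append((target, self.connectors[target][1](canonical)))
        return delivered

broker = Broker()
broker.register("crm", lambda m: {"id": m["CustomerId"]}, lambda c: c)
broker.register("billing", lambda m: m, lambda c: {"cust": c["id"]})
broker.route(lambda c: "id" in c, "billing")

out = broker.send("crm", {"CustomerId": 42})
# out == [("billing", {"cust": 42})]
```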

Figure 2.12: Point-to-point and broker-based architecture (Bakker, 2005)

The usage of connectors and independent data models, such as EDI for general business data or SWIFT for banking, allows application abstraction. Virtually any application can be integrated, and just the appropriate connector has to be redesigned when an application is modified or replaced with another one. This way, EAI solutions exhibit some properties similar to those of SOA – they promote the principles of abstraction, statelessness, loose coupling and composability. In fact, many SOA-specific techniques for service interaction and composition are just adapted and/or standardized versions of the techniques introduced by EAI engines. The main reason why EAI solutions

are not commonly used in SOAs directly is that existing EAI engines are in most cases proprietary, extremely complex and very expensive pieces of software (Kraunelis et al., 2002; Dante, 2005).

There is also a significant philosophical difference between EAI and SOA. Both aim to support enterprise business processes with the existing application portfolio, but they achieve it differently. EAI exposes the existing functionality through integration services, effectively exposing an existing application portfolio as an enterprise business model. In contrast, SOA hides the existing applications and exposes a set of application-independent business services instead, projecting an enterprise business model onto the existing application portfolio (Lublinsky, 2007).

Enterprise Service Bus

Enterprise Service Bus (ESB) is a concept introduced by Sonic Software in 2002 (Stamper, 2005). It represents an abstract messaging architecture, reducing the number of point-to-point communication channels by introducing a common message broker middleware. This basic idea is exactly the same as in EAI message brokers. The main difference between ESB and EAI is that EAI message brokers were designed as monolithic centralized components, whereas in an ESB the single communication bus is purely logical: even though it still provides a centralized management facility, the infrastructure can be distributed across the network. This makes ESB more reliable and scalable. ESB represents a lightweight, flexible, standards-based, platform-independent, distributed alternative to the traditional monolithic, bus-based, proprietary EAI solutions.

“An ESB is a standards-based integration platform that combines messaging, web services, data transformation, and intelligent routing to reliably connect and coordinate the interaction of significant numbers of diverse applications across extended enterprises with transactional integrity.” (Chappell, 2004)

Existing ESB definitions, including the one just mentioned, use lists of expected capabilities to define the meaning of the term. Such definitions obviously leave a lot of space for individual approaches to achieving the specified capabilities. But it is hard, and even undesirable, to define such an abstract architectural pattern more precisely. To provide better guidance, various authors have published lists of more specific criteria an ESB should satisfy. For example, according to Rademakers (2009), the core functionalities every ESB should provide are:

• location transparency, i.e. ESB provides a central platform for communication between applications, either synchronous or asynchronous, without coupling message sender to message receiver;
• transport protocol conversion, i.e. ESB is able to seamlessly integrate applications using various transport protocols such as HTTP, FTP, SMTP, JDBC, …;
• message transformation, i.e. ESB provides support for transforming messages on their route between sender and receiver;
• message routing, i.e. ESB is able to determine the ultimate destination of an incoming message based on its content, origin or other attributes;
• message enhancement, i.e. ESB is able to add missing information to transferred messages;
• security, i.e. ESB provides authentication, authorization and encryption services;
• monitoring and management, i.e. being a critical piece in a system landscape, ESB has to be managed and monitored.

In addition to these core functionalities, current ESB implementations often come equipped with various extensions, trying to gain a competitive advantage. Very common are support for transactions, support for

EAI patterns (Hohpe, 2004) or the incorporation of some service orchestration facility such as a business rules engine or a BPEL engine (see Section 2.3.2.4). Other possible sources of a competitive advantage are the available adapters (some authors use the term gateway instead) – specialised pieces of software allowing applications that do not directly support standardized access mechanisms to be connected to the ESB as well.

Figure 2.13: Basic ESB architecture (Rosen et al., 2008)

What makes an ESB different from a simple message bus is a set of built-in facilities. To increase its flexibility and adaptability to the needs of various usage scenarios, ESB takes the "configuration over coding" approach. The provided facilities are configurable administratively, which facilitates easy reconfiguration and promotes loose coupling between services on the bus. ESB provides a sophisticated environment for event-driven system integration and is currently a preferred approach to system integration. ESB-based integration solutions are provided by major software vendors (IBM, Microsoft, Oracle, TIBCO, Sonic, …) and there are also many open-source projects available (Open ESB, Apache ServiceMix, Mule, JBoss ESB, Petals ESB, …).

Although integration is usually not mentioned as a primary SOA aim or principle, most service-oriented systems are created on top of an existing enterprise application portfolio to avoid complete redevelopment and protect former investments. Therefore, an ESB is often used together with Web services, which provide a transport layer, to implement the messaging layer of an SOA.

Java Business Integration

As already mentioned, the concept of ESB is inherently quite vague and a lot of space is left for the creativity of the designers of its implementations. Standardization applies to service description and service interaction mechanisms, but the service container itself and the way individual services are deployed into the container remain proprietary. Commitment to a specific ESB implementation can therefore still lead to vendor lock-in, as it used to be with EAI platforms. The Java Business Integration (JBI) specification was developed to solve this issue.

The JBI 1.0 specification was approved as JSR 208 within the Java Community Process (JCP). It defines a Java-based ESB, running in a single JVM instance. More specifically, it standardizes the container that hosts services, the binding between services, the interactions between services, and how services and bindings are loaded into the container.

Figure 2.14: Top-level view of the JBI architecture (JSR 208)

The JBI architecture adopts the idea of a microkernel. The JBI container itself provides just the essential capabilities of message delivery, provided by the normalized message router, and infrastructural services. To provide a required functionality (XSLT, BPEL, rules engine, EJB, Java beans, routing engine, …), one has to plug in a specific service engine (SE). To support a required communication protocol (HTTP, SMTP, JMS, LDAP, file, JDBC, CORBA, XMPP, …), one has to plug in a specific binding component (BC). This way, JBI represents a pluggable architecture of a container of containers – SEs and BCs are installed into the JBI runtime and serve in turn as containers for application deployment. Both SEs and BCs should be portable, i.e. they should work in any JBI container. Unfortunately, this is more an exception than a rule because of implementation lapses and incompatible versions of libraries.

JBI applications are called service assemblies. A service assembly contains one or more interlinked service units, each defining service consumer(s) and/or service provider(s) for a specific SE or BC. The way a particular SE or BC is configured by a service unit is out of the scope of the JBI specification. Service units are therefore bound to a specific SE or BC and generally cannot be deployed to another one.

JBI represents a message-based approach to system integration. WSDL-described services communicate with each other by sending XML-formatted normalized messages, which eliminates the possibility of coupling between message sender and message receiver. The supported message exchange patterns (MEPs) are in-only, robust in-only, in-out and in-optional-out, as defined in the WSDL 2.0 specification (W3C, 2007). To ensure reliable message delivery, the communication within the JBI container is asynchronous, based on message queues.
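The asynchronous, queue-based in-out exchange can be sketched as follows (an illustrative Python sketch, not JBI API code): consumer and provider never call each other directly; both talk only to queues, which is what decouples them.

```python
# Sketch of an asynchronous in-out message exchange over queues, in the
# spirit of the JBI normalized message router: the consumer puts an "in"
# message on a queue and later receives the "out" message, without ever
# invoking the provider directly. All names are hypothetical.
import queue
import threading

in_queue, out_queue = queue.Queue(), queue.Queue()

def provider():
    msg = in_queue.get()                  # receive the "in" message
    out_queue.put({"echo": msg["body"]})  # produce the "out" message

t = threading.Thread(target=provider)
t.start()

in_queue.put({"body": "hello"})           # consumer sends the "in" message
reply = out_queue.get()                   # ...and waits for the "out" message
t.join()
# reply == {"echo": "hello"}
```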

Both commercial and open-source JBI implementations equipped with dozens of SEs and BCs are available. Commercial JBI implementations usually appear integrated into larger integration solutions (e.g. by Oracle or TIBCO). On the other hand, open-source projects are generally devoted just to providing a JBI environment (e.g. Apache ServiceMix, OpenESB or OW2 Petals). To provide enterprise-level reliability and scalability, JBI implementations very commonly bypass the limitation to a single JVM and provide some proprietary support for container clustering.

The JBI 2.0 initiative (JSR 312) was started in 2007, aiming to improve some JBI 1.0 shortcomings and clarify the relation of JBI to Java EE, SCA and OSGi. However, some of the committee members (namely BEA and IBM) blocked the process because of their commitment to SCA.

Service Component Architecture

Service Component Architecture (SCA) is a set of specifications developed since 2005, together with the accompanying SDO specification (see Section 2.3.2.1), by a group of commercial software vendors labelled the "Open SOA Collaboration". In 2007, the official set of version 1.0 specifications was published (Open SOA, 2007) and the initiative was turned over to the OASIS consortium to reach formal industry standardization. At the time of writing this thesis, the version 1.1 specifications are under active development.

SCA sits on the border between integration and composition, because it is capable of both. It describes a language- and protocol-independent, metadata-driven service composition model for building distributed service-based applications. It also provides a programming model for creating service-oriented components in various programming languages and a way to describe how such components should be assembled to provide a required functionality.

A unit of deployment is a “composite”, a logical grouping of multiple interlinked components (see Figure 2.15). Components themselves deal with business logic. They offer their functionality through services with defined interfaces, which may be either published for usage outside the composite or consumed just by other components of the composite. SCA supports various MEPs, and both synchronous and asynchronous service invocation. Besides service invocations, SCA components can communicate also by means of event throwing and processing in the publish&subscribe style. A component may have some configurable properties and may declare dependencies on external services. Dependency resolution is accomplished externally in a “dependency injection” style. SCA supports hierarchical service composition by nesting of composites – a composite may become an implementation of a component of a higher-level composite. The SCA Policy framework, based on WS‑Policy and WS‑PolicyFramework specifications, allows composites, components, services and even individual operations to declaratively specify their QoS requirements.
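The assembly model, with references resolved in a dependency-injection style, can be sketched as follows (an illustrative Python sketch, not SCA code; the component and service names are hypothetical):

```python
# Sketch of SCA-style assembly: components declare references (dependencies)
# that are resolved externally by the composite, dependency-injection style.
# Component and service names are hypothetical.

class MembershipValidator:                 # component implementation
    def validate(self, user):
        return user == "alice"

class EnrollmentService:                   # component with a reference
    def __init__(self, validator):         # the reference is injected externally
        self.validator = validator

    def enroll(self, user):                # service offered by the component
        return "enrolled" if self.validator.validate(user) else "rejected"

# The "composite" wires component references together and promotes a service.
validator = MembershipValidator()
enrollment = EnrollmentService(validator)  # a <wire> in SCA terms

status = enrollment.enroll("alice")
# status == "enrolled"
```

The design point mirrored here is that EnrollmentService never instantiates its dependency itself; swapping the validator implementation requires changing only the assembly, not the component.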

Figure 2.15: SCA elements and their relations (Open SOA, 2007)

SCA specifications are quite vague about the realization of the proposed concepts, which can be considered both a strength and a weakness. They permit any way of specifying a service contract as well as any way of implementing it. In consequence, alignment between the two is generally not assured and one may have difficulties with interpreting the contract specification. What is more, the way components communicate is left up to their developers. Composing an SCA composite out of heterogeneous components can therefore be very demanding. SCA components may be implemented using a variety of programming languages, including but not limited to Java, C++, C, COBOL, PHP, BPEL, XQuery and XSLT. In reality, however, the focus of the specification committee is currently primarily on Java and BPEL. Components may be distributed, i.e. a single composite may span multiple physical machines. Although it is not a strict requirement, the SCA environment is well suited for using SDOs to represent data.

SCA is more a fresh approach to building composite service-based applications than a mediating middleware like an ESB, which brings existing applications to participate in an SOA. However, if necessary, legacy applications can still be integrated into an SCA environment using a wide range of common technologies (e.g. WS, EJB, JMS, JCA, RMI, RPC, CORBA, …). This way, SCA satisfies many of the criteria imposed on an ESB. But the notion of the bus is missing, which makes SCA and ESB architecturally different concepts. In principle, SCA can be used for service composition in an ESB-based environment.

SCA is currently supported by major software vendors, including IBM, Oracle, Red Hat, SAP, Siemens, Sybase and TIBCO. The only exception is Microsoft, promoting its WCF technology instead. The reference SCA implementation is being developed within the open-source Apache Tuscany project. Besides this one, there is again a whole list of both commercial and open-source SCA-compliant products (IBM WebSphere, Oracle Fusion Middleware, TIBCO ActiveMatrix Service Grid, OW2 Frascati, Newton, Fabric3, …).

Enterprise portals

All the above-mentioned techniques facilitate service integration on the functional level. Portals represent a different approach – a web-based, presentation-oriented aggregation of content, possibly originating from various sources. In this way, portals have the potential to offer a single point of entry to multiple applications, with the added value of single sign-on authentication, access control, customization, personalization or searching. In enterprise scenarios, enterprise portals can provide a single place through which users interact with the underlying business processes, which in turn interact with the underlying enterprise systems portfolio.

Portals should be able to aggregate almost any kind of content. But to make Java-based portal content providers more uniform and to support advanced capabilities such as user sessions, personalization and security, the portlet specifications were developed under the Java Community Process – version 1.0 as JSR 168 and later on, version 2.0 as JSR 286. These specifications define a portlet as a pluggable component, running within a portlet container either locally or remotely, processing user interactions and providing a specific piece of content – a fragment – to be included as a part of a portal page. The resulting portal pages are collections of portlet-generated markup fragments presented side-by-side without mixing their functionality. A similar portlet technology is available in the .NET environment too.

As an extension to the platform-dependent portlet specifications, the concept of a presentation-oriented web service was introduced by the Web Services for Remote Portlets (WSRP) specification, developed by the OASIS consortium (2003). The basic idea of WSRP is that portals use the widely supported and platform-neutral WS technology stack to interact with remotely-running portlets through well-defined interfaces. Consequently, portlets become highly interoperable and reusable pieces of software, much like ordinary web services. And the other way round – WSRP gives web services the possibility to come with a user interface. WSRP can be used together with JSR 168, JSR 286 or .NET portlets (see Figure 2.16).

Figure 2.16: A WSRP-enabled portal, aggregating markup from remote portlets (Castle, 2005)

As a result of more than a decade of existence of enterprise portals, there are currently dozens of them available. Most are commercial applications provided by major software vendors (IBM, JBoss, Microsoft, Oracle, SAP, TIBCO,…), but one can also find open-source alternatives (Apache Jetspeed, OpenPortal, Hippo). According to the list2 provided by Wikipedia, the vast majority of available enterprise portals are based on the Java EE technology and, as a natural choice, almost all of them support either JSR 168 or JSR 286.

2.3.2.4 Service composition

Service composition is the process of combining and linking multiple services to form a new, aggregate service, providing functionality that did not previously exist. As the number of available services grows during the existence of a particular system, service composition should gradually become the most important way of providing new functionality.

The terms “orchestration” and “choreography” are often used to describe the two principally different viewpoints, or approaches to defining service compositions. Barros et al. (2006) define them as follows:
• Choreography captures collaborative processes involving multiple services where the interactions between these services are seen from a global perspective.
• Orchestration deals with the description of the interactions in which a given service can engage with other services, as well as the internal steps conducted by the service between these interactions (e.g. data transformations).

2 http://en.wikipedia.org/wiki/Enterprise_portal#Enterprise_portal_vendors (cited 2010-04-29)

Orchestration is an imperative approach, where a process, driven by a single controller, is defined. Such a process is often subject to automated execution by an orchestration engine. On the other hand, choreography declaratively describes a protocol for cooperation/conversation of multiple autonomous peers. The choreography itself is therefore not directly executable. It can be realized by defining or deriving an orchestration process for each peer involved. Nevertheless, according to Papazoglou et al. (2007) such a sharp distinction between orchestration and choreography is rather artificial, and the consensus is that they should coalesce in a single language and environment.

When considering the implementation of a service composition, there are various ways to go, including the following (Rosen et al., 2008):
• Programmatic implementation using a general-purpose programming language. This approach suffers from multiple drawbacks. General-purpose programming languages are rarely suited for managing transactions or asynchronous invocations. Hard-coded composition logic is hard to modify and the resulting compositions tend to be tightly coupled.
• SCA composition (see Section 2.3.2.3). Composition can be achieved in the SCA environment both by orchestration, performed by BPEL-based components within a single composite, and by hierarchical nesting of composites.
• Event-based implementation using a publish/subscribe intermediary. The resulting composition is very flexible, but suffers from the lack of a composite service instance and its execution context. The autonomous nature of event emission and processing by the participating services makes it very difficult to implement any form of support for transactions.
• Orchestration using an orchestration engine. Specifically tailored orchestration languages, such as BPEL, allow one to design the orchestration logic conveniently in a visual editor. Orchestration engines are able to take care of many important concerns by themselves, including service instantiation, asynchronous method invocations, state management, load balancing or support for transactions. This again makes designing the orchestration logic a lot simpler. Additional flexibility can be achieved by complementing the orchestration engine with a business rules engine, which can provide a user-configurable decision support facility for orchestration processes (Rosenberg, 2005).
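As a concrete illustration of the orchestration approach, the following fragment sketches the skeleton of a WS-BPEL 2.0 process. The partner link, operation and variable names are hypothetical; a real process would additionally declare partner link types, variables and fault handlers.

```xml
<process name="OrderProcess"
         targetNamespace="http://example.org/order"
         xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable">
  <sequence>
    <!-- instantiate the process upon an incoming client request -->
    <receive partnerLink="client" operation="placeOrder"
             variable="orderRequest" createInstance="yes"/>
    <!-- synchronously invoke a partner service -->
    <invoke partnerLink="payment" operation="charge"
            inputVariable="orderRequest" outputVariable="paymentResult"/>
    <!-- return the result to the client -->
    <reply partnerLink="client" operation="placeOrder"
           variable="paymentResult"/>
  </sequence>
</process>
```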

The history of orchestration languages is strongly related to the history of Web services. Shortly after the WS stack was introduced around 2000, promoting interoperable software services, the first Web service orchestration languages, such as XLANG by Microsoft, started to appear. Unfortunately, the headlong rush of various companies and initiatives to fill the gap with their own products led to a plethora of mutually incompatible, competing specifications, aimed at various aspects of service collaboration (see Figure 2.17). Within a few years, the situation became a bit clearer, because some of the specifications were merged, others were abandoned or superseded and, most importantly, further development of the successful specifications was passed to respected standardization bodies, such as W3C and OASIS, or to large industry consortia such as BPMI and OMG. Nowadays, the situation is still somewhat confusing, but the Web Services Business Process Execution Language (WS‑BPEL, or just BPEL for short), maintained by the OASIS consortium (2007), is the preferred service orchestration language for most vendors (Mendling, 2008), despite its known limitations such as the absence of a standardized graphical notation or the problematic incorporation of human tasks.

Figure 2.17: Genealogy of standards in business process modelling and execution (Ju, 2006)

Besides a full-fledged orchestration engine, SOA environments usually offer also some kind of “lightweight” orchestration, based on enterprise integration patterns (EIPs) (Hohpe, 2004). In some simple scenarios, low-level routing and transformations using a small set of interlinked EIPs may be sufficient. However, it is not advisable to use EIPs for implementing complex business logic, because the result would be hardly manageable and difficult to understand.
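The flavour of such pattern-based “lightweight” orchestration can be conveyed by a minimal, self-contained sketch of the content-based router pattern (Hohpe, 2004). The function and channel names below are made up for illustration and do not correspond to any real ESB API.

```python
# A minimal sketch of the "content-based router" enterprise integration
# pattern: a message is delivered to the first channel whose predicate
# matches it; unroutable messages go to an optional default channel.

def content_based_router(message, routes, default=None):
    """Deliver a message to the first channel whose predicate matches it."""
    for predicate, channel in routes:
        if predicate(message):
            channel(message)
            return
    if default is not None:
        default(message)
    else:
        raise ValueError("no route accepts the message")

# Channels are modelled simply as lists collecting the delivered messages.
orders, invoices, dead_letter = [], [], []
routes = [
    (lambda m: m.get("type") == "order",   orders.append),
    (lambda m: m.get("type") == "invoice", invoices.append),
]

content_based_router({"type": "order", "id": 1}, routes)
content_based_router({"type": "invoice", "id": 2}, routes)
content_based_router({"type": "unknown"}, routes, default=dead_letter.append)
```

A real EIP toolkit would chain such routers with transformers and endpoints; the point here is only that a few lines of routing logic suffice for simple scenarios, while complex business logic expressed this way quickly becomes unmanageable.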

Orchestration, or service composition in general, is closely related to business process modelling. But it is important not to confuse the two concepts. Service compositions can be modelled using visual notations similar to those used for business processes, and they can even be based on the outcomes of a previously conducted business process modelling. Still, service composition remains a lower-level task, which occurs at a different phase of software development and requires different skills.

2.3.3 Development style

We currently have best practices for object-oriented analysis and design (OOAD) as well as widely recognised graphical notations, such as the Unified Modeling Language (UML), and proven software development methodologies tailored for object-oriented development, such as the Unified Process (UP) and Model-Driven Architecture (MDA). Service-orientation and object-orientation exhibit some similarities, but there are also fundamental differences between the two, preventing the direct usage of object-oriented techniques and processes in a service-oriented scenario. For example, the traditional roles of system analysts, developers and testers are not sufficient for SOA development. Business analysts, responsible for business modelling, are also needed, and the business modelling phase has to occur somewhere in the SOA development process. The issue of service-oriented analysis and design, as opposed to OOAD, was probably tackled for the first time in (Zimmermann et al., 2004).

As service-orientation has become the mainstream enterprise software engineering style in recent years, various techniques have been developed, applicable in various phases of the service engineering process. Reference SOA architectures, such as the one published by The Open Group consortium (2009),

provide a general blueprint for a SOA-based enterprise architecture, so that the enterprise architect can use it as a template to be instantiated during each individual project. Graphical notations were developed or adopted for modelling purposes – most notably the SoaML UML profile (OMG, 2009) for architectural modelling of service-based systems and the Business Process Modelling Notation (BPMN) for business process modelling (BPMI, 2004), both currently maintained by the OMG consortium. Best practices for service design and development were formulated as design patterns. For example, (Erl, 2009) provides a list of 85 SOA patterns of various kinds.

To provide guidance in the process of developing a SOA, methodologies covering the whole process of analysis, design, development, testing and deployment are available. The traditional object-oriented methodologies are not directly applicable to service-oriented development because of inappropriate or missing activities and associated deliverables. The Service-oriented Modelling and Architecture (SOMA) methodology was developed by IBM to tackle this issue (Arsanjani, 2004). The methodology is still being actively developed, together with specific tooling; however, neither the methodology nor the associated tools are publicly available. Another popular methodology is the Mainstream SOA Methodology (MSOAM), whose principles were introduced in (Erl, 2005). It provides a generic set of approaches and processes for delivering SOA-based projects. It is publicly available and takes a similar approach to the UP in that it is intentionally designed to be quite generic, so that it can be easily customized and extended for specific needs. The latest addition to this list is the Service-Oriented Modelling Framework (SOMF), created and popularized by Bell (2008), providing both process guidance and its own graphical modelling notation. It supports multiple levels of abstraction (conceptual, analysis, design and architecture) and specifies the relationships between the artefacts defined at each level. Whether SOA is suitable for applying agile software development methodologies remains unclear; no general consensus has been reached so far. Arguments for both positions were summarized by Elssamadisy (2007).

Generally, several clearly differentiated approaches to service identification and design exist (Terlouw, 2009):
• top-down, based on complete business analysis and business decomposition;
• bottom-up, based on application-centric requirements;
• meet-in-the-middle, combining the top-down and bottom-up approaches;
• middle-out, creating both low-level and high-level artefacts based on the reference architecture.

The top-down approach is ideal from the theoretical point of view and has a high probability of the effort being relevant to the enterprise. But it is also the riskiest and most expensive one, because it suffers from late production of useful deliverables. The bottom-up approach, on the other hand, typically results in a set of isolated, ad-hoc designed services that do not deliver the benefits of SOA because of missing business alignment, and it suffers from high maintenance and rework efforts over the long term. The two compromise approaches are generally considered better. Meet-in-the-middle starts with a top-down analysis and, once it has progressed sufficiently, the bottom-up analysis and development is started too, taking the existing results of the business analysis into account. Periodic reviews are performed to compare the design of current services with the current state of the business model, and services are reworked when needed. The middle-out strategy starts with developing a SOA reference architecture. The reference architecture is a stable centre, allowing independent iterative development of both low-level and high-level artefacts.

A nice thing about service-orientation is that services can be developed and deployed incrementally, as time, resources and policies allow – not necessarily all at once. The “start small” approach to building a

SOA, as it is often called, is currently recommended by many authorities because of lower risks, lower barriers and lower initial resource requirements (Pizette et al., 2009).

SOA development is not limited to adding new services to the portfolio and orchestrating them. SOA is designed for flexibility – for being able to adapt to changing requirements. It can not only support the business as it evolves, but it also has the potential to contribute to the optimization of the business. To realise this potential, the system can and should be continuously monitored, the monitoring results should be analysed, and the system should be reconfigured, together with the corresponding real business processes, whenever beneficial. Such a permanent optimization cycle is in line with the principles of the business-driven development methodology (see Figure 2.18).

Figure 2.18: Activities involved in business-driven development (Mitra, 2005)

2.3.4 Deployment

In principle, SOA does not impose any restrictions on the technological environment it is deployed to (software platform, operating system, hardware, network infrastructure,…). Unlike monolithic systems, service-based applications are by design well suited for distributed deployment. However, most enterprise-level SOA-based applications keep the traditional deployment model of being run within the closed environment(s) of the system owner(s). The distribution, if any, rarely crosses the border of this environment. Typically, both the individual services and the supporting infrastructure, such as an ESB or the service registry, are deployed onto hardware owned by the system owner(s), equipped with enterprise-level operating systems and application servers, whose maintenance requires skilled specialists. This deployment model is supported by major software vendors, promoting their “SOA platforms” (IBM, JBoss, Microsoft, Oracle, SAP,…).

Within the last few years, hardware virtualization has become an important topic of discussion, primarily in the context of server machines. The primary reason for this is that the capabilities of current hardware are rarely fully utilized and most of the time it stays idle. In 2005 and 2006, Intel and AMD included support for hardware-assisted virtualization in mainstream CPUs, making effective hardware virtualization available to anyone. Running several virtual machines on a single piece of physical hardware results in lower hardware costs, lower energy consumption and lower space requirements, and it facilitates maintenance. For all these reasons, deploying SOA elements onto virtualized machines is an advisable and increasingly common practice too. However, it does not contradict the previously stated assertion about the closed environment(s) of the system owner(s) in any way.

2.3.5 Evaluation

Every technology delivers maximum value only when used in the right way, in the right situation, for the right purpose and with realistic expectations. Awareness of both the strengths and weaknesses of a particular technology helps to achieve this.

SOA strengths:
• Reduction of the business–IT gap because of its business-drivenness.
• Faster and cheaper reaction to changing requirements, even during development. Service compositions can be reconfigured easily because orchestrating processes are relatively easy to change.
• Replaceability of service implementations. Properly designed contracts prevent vendor lock-in.
• Accountability. Clearly defined service contracts lead to clear responsibilities.
• Reusability. Properly designed lower-level services have high potential to be reused.
• User-oriented service interfaces facilitate system specification.
• Manageable, controlled development because of system decomposition into clearly defined services.
• Services can be developed, deployed and updated incrementally, not necessarily at once.
• Channel neutrality. SOA is agnostic on transport protocols, message formats and data models.
• Scalability. Stateless services generally scale well.

SOA weaknesses:
• Processing overhead. A high level of abstraction always introduces some processing overhead, and SOA is no exception.
• Latency, caused by mediation and message transformations.
• The complexity of service design and implementation requires extra effort and investment.
• Problematic integration with legacy systems. Simple wrapping of legacy system APIs is not enough, because they were typically not designed with the principles of service-orientation in mind (appropriate granularity, statelessness,…).
• Achieving business–IT alignment typically forces some changes in the business, which is rarely welcome.
• SOA is not suitable for real-time systems or systems interchanging large amounts of data.
• Methodologies and best practices are still under investigation (Feuerlicht, 2007).
• Mature development tools and environments (CASE) are not generally available.

2.4 Web 2.0 architectures

The first occurrence of the term “Web 2.0” dates back to the 1990s, when Darcy DiNucci used it in an article envisioning the future of the web (DiNucci, 1999). But the term did not attract any extraordinary attention until the “Web 2.0 Conference” was held in 2004, using it as a general label for the new, spontaneously evolved trends apparent on the web. Consequently, Tim O’Reilly in his famous article (O’Reilly, 2005) summarized the features common to the applications of the innovative companies which not only had survived the bursting of the “dot-com bubble” in 2001, but which seemed to be stronger at that time than ever before. The article quickly became the de-facto Web 2.0 manifesto and the identified features started to be considered the characteristics distinguishing the new generation of the web:
• the web as platform;
• harnessing collective intelligence;

• data is the next Intel inside;
• end of the software release cycle;
• lightweight programming models;
• software above the level of a single device;
• rich user experiences.

Web 2.0 is both a usage and a technology paradigm. It is a collection of technologies, business strategies and social trends, enabled by an unprecedented growth in the global numbers of personal computers, Internet-enabled mobile devices, Internet users, social network users, blogs etc. in the last decade (Pitner, 2009). It is hard to define or demarcate Web 2.0 precisely, because – as stated by O’Reilly – it does not have a hard boundary, but rather a gravitational core. Therefore, it is usually described using a set of characteristic features instead of being strictly defined. However, the recurrent demand for a definition eventually caused O’Reilly to formulate one (see below). In practice, the definition has never reached the level of adoption of O’Reilly’s original article, and the principles formulated therein are still used to explicate the meaning of the term in most cases.

“Web 2.0 is the business revolution in the computer industry caused by the move to the Internet as platform, and an attempt to understand the rules for success on that new platform. Chief among those rules is this: Build applications that harness network effects to get better the more people use them.” (O’Reilly, 2006)

The term “Web 2.0” itself still arouses controversy. Even some respected authorities dispute the novelty of its principles and criticise its vagueness, which allows unjustified exploitation of the term for marketing purposes:

“I think Web 2.0 is, of course, a piece of jargon, nobody even knows what it means. If Web 2.0 for you is blogs and wikis, then that is people to people. But that was what the Web was supposed to be all along.” (Berners-Lee in Laningham, 2006)

Strictly speaking, Sir Berners-Lee is right. From the technological point of view, Web 2.0 is indeed nothing more than making the best of the original techniques and technologies that proved to be the most successful in the web environment. However, there is an obvious shift from presenting static content to providing dynamic services for users and communities. The web was transformed into a read/write medium, where people not only consume, but also provide information, communicate and cooperate. This is a fundamental change from the previous state, regardless of the original intention of the inventors of what is now called the first-generation web. Whether this change justifies the “2.0” label or not is the subject of endless disputes, which can hardly result in anything beneficial and which are intentionally left out of this thesis.

One can also come across stern warnings against the possible unwanted consequences of imprudent usage of Web 2.0 services (First Monday, 2008). Despite the negative impression they may give, such notices should definitely be considered a positive effort. Any invention is beneficial only when used properly, and Web 2.0 is no exception. It has some inherent risks associated with it, primarily related to user privacy, and users should be aware of them so as not to be disappointed.

2.4.1 Principles

This section deals with the distinguishing characteristics of Web 2.0 and its applications. As already mentioned, Web 2.0 is not just a matter of technology. There is also a significant shift in the treatment of users and in the associated business models. But in the same way as in the case of SOA, this thesis will focus primarily on the technological aspects. Other aspects of Web 2.0 will be mentioned just briefly, and references to external sources will be provided where appropriate.

The original list of characteristic features of Web 2.0 applications, formulated by O’Reilly (see Section 2.4), combines both the technological and other aspects. The technology-related aspects will be elaborated in this section. The non-technological principles were reformulated and elaborated by Anderson (2007), who compiled the following list of “the big ideas behind Web 2.0”:
• individual production and user-generated content;
• harnessing the power of the crowd;
• data on an epic scale;
• architecture of participation;
• network effects, power laws and the long tail;
• openness.

As Anderson admits, these ideas are not the preserve of Web 2.0. They are, in fact, direct or indirect reflections of the power of the network – the strange effects and topologies at both micro and macro level, produced by billions of Internet users.

2.4.1.1 The web as platform

Perceiving the web as a target environment for which applications are developed is perhaps the most significant principle, enabling all the others. Native web applications are not sold as packaged goods, but delivered to end users as services. The “software as a service” (SaaS) delivery approach not only changes the way software is developed and distributed, but also has a huge impact on the applicable business models. Instead of buying and owning the software itself, customers pay for the usage of the service, either directly or indirectly, e.g. by consuming advertisements.

Web 2.0 services free their users from software installations. Consuming Web 2.0 services requires just a web browser and an Internet connection, both being ubiquitous these days. The hardware and infrastructure for running the services are under the control of the service providers, which facilitates scalability and simplifies development. The web by its nature provides a massively scalable infrastructure. A Web 2.0 service can be run on a single desktop machine as well as delivered by thousands of dedicated servers located all over the world, without end users being aware of it. Porting the software to various server-side platforms and environments is also no longer needed.

Six major types of Web 2.0 services were identified in (Drášil et al., 2008), divided by their general interest and focus:
• content creation (wikis, blogs, online databases, office suites,…);
• content sharing (videos, photographs, presentations,…);
• content storage (generic data storage services);
• content contextualization, presentation and management (maps, web-based desktops, feed processors,…);
• coordination and communication (e-mails, chats, calendars,…);
• community (social networks).

In addition, the web together with all the available Web 2.0 services can be seen as a programming platform upon which developers can create new software applications. In the classical web, services were accessible only through browsing their web interfaces. In Web 2.0, services should be integration-friendly. Such an integrable service may gain additional popularity either by allowing 3rd party applications to interact with it or by allowing 3rd party plugins to be integrated into it (e.g. Facebook Apps).

Using applications which are out of our control carries some inherent risks, primarily related to data privacy and security. In the Web 2.0 environment, the importance of software licensing decreases in favour of data licensing. Tim Bray, a co-inventor of XML and former director of web technologies at Sun Microsystems, formulated the following commitment for “open” online services with regard to the maintained user data:

“Any data that you give us, we’ll let you take away again, without withholding anything, or encoding it in a proprietary format, or claiming any intellectual-property rights whatsoever.” (Bray, 2006)

2.4.1.2 End of the software release cycle

Unlike packaged applications, online services can be developed continuously, as there is no need to group updates. It is therefore not necessary to wait for a polished, feature-complete product. New products and/or features can be published as soon as they provide some meaningful functionality; bugs and compatibility issues can be solved incrementally, and the functionality can be extended according to user demand. Continuously adding new data and functionality keeps users engaged and prevents losing one’s position in the market. This “perpetual beta” principle blurs the traditional distinction between stable and development versions of a software product. For example, the GMail service was officially labelled as “beta” for five years before the label was removed, supposedly more for marketing reasons than any other.

Deployment mechanisms have to reflect the “release early and release often” principle too. Deployments of new features should be performed at runtime, without interrupting the service provision, and an easy rollback mechanism should be available for cases when problems occur. Automated testing and a rigorous build and deploy process are necessary. According to Musser (2006), eBay deploys a new version of its service approximately every two weeks, and the Flickr photo-sharing service reportedly deployed hundreds of incremental releases during the 18-month period from 02/2004 to 08/2005.

Figure 2.19: Perpetual beta product cycle (Musser, 2006)

In addition to automated testing, service users themselves are commonly used as testers. Data obtained by monitoring real-world user behaviour are the best source of measurable information for determining the usefulness, popularity and weaknesses of individual functions or of the whole service. Various approaches can be taken to involve service users in testing. For example, Google services have the “lab” section, where “beta” functionality is available to interested users. Another approach is taken by Amazon, where a small percentage of service visitors is presented with alternative features and experiences. According to (Musser, 2006), Amazon runs multiple such tests on its live site every day. For meaningful monitoring, capturing and evaluation of user behaviour, the user-facing application has to be supplemented with an instrumentation framework of applications running behind the scenes and providing such functionality.

2.4.1.3 Lightweight programming models

Web 2.0 takes a completely different approach to achieving reusability than SOA. SOA promotes standardization, whereas Web 2.0 favours simplicity. Web 2.0 generally promotes pragmatism and simplicity over ideal, universal design. Simple interfaces and data formats facilitate reusability because of low technical barriers and inherent loose coupling. In addition, they accelerate the development of both the services and their client applications.

Offering pure data through simple interfaces leaves maximum space for unanticipated ways of using it and for assembling existing services in novel ways. It lets developers easily and quickly create new web applications that draw on data, information, or services available on the Internet.

2.4.1.4 Software above the level of a single device

The “web as platform” principle eliminates the need to port applications to various server-side environments. However, the diversity of client-side environments is greater than ever before. Clients use various versions of various web browsers, equipped with various versions of various plugins. What is more, clients are no longer limited to the PC platform. Other devices such as PDAs, mobile phones, multimedia players, televisions etc. are nowadays able to access the Internet to both consume and provide data from and to Web 2.0 services. This versatility represents an enormous potential for service providers. Applications limited to a single device of any kind have limited utilization.

2.4.1.5 Rich user experiences

The term Rich Internet Application (RIA) was coined by Macromedia in 2002 to promote the Macromedia Flash technology. It stresses the fact that web-based applications, delivered through web browsers, can offer the same level of performance and user comfort as installed desktop applications. Such applications profit from a low entry barrier for end-users, which in turn boosts the users’ productivity and creativity. User-provided data is the driving force of Web 2.0, and RIAs have made putting even rich multimedia content on the Web a trivial task.

2.4.2 Technologies

Similar to SOA, Web 2.0 is not bound to any particular technology. Still, a set of commonly used technologies can be identified that have proved to be well suited for building Web 2.0 applications.

2.4.2.1 Data formats and their handling

As stated in Section 2.4.1.3, simplicity is the most important criterion for Web 2.0 data formats. The simpler the data representation, the more consumable it is. Web 2.0 applications also often provide at least part of their functionality by means of client-side processing, conducted by the web browser. Therefore, data access and processing have to be simple enough to be done in JavaScript.

For general object interchange between a Web 2.0 service and its client, be it uni- or bi-directional, the JavaScript Object Notation (JSON, RFC 4627) and XML are the dominant serialization formats. Both are standardized, language-independent and easily handled in JavaScript. JSON has the advantage of lower transport and processing overhead.
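To make the low overhead concrete, the following sketch parses and re-serializes a small JSON document using only the Python standard library; the record’s field names and values are invented for illustration.

```python
import json

# A hypothetical Web 2.0 service response; all field names are invented.
payload = '{"user": "alice", "tags": ["travel", "photo"], "posts": 42}'

record = json.loads(payload)        # JSON text -> native data structure
record["tags"].append("web")        # manipulate it as ordinary objects
roundtrip = json.dumps(record)      # native data structure -> JSON text
```

An equivalent XML exchange would need an explicit mapping between elements and objects, which is one reason JSON’s processing overhead is lower.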

For supplying data to clients, standardized XML-based data syndication formats such as Really Simple Syndication (RSS, 2009) and the newer Atom (RFC 4287) are typically used. RSS in particular has been adopted to syndicate a wide variety of content: news articles and headlines, changelogs of source code repositories or wiki pages, project updates, and even audiovisual data such as radio programs. Besides these general-purpose data formats, there are also specialised notations such as the RDF-based Friend-of-a-Friend (FOAF) format for describing relations of individuals to other people, or the Atom-based Activity Streams format for syndicating activities taken in social networks.
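As a sketch of how little machinery feed consumption needs, the snippet below extracts the items of a minimal RSS 2.0 channel with the Python standard library; the feed content is invented.

```python
import xml.etree.ElementTree as ET

# A minimal RSS 2.0 document; titles and links are invented.
rss = """<rss version="2.0"><channel>
  <title>Example blog</title>
  <item><title>First post</title><link>http://example.org/1</link></item>
  <item><title>Second post</title><link>http://example.org/2</link></item>
</channel></rss>"""

channel = ET.fromstring(rss).find("channel")
# Each item yields a (title, link) pair.
items = [(i.findtext("title"), i.findtext("link"))
         for i in channel.findall("item")]
```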

Microformats are a specific means of publishing information, especially metadata. These “parasitic” formats are based on established standards such as HTML 4 or RSS, giving existing attributes and/or elements a specific interpretation. A document enhanced with semi-structured semantic information encoded using a microformat remains standard-compliant. Microformats cover many areas of general interest: personal and organisational contact information (hCard, XFN), calendars and events (hCalendar), ratings and reviews (VoteLinks, hReview), licenses (rel-license), tags, keywords and categories (rel-tag), lists and outlines (XOXO). Information encoded using microformats can be used by web browsers, search engines or any other applications.
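The snippet below sketches how a consuming application might pick hCard fields out of an HTML page. The class names fn and org come from the hCard microformat; the document content and the minimal parser are invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical hCard markup embedded in an ordinary HTML page.
doc = ('<div class="vcard"><span class="fn">Jan Novak</span> works at '
       '<span class="org">Masaryk University</span></div>')

class HCardParser(HTMLParser):
    """Collects the text of elements carrying hCard class names."""
    def __init__(self):
        super().__init__()
        self.fields, self._current = {}, None
    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if cls in ("fn", "org"):
            self._current = cls
    def handle_data(self, data):
        if self._current:
            self.fields[self._current] = data
            self._current = None

p = HCardParser()
p.feed(doc)
```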

2.4.2.2 Service foundations

Specific data transfer protocols are used for specific purposes – SMTP for sending e-mails, XMPP for instant messaging, BitTorrent for data distribution. But HTTP is currently the optimal general-purpose communication protocol for Web 2.0 services because of its simplicity and its ability to pass through firewalls on the Internet. In addition, securing HTTP-based communication using the SSL/TLS layer is easy for both clients and servers.

Traditionally, additional messaging layers such as SOAP or XML-RPC are used on top of the HTTP protocol. Web 2.0 instead adheres to realising the full potential of HTTP itself and promotes the Representational State Transfer (REST) communication style. REST is an architectural idea and set of principles introduced by Fielding (2000). It describes an approach to stateless client/server architecture, providing a simple communication interface using the HTTP protocol. Every resource is identified by a URI and intentions are communicated through the GET, POST, PUT and DELETE HTTP methods. Although REST is not a formal standard, it is recognized and widely used. Many software development tools and environments include some kind of support for both providing and consuming RESTful services:
• the Java EE 6 platform includes the JAX-RS framework for creating RESTful web services in Java (JSR 311);

• Microsoft’s Windows Communication Foundation (WCF) provides a programming model for both implementing and consuming RESTful services using the Microsoft .NET framework;
• the sqlREST service turns any relational database into a RESTfully navigable web service.
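The essence of the REST style – resources identified by URIs and manipulated through a uniform set of methods – can be sketched without any framework. The toy dispatcher below is a deliberately simplified illustration; the resource name, stored data and status strings are invented.

```python
# In-memory "resource" store; a real service would back this with a database.
books = {}

def handle(method, uri, body=None):
    # e.g. uri = "/books/42" -> resource "books", key "42"
    _, resource, key = uri.split("/")
    if method == "GET":
        return books.get(key, "404 Not Found")
    if method == "PUT":
        books[key] = body               # create or replace the resource
        return "200 OK"
    if method == "DELETE":
        books.pop(key, None)
        return "200 OK"
    return "405 Method Not Allowed"

handle("PUT", "/books/42", "RESTful Web Services")
```

The same uniform interface then serves every resource, which is part of what makes RESTful APIs easy to consume generically.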

Although Web 2.0 generally promotes openness, reliable methods for identifying service users are necessary both for ensuring privacy and for giving users a sense of identifiability, making them accountable for their actions. It is certainly possible to solve this issue in a proprietary way, but then users end up with dozens of service-specific identities and credentials. Lightweight digital identity initiatives supporting single sign-on represent an ideal solution to this issue. A typical example of such a technology is OpenID (2007), which has gained a significant level of acceptance among both service providers and service users during the last few years. In addition, the proliferation of composite applications built on top of existing Web 2.0 services created the need for an authorization mechanism allowing users to give an application a strictly determined level of access to their Web 2.0 service account without revealing their credentials to that application. The OAuth specification (RFC 5849) was developed to tackle this issue and, similarly to OpenID, it is on its way to becoming the standard, widely accepted solution.
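At the heart of OAuth 1.0 signing is a signature base string built from the request, over which an HMAC-SHA1 signature is computed. The sketch below follows the outline of RFC 5849 but omits several details (header parameters, duplicate keys, encoding corner cases); all keys, secrets and URLs are invented.

```python
import base64, hashlib, hmac
from urllib.parse import quote

def base_string(method, url, params):
    # Percent-encode keys and values, sort them, and join with '&'.
    pairs = sorted((quote(k, safe=""), quote(v, safe=""))
                   for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), quote(url, safe=""),
                     quote(normalized, safe="")])

def sign(base, consumer_secret, token_secret=""):
    # The signing key combines consumer and token secrets.
    key = quote(consumer_secret, safe="") + "&" + quote(token_secret, safe="")
    digest = hmac.new(key.encode(), base.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

base = base_string("GET", "http://example.org/photos",
                   {"oauth_consumer_key": "key", "oauth_nonce": "n1"})
signature = sign(base, "consumer-secret")
```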

The simplicity of RESTful APIs makes it possible to get along without formalized, machine-readable notations for describing them. Although such descriptions are possible both with WSDL 2.0 (W3C, 2007) and with the Web Application Description Language (WADL) specification (W3C, 2009), created specifically for that purpose, neither of them is commonly used. Similarly, community-based service discovery is used instead of sophisticated automated mechanisms. For a good example of a community-managed Web 2.0 service registry, see ProgrammableWeb3. At the time of writing (January 2011), it maps almost 2,700 APIs and twice as many mashups built on top of them. What is more, the directory does not just list the APIs; it also provides sophisticated analyses, matrices, scorecards and the like.

2.4.2.3 User interface

The technologies and techniques for developing user interfaces are missing from the overview of SOA technologies in Section 2.3.2 because, with the exception of enterprise portals, traditional SOA solutions typically do not care about user interfaces. For Web 2.0 applications, however, rich user interfaces are a crucial issue (see Section 2.4.1.5). On top of the existing web environment based on client/server communication and an IP-based network architecture with web browsers as the user agents, interactive applications resembling their desktop counterparts are built using client-side scripting techniques like Asynchronous JavaScript and XML (AJAX) or specialized runtime environment modules like Adobe Flash, Microsoft Silverlight or JavaFX. These environments require an installed browser plugin to work, but the Flash Player plugin in particular can be considered almost ubiquitous, with a penetration of 99 % (Adobe, 2010).

AJAX is a collection of several technologies combined to enrich the user interface and make it highly interactive and more responsive: namely XHTML or HTML, Cascading Style Sheets (CSS), the Document Object Model (DOM), XML, XSLT, XMLHttpRequest and JavaScript. XML may be replaced by JSON for data interchange. AJAX allows processing user requests by exchanging just small amounts of data with the server instead of reloading the whole web page, which results in a more responsive user interface. Frameworks such as the Google Web Toolkit (GWT) or OpenLaszlo make it easy to develop and debug both server- and client-side code of AJAX applications, which is not trivial in pure AJAX. A GWT application is developed in Java and compiled into an AJAX-based application, shielding developers from the technical details of AJAX and allowing them to benefit from advanced Java IDEs. Libraries of prearranged AJAX-enabled GUI components are available for these frameworks as well as for direct usage (Yahoo! UI Library, Dojo Toolkit or jQuery).

3 http://www.programmableweb.com

In addition, the upcoming HTML5 specification will significantly extend the standardized means of delivering content to end users. Many features currently available only through proprietary runtime extensions will become part of the standard. The new features include multimedia playback, support for offline web applications, a canvas for 2D drawing and support for cross-domain AJAX. Although the specification is still under development, more specifically in the W3C “working draft” stage, recent versions of popular web browsers already include some of the features and various services have already started experimenting with it; e.g. YouTube started to offer an HTML5-based video player in January 2010 in addition to the default Flash-based one.

2.4.2.4 Service integration

Web pages or applications integrating multiple Web 2.0 services are often called “mashups”. Although many successful mashups can be seen on the Internet, it is still hard to find a clear and concise definition of the term.

“Mashups are an exciting genre of interactive web applications that draw upon content retrieved from external data sources to create entirely new and innovative services.” (Merrill, 2006)

Generally, integration in Web 2.0 is not as varied as it is in SOAs (see Section 2.3.2.3). Typically, it consists just of combining the data provided by multiple services, because the majority of Web 2.0 services are data-centric. In addition, interoperability of Web 2.0 services is achieved in a different way than in traditional enterprise-oriented SOAs. It is based on simple protocols and simple data models, which, among other things, allows moving the integration from servers to clients (Drášil et al., 2008). A good example of integration via a simple, yet standardized API can be found in the area of search engines. Every modern web browser (Chrome, Firefox, Internet Explorer, Safari, …) can be extended with modules communicating with a specific search engine. This is enabled by the OpenSearch format, covering identification and description of the search engine, query and response formats, search suggestions etc. The ease and rapidity of creating mashups is one of the most important and valuable features of Web 2.0.
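An OpenSearch description document is small enough to parse in a few lines. The snippet below reads an invented description and builds a query URL from its template; the namespace URI is the official OpenSearch 1.1 one, while the engine name and URLs are invented.

```python
import xml.etree.ElementTree as ET

# A minimal OpenSearch description document (content invented); the Url
# template tells the browser how to build a query for this engine.
desc = """<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>Example Search</ShortName>
  <Url type="text/html" template="http://example.org/search?q={searchTerms}"/>
</OpenSearchDescription>"""

ns = {"os": "http://a9.com/-/spec/opensearch/1.1/"}
root = ET.fromstring(desc)
name = root.findtext("os:ShortName", namespaces=ns)
template = root.find("os:Url", ns).get("template")
query_url = template.replace("{searchTerms}", "mashup")
```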

Figure 2.20: Mash-tree, an allegory of mashup principles (Drášil et al., 2008)

For a mashup to be meaningful and successful, it has to provide some added value above the integrated services. This can be achieved by three main means (van der Vlist, 2007):
• Enhanced user interface. Drawing on data from mostly one source, this type of mashup provides a better interface: for example, a better way to navigate through information, a more responsive interface, or the presentation of more relevant information by displaying only the subset of information that is of particular interest to the user.
• Value-added information by aggregation. By bringing together information from various sources on the Web, this type of mashup adds value by aggregating the data, making the combined data more relevant.
• Value-added information augmented with an enhanced user interface. This type of mashup both aggregates data from different sources and presents the data with a better user interface.

The basic decision developers have to make when designing a mashup is whether the code accessing the particular services and combining their data or functionality will run on the server or on the client. This distinction, together with the overall intention of the mashup, allows us to differentiate between four technological approaches commonly used in current mashups (Ort et al., 2007):
• Client-side scripting mashups. JavaScript code running in the user’s browser is used both for retrieving the data from the involved Web 2.0 services and for its processing and/or aggregation (see Figure 2.21). The applicability of this approach is limited by JavaScript security restrictions – any code is allowed to communicate just with the domain it came from. A necessary prerequisite for this approach is therefore that service providers themselves provide JavaScript libraries for communicating with their services. Optionally, Adobe Flash or similar client-side runtime environments can be used instead of JavaScript. Client-side models generally limit developers in terms of allowed operations and available technologies, but result in lower latency and do not burden the service provider’s hardware.

Figure 2.21: Client-side mashing (Ort et al., 2007)

• Client-side widget mashups. Another popular way to mash is to use web widgets for accessing the content managed by the involved Web 2.0 services: for example an embeddable YouTube video player, a Flickr preview widget, a Google search panel, a Delicious social bookmarking widget or a combined toolbar. The idea is similar to that of enterprise portals, but the whole process is moved to the client browser. The information is just presented side by side instead of being really mashed up. Applications offering embeddable widgets instruct end-users on how to include the widget in a blog, a static web page, a wiki page etc.
• Server-side mashups. Integrating applications in a middleware tier placed between them and the client is considered the best practice in the enterprise environment. In Web 2.0, this approach can be used too (see Figure 2.22). Remixing the content on the server side concentrates non-functional requirements such as computational power and data storage there. The scalability necessary for satisfying these requirements continuously may be achieved either traditionally, by employing one’s own sufficiently powerful application server(s), or in “the Web 2.0 way”, by hosting the application on a cloud using some “Infrastructure as a Service” (IaaS) offering. The main advantage of this approach is that the remixed content can be cached on the server. Individual services are accessed through their APIs either directly or by employing some higher-level libraries.

Figure 2.22: Server-side mashing (Ort et al., 2007)

• Content syndications. Very simple mashups can be based just on RSS or Atom syndication feeds. Blogs, magazines, and other Web 2.0 applications provide syndication feeds organized into channels, e.g. one channel per blog, and items, e.g. one item per blog entry. Feeds provide basic metainformation about individual items as well as links to the complete original information in a standardized, XML-based format. Integrating information from several sources then means just fetching the feeds using a plain HTTP client, combining them and presenting the result to the user.
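A content-syndication mashup reduces to exactly these steps; once the feeds are fetched and parsed, “combining” can be as simple as the merge below. The item dicts stand in for parsed RSS/Atom entries and are invented for illustration.

```python
# Items as they might look after parsing two feeds; titles and dates invented.
feed_a = [{"title": "A1", "date": "2010-11-02"},
          {"title": "A2", "date": "2010-12-24"}]
feed_b = [{"title": "B1", "date": "2010-12-01"}]

# Merge the channels into one stream, newest item first.
merged = sorted(feed_a + feed_b, key=lambda item: item["date"], reverse=True)
titles = [item["title"] for item in merged]
```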

According to (O’Reilly, 2005), Web 2.0 services should offer an easily accessible API to facilitate their “hackability and remixability”. Services following this advice can gain a significant competitive advantage; some of the APIs are extraordinarily popular, e.g. eBay reports about 5 billion API calls each month. From a strictly technological point of view, services without a published API can be employed in mashups too, by using web-scraping techniques, i.e. by analysing the HTML code of their web interface and then simulating user actions normally performed in a web browser. This is, however, unreliable, error-prone and disputable from a legal point of view, since no formal contract between content provider and content consumer has been established.

Even current desktop applications cannot afford to ignore the Web 2.0 momentum. Many of them come with built-in support for various Web 2.0 services. Probably the best known representative is Flock, a web browser with integrated support for 22 Web 2.0 services (as of January 2011). Many video-processing or video-converting tools are able to publish the resulting video to Facebook and/or YouTube, and many photo-editing tools allow their users to upload photos to Facebook, Flickr or Picasa. But even when an application does not come with built-in support for a specific Web 2.0 service, or ignores Web 2.0 altogether, a solution can still exist. The functionality of many current applications can be extended by plugins providing arbitrary functionality. This way one can, for example, enrich OpenOffice Impress with the possibility to upload presentations to the SlideShare service, or teach Firefox how to store bookmarks in del.icio.us or Diigo.

2.4.2.5 Service composition

Hierarchical service composition with complex built-in logic in the SOA sense is rarely achieved in Web 2.0 mashups. Mashups typically combine presentation and content, which makes mashups themselves hardly reusable in other applications. One exception to this assertion is syndication mashups, where the result of content syndication is often republished as a new syndication feed to be consumed by users’ feed readers.

2.4.3 Development style

The development of Web 2.0 services themselves is to a large extent determined by the “end of the release cycle” principle described in Section 2.4.1.2. Attracting users and keeping them engaged is a primary concern, which results in what is often called user-driven development. Users are directly involved in the development process, both as the basic source of requirements and as testers, sometimes even without being aware of it. To maximize success, user feedback must be incorporated early, rapidly and continuously. Agile and iterative development methodologies, together with lightweight technologies, suggest themselves here, as they suit very well the short development cycles of small incremental releases quickly responding to user feedback.

A similar approach can and should be used in mashup development too. Some types of mashups, especially content syndications, are simple enough to be built by technically less savvy people, not only by experts. The popularity of mashups led even major software companies to offer tools allowing users with just a little technological experience to create and publish their own “fetch → transform → present” mashups without any programming. In reality, the demand for such services was obviously not very high, because many of these tools are no longer available these days (Google Mashup Editor, Microsoft Popfly, Openkapow, Sun Zembly). But some of them still exist, both commercial and enterprise-oriented (see Section 2.6) and available for public use (Dapper, MapBuilder, Yahoo Pipes, Wayfaring).

As the trend of integrating support for various Web 2.0 services into traditional applications became apparent, some software development tools started to facilitate it. For example the NetBeans IDE v6.9 has a built-in “Support for popular SaaS services”, enabling simple drag&drop integration of 25 web APIs into the developed application.

2.4.4 Deployment

Web 2.0 comes with a revolutionary change in the models of application deployment. Service providers can now host their applications in a virtual “cloud” without taking care of the housing and maintenance of a physically owned server, without being limited by the capabilities of given hardware and without extensive initial investments. The concept of cloud computing, enabled by state-of-the-art technology and driven by business requirements, is a key Web 2.0 characteristic. Replacing the traditional computing infrastructure with cloud computing imposes strong requirements concerning quality of service. Service level agreements should specify the details of the service quality precisely – functional specification, availability, reliability, security and risk elimination in case of disaster.

Two distinct kinds of cloud infrastructure services can be found:
• Infrastructure as a Service (IaaS), where one can get a virtualized computing environment with pre-defined or agreed performance parameters and software equipment. Amazon Elastic Compute Cloud is a typical example of an IaaS service, instantly offering virtual machines with eleven different performance characteristics, equipped with a freely selectable combination of various operating systems, database engines, application servers, web hosting servers, media servers and the like.
• Platform as a Service (PaaS), where one can get a whole computing platform for running custom applications developed for the platform. PaaS services are often based on a cloud infrastructure. Google App Engine is a typical example of such a service, able to run specific Java- or Python-based applications. It is available both as a paid service for business users and as a free, somewhat limited service for the public.

Cloud computing infrastructure is highly scalable in its nature. Computational power, storage and network bandwidth can grow as the business grows, and clients pay just for what has really been consumed. Classical hosting is scalable too, but the limits are obviously lower. The highly performing and scalable infrastructures necessary for cloud computing are increasingly built using large numbers of low-cost commodity hardware instead of employing specialised high-performance solutions. The U.S. Department of Defense is going to build a cluster of 1,700 Sony PlayStations, providing computational power comparable to that of the 2007 world’s fastest supercomputer, IBM Blue Gene/L. On the software level, IaaS has been enabled by advances in system virtualization. Virtualization environments by Citrix, KVM, Microsoft, Oracle, Parallels or VMware provide strong platforms for hosting cloud services.

Although cloud computing providers often use free and open-source software for their infrastructure, the concept of cloud computing is sometimes criticized by open-source software advocates. Richard Stallman, founder of the Free Software Foundation, claims that cloud computing brings just a different kind of vendor lock-in danger by forcing users to rely on proprietary solutions with potentially growing prices in the future (Johnson, 2008). Indeed, the transition costs involved when changing the cloud provider can be very high, considering the different modi operandi and proprietary technologies used by the providers.

2.4.5 Evaluation

The ideas represented by Web 2.0 are without doubt among the most important driving forces of current web development. Successful Web 2.0 applications have already shown that they can bring a renewed vigour to long-established fields of computer science such as knowledge management (wikis, tagging, folksonomies) or software development (service orientation, frequent releases, perpetual beta). But certainly, Web 2.0 has its drawbacks too. Both the strengths and the weaknesses are summarized here.

Web 2.0 strengths:
• Agility. Web 2.0 is a good example of achieving continuous software improvements.
• Focus on the user. User-centred applications are developed using user-driven methodologies.
• Low entry barrier for developers, thanks to simple technologies.
• Low entry barrier for end users, thanks to rich user interfaces.
• Scalability of cloud-based applications and infrastructure.
• Facilitation of creative development of new applications and unanticipated combinations thereof.
• Enablement of new business models, often requiring no initial investments (paying for usage, advertisement-based earnings).
• The possibility to successfully run highly specialised businesses, interesting for relatively low numbers of scattered customers.
• Many services are available instantly, often with the possibility of a free subscription.

Web 2.0 weaknesses:
• Low control for service users, especially with regard to data privacy.
• Problematic quality of service. Many Web 2.0 service providers provide no guarantees, and the same applies to many Internet connection providers.
• Diversity of terms and conditions. Each service has its specifics, sometimes quite surprising.
• Users still have to maintain multiple accounts and credentials and go through multiple authentications, although a solution exists (OpenID).
• Problematic data and application migration, possibly resulting in vendor lock-in.
• Uncertainty of continued operation, support, reliability, security and scalability of a particular service. Many Web 2.0 services and APIs have been shut down already, even those provided by big companies – AOL Xdrive, Google Notebook, Google Wave, Ma.gnolia, MediaMax/The Linkup, Technorati API, YouOS, …

2.5 Relation of Web 2.0 and SOA

Considering the totally different origins and evolution of SOA and Web 2.0, it is surprising how close they have become, at least from a software architect’s perspective. Both deal with integrating and orchestrating simple, easily accessible services in order to create aggregate services able to support complex requirements of their users. However, the philosophies and technological bases of the two are quite different, which leads to different realizations. The technological differences between them, most notably SOAP vs. REST, can be seen in the wider context of a general ongoing debate, not limited to technology circles, over simplicity vs. sophistication.

“Simplicity is the ultimate sophistication” (Leonardo da Vinci)

Hinchcliffe (2005) thinks of Web 2.0 as a global instance of SOA; Linthicum (Geelan, 2007) as a universal SOA, ready to connect to an enterprise SOA, perhaps providing more value. Or, the other way round, SOA could be seen as an internal cloud that provides online services inside the enterprise. Whether it is so or not is hard to judge, because it is more a matter of personal feelings and experiences than of irrefutable facts – primarily because of the inherently vague character and blurred boundaries of both SOA and Web 2.0. A more conservative point of view is depicted in Figure 2.23, showing the concepts of SOA and Web 2.0 as overlapping, but with clear distinctions.

Figure 2.23: Convergence of Web 2.0 and SOA (Hinchcliffe, 2009a)


This gave rise to a new term, coined by Nicholas Gall from Gartner in 2005, denoting the intersection of the two – a Web-oriented architecture, WOA (Lawson, 2008). However, without any official definition or clear consensus about the meaning of the term, there were lots of discussions about it. Three years later, Gartner published an official note about WOA, including the formal definition of the term:

“WOA is an architectural substyle of SOA that integrates systems and users via a web of globally linked hypermedia based on the architecture of the Web. This architecture emphasizes generality of interfaces (UIs and APIs) to achieve global network effects through five fundamental generic interface constraints:
»» identification of resources;
»» manipulation of resources through representations;
»» self-descriptive messages;
»» hypermedia as the engine of application state;
»» application neutrality.” (Gall et al., 2008)

The constraints are clearly based on the “uniform interface constraints” of the REST architectural style formulated by Fielding (2000). The first four constraints are adopted directly, and application neutrality was added to make this aspect explicit. Application neutrality stresses the fact that instead of focusing on implementation neutrality, as the Web Services technological stack does, interfaces should above all be designed to be generic and application-agnostic, because this is precisely what makes them reusable.

Figure 2.24: WOA technological stack (Hinchcliffe, 2009b)

As stated by Schroth (2007), because of their frequent implementation in the corporate context, SOAs are subject to requirements that do not exist in the case of most Web 2.0 applications. This holds as an overall observation, valid at a particular point in time. But as will be shown in the next section, applications of Web 2.0 ideas, principles and even services in the enterprise are increasingly common. Certainly, Web 2.0 or WOA cannot replace SOA completely and in all cases; certain applications, particularly in the high end of the enterprise, may rely on the more sophisticated portions of the SOA stack. But in many cases, WOA could either complement a full-fledged SOA or even be used instead of it, constituting a more interoperable, easier to implement and more scalable environment.

Recent thoughts about combining the best of both Web 2.0 and SOA introduced the concept of the Internet of Services (IoS) as envisioned by Schroth (2007) or Cardoso et al. (2009). The idea of the IoS is that the Internet becomes a medium for offering and selling services, where service consumers and providers are brought together in service marketplaces.

2.6 Web 2.0 in the enterprise

Although many people think of Web 2.0 as a mainly open community matter with specialized web applications built to profit from it, its opportunities, concepts and technologies can also be utilized inside companies to support internal processes and to foster their business. Inspired by the Web 2.0 label, McAfee (2006) coined the term Enterprise 2.0 to name these trends explicitly.

“Enterprise 2.0 is the concept of using tools and services that employ Web 2.0 techniques such as tagging, ratings, networking, RSS, and sharing in the context of the enterprise.” (Lennon, 2009)

Subsequently, Robinson (2008) identified a set of generic technology usage patterns suitable for exploiting Web 2.0 principles in an enterprise to provide business value:
• Syndication, i.e. provision of syndicated access to applications, information and services.
• Enterprise mashups, i.e. rapid creation, sharing and evaluation of applications to access and manipulate content and services, augmenting the value of SOA services through composition.
• Marketing as a conversation, i.e. engagement with customers through social networking, changing marketing from broad-brush communication to lots of individual, focused conversations.
• Community exploitation, i.e. exploiting the low communication costs and social networking capabilities for reaching the “long tail”.
• Rich interfaces, i.e. reduction of barriers for interacting with computers by making their user interfaces more familiar and intuitive.

Of these patterns, enterprise mashups, sometimes also called business mashups, are of particular interest for the purpose of this thesis. Enterprise mashups, like ordinary mashups, combine existing resources from various sources, but additional needs like security, availability or quality make them more demanding than ordinary Web 2.0 mashups. On the other hand, enterprise mashups can benefit from registries collecting, classifying, describing, rating and monitoring the available resources, which is feasible in the limited scope of an enterprise but hardly achievable on a global scale. Commercially available enterprise mashup platforms include IBM Mashup Center, JackBe Presto and MindTouch Core.

Figure 2.25: Enterprise mashup (Robinson, 2008)

“An Enterprise Mashup is a web-based resource that combines existing resources, be it content, data or application functionality, from more than one resource in enterprise environments by empowering the actual end-users to create and adapt individual information centric and situational applications.” (Hoyer, 2008a)

Janner et al. (2009) defined a layered model of enterprise mashup development (see Figure 2.26), based on the model previously introduced in (Hoyer, 2008a). The layers also correspond to the possible means of integrating an enterprise mashup platform with systems from outside the enterprise. The lower the layer on which integration is realized, the higher the implementation complexity and the lower the reach, but also the higher the potential richness of the solution. The three basic layers are:
• Resource layer, where data and/or functionality are made available through APIs.
• Widget layer, where resources are fetched, piped and presented. Piping can include operations such as aggregation, merging or filtering. For presenting the resources, a widget (or a gadget, as it is called in Figure 2.26) can encapsulate the whole logical flow of screen transformations.
• Mashup layer, where even end users without any programming skills can wire the widgets to create an application.

Figure 2.26: Enterprise mashup development layers (Janner et al., 2009)

Employing Web 2.0 techniques for communication with customers is quite common these days. Surprisingly, most companies are not very far along in the adoption of the same techniques for their internal purposes (Nielsen, 2009). But this ambivalence is hardly sustainable and there are many companies already experimenting with it, gaining measurable business benefits (Bughin, 2010).

2.7 Summary

Both SOA and Web 2.0 are high-level concepts, which can be realized in many possible ways. On the conceptual level, they have many things in common, but there are also clear distinctions between the two. Almost all of them were already mentioned in the respective sections.

API 3. Enterprise Mashup Integration Patterns for B2B Scenarios Services/ ERP CRM Web API Web Feed Resources In this chapter, we introduce five different patterns Figure 2. Mashup/ Gadget Development Layers and Figure 2.26: Enterprise mashup development layers (Janner et al., 2009)describing options to realize B2B scenarios using Terminology. Mashup platforms and Mashup/ Gadget development Employing Web 2.0 techniques for communication with customers is quite commontools. These these patterns days. are described with the help of a Gadgets or widgets primarily put a face on the sample scenario and a definition of the user profiles Surprisingly, most companiesunderlying are not very resources far along by in providingadoption of athe same graphical techniques for their internal involved in Enterprise Mashup development. We use the purposes (Nielsen, 2009).representation But this ambivalence for them andis hardly piping sustainable the data receivedand there are many companies terminology as introduced in section 2.2. already experimenting withfrom it, thegaining resources. measurablePiping business can include benefits operators (Bughin, like 2010). aggregation, merging or filtering. According to [16], the 2.7 Summary Mashup Stack can be extended for Enterprise Mashups 3.1 User Profiles Involved in Enterprise Mashup by complex gadgets which consist of several screens. Development Screens are fully functional by themselves, and their pre- Both SOA and Web 2.0 are high-level concepts, which can be realized in many possible ways. On and post-conditions drive the transitions among them to the conceptual level, theytie have them manytogether, things forming in common, a Screenflow but .there are also clear distinctions between the two. Almost all of them were already mentioned in the respective sections. 
The similarities stem BusinessBusiness Mashups: “By assembling and composing a collection End User primarily from the fact ofthat widgets both SOA [called and Web wiring 2.0 in promote figure 2]the storedprinciples in aof service-orientation as catalogue or repository, end users are able to define the behavior of the actual application according to their MashupMashup Scenario Scenario individual needs. By aggregation- 66 - and linking content of End User different resources in a visual and intuitive way, the end users are empowered to create their own workspace which fits best to solve their heterogeneous business problems. No skills in programming concepts are GadgetsGadgets Key User/ required.” [12]. UI UI UI Consultant Hoyer et al [16, 3] also point out three further design principles of Enterprise Mashups. Emerging Resources/Resources/ Services Services intermediates name the functionality that provides a Developer

registry for the growing amount of resources which are Enterprise Mashup User Mashup/ Gadget Development Tool User available for use in Mashups. They offer features to Figure 3. Users Profiles in Scope of Mashups and collect, classify, describe, rate and monitor the resources Mashup / Gadget Development Tools. to make them available to end users. Mass collaboration describes the participation of many end users in the

978 described in Section 2.3.1. The main differences between the two are summarized once again in a simplified, but more comprehensible form in Table 2.1.

| SOA | Web 2.0

Business aspects
Typical business model | paying for ownership | paying for usage
Initial investments | high/very high | low/none
Overall cost of development and maintenance | high/very high | low/moderate

Development aspects
Governance | essential | low
Development driven by | business | end users
Interoperability based on | sophistication | simplicity
Composition techniques | sophisticated (processes, programs) | lightweight (piping, wiring)
Skills necessary for development | high | moderate to low
IDE support | high | growing
Speed of development | low | very high
User interfaces | out of interest | very important
Product is released when it is | perfect | good enough
System usage and evolution | planned | spontaneous, unintended

Runtime aspects
Primary target environment | enterprise | Internet
Primary target device | server/PC | server/PC/cell phone/PDA/TV/…
Runtime infrastructure | owned/hosted | cloud-based
Integration/composition environment | server-side middleware | either server-side (web server) or client-side (web browser)

General aspects
Control | very high | low
Security | very high | low/uncertain
Privacy | very high | uncertain
Availability | very high | high/uncertain
Transactionality | fully supported | not supported
Reusability | from very high (utility services) to low (business services) | from very high (data feeds) to low (presentation mashups)
Scalability | moderate | very high
Agility | high | very high
Discoverability | moderate, either human-based or automatic (using metadata and registries) | low, human-based (facilitated by community)

Table 2.1: Overall comparison of SOA and Web 2.0

3. Reality of Web 2.0 services

This chapter is devoted to an in-depth analysis of various non-functional aspects of real Web 2.0 services. A number of specific Web 2.0 services will be selected and examined from several viewpoints – both technological and non-technological. Where interesting or important, other Web 2.0 services will be mentioned too. The purpose of this analysis is twofold. Firstly, it should either confirm or refute the assertions made in Section 2.4 regarding the Web 2.0 principles and technologies. Secondly, findings in the technological area will become the major drivers for designing the integration framework in the following chapter. Non-technological aspects will not influence the design of the framework directly, but they still have to be kept in mind when choosing the right services for an application built upon the proposed framework.

We have already published a similar study in (Drášil et al., 2008), but this chapter provides much more detail and updated information.

3.1 Selection of representative services

The first step of the analysis has to be the selection of a representative set of Web 2.0 services for further examination. This step is very important because it has a major impact on the overall results of the analysis. The aim is to get a reasonable number (at least 20) of various popular Web 2.0 services which could possibly play the role of functionality providers in the proposed platform.

There are several community-maintained online directories of Web 2.0 applications available, each of them listing thousands of applications (see Table 3.1). The obvious discrepancy in the numbers of listed applications is caused primarily by two factors – by differences in the requirements imposed on listed applications and by the activity of the community using and maintaining the directory. For example, listio.com is a very general directory of any Web 2.0 tools, applications and services. On the other hand, programmableweb.com lists only web applications with published APIs, which is a very limiting requirement. The size of the community can be – at least relatively – quantified by the traffic reported by analytic services such as Alexa4 or Compete5. Both these services clearly show go2web20.net to be the most popular general Web 2.0 directory these days. However, the limitations imposed on the listed applications by programmableweb.com perfectly fit our needs and its popularity is comparable to that of go2web20.net, leaving all the other directories in question far behind.

Directory | Applications listed | Updates last month | Compete: unique visitors last month | Alexa: traffic rank (less is better)
programmableweb.com | over 2,000 | 55 | 130,000 | 12,510
allthingsweb2.com | over 2,800 | 0 | 2,000 | 184,100
go2web20.net | over 3,200 | 37 | 184,000 | 8,873
ziipa.com | over 8,500 | 294 | 14,000 | 84,180
simplespark.com | over 8,500 | 0 | 10,000 | 305,579
listio.com | over 12,600 | 82 | 33,000 | 45,991

Table 3.1: Online Web 2.0 directories (data collected 06/2010)

4 http://www.alexa.com/siteinfo 5 http://www.compete.com

As a result of the above-mentioned survey, programmableweb.com was taken as the basic catalogue of available Web 2.0 services and some of its advanced features were used for selecting the representatives. Every service registered therein belongs to one of the 53 pre-defined categories and has an associated list of mashups using its API. The categories can help us achieve a high diversity of the selected services if we avoid selecting multiple services from the same category. And the number of mashups built upon a service’s API is a good indicator of the popularity and possibly also the quality of each service. By taking the service with the highest number of mashups from each category, we get a list of 53 candidate services. In the last step, the candidate services were filtered according to the number of mashups once again. Only services with 30 or more mashups built upon their API were taken, which resulted in the set of 21 Web 2.0 services (see Table 3.2).
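The two-step selection procedure described above can be sketched as a simple filter. The catalogue data below is a small hypothetical sample standing in for the programmableweb.com listing, not the actual data set.

```python
# Sketch of the service-selection procedure: take the service with the
# most mashups per category, then drop candidates below the threshold.
# The catalogue entries here are a hypothetical sample, not the real
# programmableweb.com data.
catalogue = [
    # (service, category, number of mashups built using its API)
    ("Google Maps", "Mapping", 1975),
    ("MapQuest", "Mapping", 40),
    ("Flickr", "Photos", 523),
    ("Picasa", "Photos", 25),
    ("Eventful", "Events", 36),
    ("Some Niche Service", "Niche", 4),
]

def select_representatives(catalogue, min_mashups=30):
    """Keep the most-used service per category, then keep only those
    with at least `min_mashups` mashups built upon their API."""
    best_per_category = {}
    for service, category, mashups in catalogue:
        current = best_per_category.get(category)
        if current is None or mashups > current[1]:
            best_per_category[category] = (service, mashups)
    return sorted(
        service
        for service, mashups in best_per_category.values()
        if mashups >= min_mashups
    )

print(select_representatives(catalogue))
# The niche category's candidate (4 mashups) is filtered out.
```

Applied to the real catalogue, the per-category step yields 53 candidates and the threshold step leaves the 21 services of Table 3.2.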

Category | Number of services in the category | Service with the highest number of mashups | Number of mashups built using the service
Mapping | 117 | Google Maps | 1975
Photos | 53 | Flickr | 523
Video | 72 | YouTube | 472
Social | 134 | Twitter | 402
Shopping | 79 | Amazon Product Advertising API | 350
Music | 74 | Last.fm | 154
Bookmarks | 18 | del.icio.us | 150
Search | 66 | Yahoo Search | 137
Widgets | 18 | Google Gadgets | 95
Storage | 22 | Amazon S3 | 62
News | 27 | Digg | 61
Internet | 152 | Amazon EC2 | 56
Advertising | 21 | Google AdSense | 55
Tools | 62 | Google App Engine | 54
Other | 125 | Google Chart | 52
Reference | 93 | Wikipedia | 41
Events | 17 | Eventful | 36
Feeds | 14 | Google Ajax Feeds | 35
Telephony | 67 | Twilio | 34
Calendar | 6 | Google Calendar | 34
Blogging | 24 | FeedBurner | 31

Table 3.2: Representative Web 2.0 services

The representativeness of this set of 21 Web 2.0 services is supported by the following facts, derived from the data acquired from the programmableweb.com directory:
• Each of the selected services belongs to a different category than all the others.
• If we take the number of mashups built using a service as a measure of its popularity, each of the selected services is the most popular one in its category.
• If we take the number of mashups built using a service as a measure of its popularity, all the selected services belong to the top 3 % most popular Web 2.0 services regardless of the category.
• If we take the number of services in a category as a measure of its significance, 12 out of the 15 most significant categories are covered by the selection.
• 63 % of all the Web 2.0 services belong to one of the 21 covered categories.

The presented data not only support the representativeness of the selected set of services. They also nicely illustrate the “long tail” principle – a hyperbolic distribution common to many Web 2.0 quantitative characteristics, with a few extremely popular/frequent/known/… elements accompanied by a large number of elements of incomparably less significance. As an indirect result of this phenomenon, the 21 selected services are owned and run by just nine companies: nine services are owned by Google, three by Yahoo, three by Amazon and just six by other companies. Web 2.0 is, among other things, a lucrative and highly competitive environment where strong players often squeeze the others out of the market or swallow them up.

3.2 Technological aspects

Many current web services loudly proclaim support for the Web 2.0 principles. But a significant portion of them uses this fashionable label for marketing purposes only and is quite remote from really implementing its ideas. For example, one of the basic Web 2.0 principles – “the web as platform” – is hard to implement without services offering their functionality for programmatic access. But our former analysis (Drášil et al., 2008), correlating with another result (Novak, 2007), showed an alarming lack of realization of the Web 2.0 principles, since one third of the services reviewed had been created for direct usage by humans only. That is why it would be injudicious to rely just on theoretical assumptions about what technologies Web 2.0 services should use, as stated in Section 2.4.2. To design the platform so that it is capable of integrating actual Web 2.0 services, it is crucial to review the real technological aspects of actual Web 2.0 services.

When considered interesting and beneficial, the results of the analysis will be supplemented or compared with specific notable examples of individual Web 2.0 services or with the results obtained in a similar study conducted two years ago (Drášil et al., 2008). It is important and interesting to note that although both studies tried to review the most popular Web 2.0 services of their time, the two service sets differ significantly. There are just three services common to both surveys (Amazon S3, Flickr and Google Calendar). This fact clearly illustrates the dynamism of the Web 2.0 environment. Services can easily fade into oblivion when they are not able to keep pace with their competitors.

3.2.1 Communication protocols

Regarding communication protocols, the services reviewed are extremely unified. All of them, without exception, are based on the HTTP protocol (RFC 2616). As the older study revealed, the only exceptions to this rule are communication-related services, where specific protocols such as POP3 (RFC 1939), SMTP (RFC 2821), IMAP (RFC 3501) or XMPP (RFC 3920) usually come into play.
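Since HTTP is the common denominator of all the reviewed services, it is worth recalling how little a plain HTTP/1.1 request actually consists of. The following sketch assembles such a request as raw text; the host and path are purely illustrative, not any real service's endpoint.

```python
def build_http_get(host, path):
    """Assemble a minimal HTTP/1.1 GET request (RFC 2616) as raw text.
    Whatever messaging model a Web 2.0 service layers on top, the
    requests it ultimately receives have this shape."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"      # the only header HTTP/1.1 requires
        "Accept: */*\r\n"
        "Connection: close\r\n"
        "\r\n"                    # empty line terminates the headers
    )

# Hypothetical endpoint, for illustration only:
request = build_http_get("api.example.com", "/photos/recent?format=json")
print(request)
```

This textual simplicity is one reason why virtually every programming language can act as a Web 2.0 service client.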

3.2.2 Messaging models

Although some services utilize the traditional RPC messaging model in the form of either SOAP (see Section 2.3.2.2) or XML-RPC (see Section 2.2.3.1), the most popular messaging model today is REST – at least if we trust the proclamations in service API descriptions. The term is, however, often used in a much looser sense than it was originally defined by Fielding (2000). In most cases, such services offer just a simple interface transmitting domain-specific data over HTTP without an additional messaging layer and call it RESTful. In fact, just three of the services reviewed (Twitter, Amazon S3 and Twilio) really use various HTTP methods to express the semantics of the request and URLs to identify the resources on which the requested operation should be run, in the true spirit of REST.
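The distinction can be made concrete with a sketch. The resource paths below are hypothetical, loosely modelled on an S3-style storage API rather than copied from any real service; they show how a genuinely RESTful interface expresses operation semantics through HTTP methods and resource URLs, while a "plain HTTP" interface tunnels the operation name through the URL.

```python
# In the true spirit of REST, the HTTP method carries the semantics
# and the URL identifies the resource (hypothetical paths):
rest_requests = [
    ("GET",    "/buckets/photos/objects/cat.png", "read the object"),
    ("PUT",    "/buckets/photos/objects/cat.png", "create or replace the object"),
    ("DELETE", "/buckets/photos/objects/cat.png", "remove the object"),
    ("GET",    "/buckets/photos/objects",         "list objects in the bucket"),
]

# A "plain HTTP" API, by contrast, typically tunnels everything through
# GET/POST and encodes the operation in the URL or request body:
plain_http_request = ("GET", "/api?method=photo.delete&id=cat.png")

for method, url, meaning in rest_requests:
    print(f"{method:6} {url:38} -> {meaning}")
```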

Google, one of the major Web 2.0 companies, developed its own generic protocol for reading and writing data on the Internet, called GData (Google, 2010). It is based on RSS (2003), Atom (RFC 4287) and the Atom Publishing Protocol (RFC 5023) and allows data updates, too. Google uses this home-made protocol in most of its APIs.

Service | Messaging model(s) used
Google Maps | plain HTTP
Flickr | plain HTTP, SOAP, XML-RPC
YouTube | GData
Twitter | REST, plain HTTP
Amazon Product Advertising API | plain HTTP, SOAP
Last.fm | plain HTTP, XML-RPC
del.icio.us | plain HTTP
Yahoo Search | plain HTTP
Google Gadgets | plain HTTP
Amazon S3 | REST, SOAP
Digg | plain HTTP
Amazon EC2 | plain HTTP, SOAP
Google AdSense | SOAP
Google App Engine | not applicable
Google Chart | plain HTTP
Wikipedia | plain HTTP
Eventful | plain HTTP
Google Ajax Feeds | plain HTTP
Twilio | REST
Google Calendar | GData, CalDAV
FeedBurner | plain HTTP

Table 3.3: Messaging models used by the reviewed Web 2.0 services

For some services, there are multiple messaging models listed in Table 3.3, and this situation will recur in the remaining parts of the analysis. There are two possible reasons for this, not distinguished in the table. Firstly, many Web 2.0 services offer multiple APIs, each providing a specific functionality. For example, the Google Maps service has in fact five different APIs (Google, 2011). When this is the case, the messaging models of all the service APIs are mentioned in the table. Secondly, a service may offer multiple semantically equivalent APIs. The two main reasons for this are the freedom of choice for service client developers and – especially in newly established services which are trying to attract users away from competitive services – copying the API of some rival service in order to make the transition between services easier (e.g. ma.gnolia.com offered a del.icio.us-like API). Again, all the available messaging models are mentioned in the table when this is the case.

3.2.3 Data formats

The data is transferred in HTTP responses in various formats including service-specific XML, standardized XML such as KML (OGC, 2008), JSON (RFC 4627), CSV, binary data such as PNG (ISO/‌IEC 15948:2004) and programming language-specific formats such as serialized PHP.
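How little the choice of format matters to the logical content can be shown with a short sketch. The payload below is a hypothetical response, not mirroring any real service's API; it renders the same data in service-specific XML and in JSON and parses both to identical structures.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical payload as a service might return it in
# service-specific XML and in JSON (neither mirrors a real API):
xml_response = """<photos>
  <photo id="42" title="Sunset" owner="alice"/>
  <photo id="43" title="Harbour" owner="bob"/>
</photos>"""

json_response = """{"photos": [
  {"id": 42, "title": "Sunset", "owner": "alice"},
  {"id": 43, "title": "Harbour", "owner": "bob"}
]}"""

# Both formats parse to the same logical content:
from_xml = [(int(p.get("id")), p.get("title"))
            for p in ET.fromstring(xml_response).findall("photo")]
from_json = [(p["id"], p["title"])
             for p in json.loads(json_response)["photos"]]

assert from_xml == from_json
print(from_xml)   # [(42, 'Sunset'), (43, 'Harbour')]
```

Offering several formats therefore costs the service provider a serialization layer, while letting each client pick whatever its platform parses most conveniently.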

It is not exceptional for a service API to support multiple formats, letting service client developers choose which format they want to receive data in. For example MediaWiki, the technological platform of Wikipedia, supports as many as eight generic output formats (see Table 3.4).

Service | Data format(s) used
Google Maps | custom XML, JSON, KML, CSV
Flickr | XML (custom/XML-RPC/SOAP), JSON, serialized PHP
YouTube | GData
Twitter | custom XML, JSON, ATOM
Amazon Product Advertising API | custom XML
Last.fm | XML (custom/XML-RPC)
del.icio.us | custom XML, JSON
Yahoo Search | custom XML, JSON
Google Gadgets | XML (custom/RSS/ATOM), JSON, text
Amazon S3 | custom XML
Digg | custom XML, JSON, JavaScript, serialized PHP
Amazon EC2 | custom XML
Google AdSense | not applicable
Google App Engine | not applicable
Google Chart | PNG
Wikipedia | multiple JSON-based, multiple PHP-based, WDDX, XML, YAML
Eventful | custom XML
Google Ajax Feeds | custom XML, JSON
Twilio | custom XML, JSON, CSV, pre-formatted HTML
Google Calendar | GData
FeedBurner | custom XML

Table 3.4: Data formats used by the reviewed Web 2.0 services

In addition to the full-fledged APIs listed in the table, the RSS and Atom syndication formats are very popular for data retrieval. Services often not only offer pre-defined RSS/Atom channels, but also provide their users with the possibility of defining their own channels with data selected by user-specified criteria.
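Such syndication channels are the simplest read-only form of programmatic access. The feed below is a minimal hypothetical RSS 2.0 channel, not taken from any real service; parsing it needs nothing beyond a standard XML library.

```python
import xml.etree.ElementTree as ET

# A minimal hypothetical RSS 2.0 channel of the kind many reviewed
# services offer alongside (or instead of) a full API:
rss = """<rss version="2.0"><channel>
  <title>Recent uploads</title>
  <item><title>Sunset</title><link>http://example.com/42</link></item>
  <item><title>Harbour</title><link>http://example.com/43</link></item>
</channel></rss>"""

channel = ET.fromstring(rss).find("channel")
titles = [item.findtext("title") for item in channel.findall("item")]
print(channel.findtext("title"), titles)
```

The uniform channel/item structure is exactly what makes feeds attractive for lightweight integration: one parser serves every feed-publishing service.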

3.2.4 Authentication

Authentication is a crucial issue when integrating a particular service into an independent environment. Usually, Web 2.0 services offer some portion of their functionality without the need to authenticate – typically at least some RSS/Atom feeds. One can also find services which do not use any authentication at all, because they just publicly and statelessly process client-provided inputs to outputs without needing to know who the client actually is. A typical example of such a service is Google Chart, transforming client-provided data into graphical charts in PNG or GIF image formats. But for exploiting the full potential of a service, some type of authentication is required in most cases. Generally, each of the reviewed services can be classified into one of the following categories:
• the service does not require any authentication at all
• the service authenticates the client application, i.e. each client application has a unique API key associated with it and relevant API requests have to be associated with a specific API key somehow
• the service authenticates the user, i.e. the service maintains an account for each service user and relevant API requests have to be associated with a specific user account somehow
• the service authenticates both the user and the client application, i.e. a combination of the previous two

The reasons for associating API requests with a specific service user are straightforward – assigning requests to the right user account and access control. Associating API requests with a specific client application is used primarily for accounting purposes – service providers may want to know, and possibly limit, who develops applications using their APIs and for what purposes. For example, the Dropbox service approves only mobile client applications at the time of writing (08/2010). It also allows service providers to set limits on the usage of their API (e.g. to one query per second or 1,000 queries per day for a registered application, as SlideShare does) or to ban a misbehaving client application. Another possible source of the need to authenticate client applications may be a user authentication scheme. Some user authentication schemes, most notably OAuth, require the client application to be registered with the service provider, because a shared secret required for future user authentications is established during the registration process.
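In its simplest form, client-application authentication just means attaching the application's API key to every request. The sketch below shows this with a key carried as a query parameter; the key value and endpoint are hypothetical, and some services use a custom HTTP header instead.

```python
from urllib.parse import urlencode

API_KEY = "d41d8cd98f00b204"   # hypothetical application key

def keyed_url(base, **params):
    """Client-application authentication in its simplest form: every
    API request carries the application's API key, here as a query
    parameter (other services expect it in a custom HTTP header)."""
    params["api_key"] = API_KEY
    return base + "?" + urlencode(sorted(params.items()))

# Hypothetical search request, for illustration only:
url = keyed_url("http://api.example.com/search", q="brno", page=2)
print(url)
```

A scheme this weak identifies the application but authenticates nothing cryptographically, which is why services with higher stakes move to shared-secret signatures or OAuth.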

Service | User authentication | Client application authentication
Google Maps | none | no
Flickr | browser-based | yes
YouTube | OAuth, browser-based, custom HTTP-based | no
Twitter | OAuth, HTTP Basic | no
Amazon Product Advertising API | digital signature (shared secret) | no
Last.fm | browser-based | yes
del.icio.us | OAuth, HTTP Basic | no
Yahoo Search | none | yes
Google Gadgets | OAuth (for accessing remote content) | not applicable
Amazon S3 | digital signature (shared secret) | no
Digg | OAuth | no
Amazon EC2 | WS-Security, digital signature (shared secret) | no
Google AdSense | custom HTTP-based | no
Google App Engine | yes, but details are not published | not applicable
Google Chart | none | no
Wikipedia | custom HTTP-based | no
Eventful | custom HTTP-based | yes
Google Ajax Feeds | none | yes
Twilio | HTTP Basic | no
Google Calendar | OAuth, browser-based, custom HTTP-based | no
FeedBurner | browser-based, custom HTTP-based | no

Table 3.5: Authentication methods used by the reviewed Web 2.0 services

Current Web 2.0 services use a wide range of techniques for authenticating their users and client applications (see Table 3.5). The methods range from standard solutions, such as OAuth, WS-Security or HTTP Basic, to completely home-grown solutions. These include sophisticated mechanisms based on shared secrets and digital signatures as well as the simple inclusion of the credentials in a specific HTTP header or in the HTTP request body.

A relatively novel approach to authentication in the web environment is the OAuth protocol (RFC 5849), which is gaining more and more popularity and adoption. It allows users to authenticate to the service and grant temporary access rights to a third-party application without revealing their credentials to that application. This perfectly fits the needs of Web 2.0 mashups and constitutes a great improvement in confidentiality. In our original study, we encountered several home-grown protocols providing possibilities similar to OAuth, and in most cases they are still available (look for the “browser-based” item in Table 3.5). But two years later, they are largely being deprecated in favour of OAuth as the standardized solution. At the time of writing this thesis, a new version of the OAuth specification (v2.0) is being finalized, supported and implemented by major Web 2.0 players such as Facebook, Microsoft, Twitter or Yahoo.
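The cryptographic core of OAuth 1.0 is HMAC-SHA1 request signing. The sketch below follows the RFC 5849 scheme in simplified form (it assumes unique parameter names and omits full parameter normalization); all credentials and the URL are hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def oauth_signature(method, url, params, consumer_secret, token_secret=""):
    """Simplified HMAC-SHA1 request signing in the style of OAuth 1.0
    (RFC 5849). The signature base string concatenates the HTTP method,
    the request URL and the normalized parameters; the signing key
    combines the shared secret established at client registration
    with the token secret."""
    enc = lambda s: quote(str(s), safe="")
    normalized = "&".join(f"{enc(k)}={enc(v)}" for k, v in sorted(params.items()))
    base_string = "&".join([method.upper(), enc(url), enc(normalized)])
    key = f"{enc(consumer_secret)}&{enc(token_secret)}"
    digest = hmac.new(key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

# Hypothetical credentials and request, for illustration only:
sig = oauth_signature(
    "GET", "http://api.example.com/photos",
    {"oauth_consumer_key": "key", "oauth_nonce": "abc",
     "oauth_timestamp": "1300000000", "oauth_token": "tok"},
    consumer_secret="secret", token_secret="tokensecret")
print(sig)
```

Because only the signature travels with the request, the consumer secret itself is never transmitted – the property that makes OAuth safer than handing user credentials to a third-party mashup.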

When accessing services’ web interfaces through the web browser, classical HTML-form-based inputs for user login and password remain ubiquitous. However, adoption of digital identity initiatives, most notably OpenID (2007), is constantly rising. Even large web sites and/or Web 2.0 services provided by AOL, BBC, Facebook, Flickr, Google, IBM, MySpace, PayPal, WordPress or Yahoo support OpenID, usually in the role of an OpenID provider and by allowing an existing account to have an OpenID associated as an alternative login method.

3.3 Developer support

The basic support for service client developers is the service API documentation. This can be found in a variety of forms and extents with every service offering a public API. Basically, the list of available operations is provided together with a detailed description of the request and response formats. Publishing the communication protocol gives developers maximum freedom in choosing the technology for building client applications.

However, dealing with the – in most cases HTTP – requests and responses directly is error-prone and inconvenient. To address this issue, some services go even further and provide libraries for accessing their API using high-level constructs in common programming languages such as Java, JavaScript, PHP, Ruby, Python, or .NET. Such libraries are often open-source and co-developed by service users, which is beneficial for both the service provider and the service users. Where such libraries are not provided officially, third-party products can usually be found easily for popular APIs and programming languages. In this context, the “third party” usually means some enthusiastic service user(s). Table 3.6 provides specific examples of high-level programming languages for which vendor- and community-provided libraries exist.
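What such a wrapper library buys the developer can be sketched in a few lines: request construction and response parsing disappear behind a high-level call. The service name, endpoint and response shape below are hypothetical, and the HTTP layer is stubbed out so the sketch stays self-contained.

```python
import json

# Sketch of what an API wrapper library does for the developer: it
# hides URL construction and response parsing behind high-level calls.
# Endpoint and response shape are hypothetical, not any real API.
class PhotoServiceClient:
    def __init__(self, api_key, transport):
        self.api_key = api_key
        self.transport = transport   # callable: url -> raw response body

    def recent_photos(self, count=10):
        url = (f"http://api.example.com/photos/recent"
               f"?api_key={self.api_key}&count={count}")
        return [p["title"] for p in json.loads(self.transport(url))["photos"]]

# A fake transport standing in for the HTTP layer:
def fake_transport(url):
    return '{"photos": [{"title": "Sunset"}, {"title": "Harbour"}]}'

client = PhotoServiceClient("d41d8cd9", fake_transport)
print(client.recent_photos(count=2))   # ['Sunset', 'Harbour']
```

The client code never touches URLs, keys or JSON – exactly the convenience that makes vendor- and community-provided libraries so widespread.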

Service | Vendor-provided | Community-provided
Google Maps | ActionScript, JavaScript | JavaScript
Flickr | none | ActionScript, C, ColdFusion, Common Lisp, cUrl, Delphi, Java, .NET, Objective-C, Perl, PHP, Python, REALbasic, Ruby
YouTube | Java, .NET, Objective-C, PHP, Python | ActionScript, Ruby
Twitter | ActionScript | ABAP, ActionScript, C++, Clojure, ColdFusion, .NET, Eiffel, Erlang, Java, JavaScript, Lasso, Objective-C, Perl, PHP, PL/SQL, Python, Ruby, Scala, T-SQL
Amazon Product Advertising API | Android, iOS, Java, .NET, PHP | Java, JavaScript, Perl, PHP, Python, Ruby
Last.fm | none | ActionScript, C#, C++, Java, JavaScript, .NET, Objective-C, Perl, Python, PHP, Ruby
del.icio.us | none | Common Lisp, C#, Java, .NET, PHP, Perl, Python, Ruby
Yahoo Search | none | ActionScript, Java, JavaScript, Perl, PHP, Python
Google Gadgets | JavaScript | none
Amazon S3 | Android, iOS, Java, .NET, PHP | ActionScript, C, C#, ColdFusion, .NET, Perl, Python, Ruby
Digg | ActionScript, PHP, Python | Java, JavaScript, .NET, Perl, PHP, Python, Ruby
Amazon EC2 | Android, iOS, Java, .NET, PHP | C#, ColdFusion, JavaScript, Perl, Python, Ruby
Google AdSense | C#, Java, PHP, Python | PHP, Ruby
Google App Engine | Java, Python | none
Google Chart | none | ABAP, C#, Grails, Java, Perl, PHP, Python, R, Ruby
Wikipedia | none | Delphi, Haskell, Java, JavaScript, .NET, Perl, PHP, Python, Ruby, Tcl
Eventful | ColdFusion, .NET | Emacs Lisp, Java, JavaScript, Perl, PHP, Python, Ruby
Google Ajax Feeds | JavaScript | none
Twilio | C#, Java, PHP, Python, Ruby | C++, ColdFusion, Java, .NET, Perl, PHP
Google Calendar | C#, Java, JavaScript, PHP, Python | ActionScript, Ruby
FeedBurner | none | PHP, Python, Ruby

Table 3.6: Examples of programming languages for which one can find libraries for accessing APIs of the reviewed Web 2.0 services

3.4 Legal aspects

Each of the reviewed services provides its users with a document named “terms of use” or “terms of service” (ToS), although it is not always easy to find. The style of these documents ranges from a list of clear and simple explanations to complex documents spanning 10 or more printed pages. The latter is typical especially for services provided by larger companies (Google, Yahoo etc.). Some services are aware of this difficulty and provide users with a simplified version of their ToS, explaining selected aspects in a more comprehensible way. Google is the only service provider offering some ToSs in Czech.

These legal documents deal primarily with two points. First, they specify in which ways and for what purposes the service can and/or must not be used by its users. Second, they define the relation between the service provider and a service user, especially with respect to the data provided by the user.

3.4.1 Terms of usage

Although each service has its specific ToS, some aspects are common to all the reviewed services. Firstly, services are, not surprisingly, sensitive about user accounts – they require that accounts not be created in any automated way, allow accounts to be created for humans only, and forbid sharing accounts among multiple people. Secondly, services are protective of their data and therefore explicitly prohibit any kind of systematic harvesting or indexing of their content. The third common point is software copyright. It is not allowed to copy, reproduce, alter, modify, reverse engineer or create derivative works from any of the services. In addition, some services explicitly prohibit incorporating their graphical user interface into other applications or accessing it via any automated means (generally called “web scraping”). But even when these activities are not explicitly mentioned, they are probably violations of the previously mentioned rule. To the best of my knowledge, the only service directly supporting the incorporation of its GUI into another application is Zoho, which allows using the Writer, Sheet and Show applications from its office suite as editors for the appropriate document formats.

Naturally, there are also aspects in which service providers are not so uniform. For example, when considering commercial usage of a service, one can encounter all possible approaches – absolute prohibition (SlideShare), full permission (Amazon services) and the obligation to obtain an explicit written permission from the service provider in advance (Diigo, DivShare, Google Calendar, MySpace,…).

All services require user-supplied content not to violate copyrights and not to contain vulgarity, nudity, racism or other improper or illegal material. For some services, this is the only limitation. Other services go further and restrict the data type in accordance with the service’s intention – to personally taken photos (Flickr), for example. Another example of content limitation can be found in the Google Base service, which allows textual data to be in English and German only. One can also be surprised by limitations on the service users themselves, e.g. Digitalbucket requires its users to be at least 18 years old.

In services with a published API, one can often find special statements dealing with the usage of the API. This is not surprising, because a public API provides an easy way to abuse the service or to use it in forbidden ways on a massive scale. The legal statements and agreements are often accompanied by technologically enforced constraints protecting the service from being overloaded by API calls. Musser (2007) stated that the vast majority of APIs listed in the ProgrammableWeb directory impose some limitations on how much they can be used. At the same time, he identified 12 distinct ways of doing so, each actually used by some existing Web 2.0 service. In most cases, the number of requests within a given time frame is restricted. For example, the Yahoo Search service allows each client application to make no more than 5,000 requests daily and del.icio.us requires a one-second delay between subsequent requests.
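A well-behaved client can respect a del.icio.us-style minimum delay with a small client-side throttle. The figures come from the text above; the implementation itself is only a sketch (a daily quota like Yahoo Search's 5,000 requests would additionally need a counter with a rolling window).

```python
import time

class Throttle:
    """Client-side throttle enforcing a minimum delay between
    subsequent API requests, as e.g. del.icio.us requires (one
    second between calls). The clock and sleep functions are
    injectable so the behaviour can be tested without waiting."""
    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self._last = None          # time of the previous request, if any

    def wait(self):
        """Block until at least `min_interval` has passed since the
        previous call; then record the current time."""
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
        self._last = self.clock()

# Usage before every API call:
#   throttle = Throttle(min_interval=1.0)
#   throttle.wait(); send_request(...)
```

Throttling on the client side keeps the application within the published limits and avoids the bans that services reserve for misbehaving clients.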

3.4.2 Content licensing

Many Web 2.0 services are data-centred. The data can be provided by the service itself or, much more frequently, by its users. Many services just manage data provided or created by their users. The rules for handling the data are therefore very important. Generally, content licensing should be dealt with in two directions – with respect to the service provider and with respect to other service users. Data access restrictions for other service users and/or the public can also be considered part of the problem.

As a general rule, services do not claim ownership of data provided by users. On the contrary, they disclaim responsibility for the data in order to protect themselves from accusations of publishing law-infringing content. On the other hand, services need to have special rights to the content due to their functionality – e.g. image sharing services would not be able to create thumbnails without the right to modify the supplied pictures. At this point, services’ conditions differ significantly – some services claim solely the rights necessary for providing their functionality, while others claim much wider rights, e.g. the right to use supplied non-private data for advertising or promotion.

Data privacy is another important criterion. All the reviewed services treat personal data and data labelled as “private” as confidential and reveal them without the submitter’s approval only under very serious conditions, explicitly named in their ToS.

Almost all services allow their users to restrict access of other users to supplied data using sophisticated authentication and authorization mechanisms. These usually include private access (only owners can access the data), public access (everybody on the Internet can see or even modify the data) and also some settings in between. Approaches to data sharing vary a lot.

Licensing user-supplied published data for other service users is not considered by many services and is typically left up to users. Just a few services (e.g. Flickr and Slideshare) deal with this type of licensing systematically, allowing users to assign one of the pre-defined copyright licenses based on Creative Commons (2007) to every piece of data.

4. Proposed integration framework

“Web 2.0 Platform” is a contribution to the field of Web 2.0 service integration. Its basic idea is to deliver the functionality, actually provided by remote Web 2.0 services with various APIs, to its client applications through a well-defined, service-neutral interface. This way, it facilitates composition of individual Web 2.0 services to support complex functional constructs. Moreover, it bridges the traditional SOA and the upcoming Web 2.0 paradigms, described earlier in Sections 2.3 and 2.4, by making it possible to seamlessly integrate Web 2.0 services into any SOA. The source code is freely available on SourceForge6.

The development process was inspired by the Unified Process methodology and the sections of this chapter correspond with the disciplines defined therein. Although developing an abstract service-oriented framework differs significantly from a standard object-oriented development project, where UP is typically used, the basic ideas and workflows of the methodology are flexible enough to be used in other contexts too. For describing the platform, UML terminology and diagrams are used throughout this chapter. Recommendations for using UML in a UP-driven development given in (Arlow, 2005) are followed where possible.

4.1 Requirements

A natural first step in developing any product is collecting requirements. In software products, requirements can generally be divided into functional and non-functional. Functional requirements are statements of what the system should do, whereas non-functional requirements represent constraints placed on the system. A more detailed categorization of requirements can be derived from the FURPS model, introduced in (Grady, 1992), which defines the following software quality attributes: functionality, usability, reliability, performance and supportability. Each identified requirement will be given a specific code for referencing it in the following text and, conversely, for making it easy to find the sections concerned with addressing a specific requirement.

As a basis for specifying the requirements, three actors, i.e. roles interacting with the platform, were identified:
• Client application, making use of the platform. This can be virtually any kind of application, ranging from simple clients, providing nothing but a graphical user interface for using the platform-provided application logic, to enterprise-level SOA-based information systems.
• Client application developer, who needs to know the details about what functionality the platform provides to be able to develop a client application.
• Web 2.0 service, providing a specific piece of application logic to the platform.

Functionality requirements are primarily domain-specific, but architecturally significant functional requirements, such as security, auditing, printing or licensing, also belong to this category in the FURPS model:
• F1: The platform shall provide its client applications with application logic based on the functionality delivered to the platform by the backing Web 2.0 services.
• F2: The platform shall provide client application developers with information about the platform-provided application logic, sufficient for building client applications.

6 http://sourceforge.net/projects/web2platform/

• F3: The platform shall manage user-specific information necessary for dealing with the backing Web 2.0 services on behalf of the user (login and password in most cases, but not limited to these).
• F4: The platform shall be able to manage user accounts. User account management can also be external to the platform, in which case this functionality can be disabled. Still, the platform shall be able to work on its own and provide some fallback option for user account management and authentication when no external mechanism is employed.
• F5: The platform shall support various user authentication mechanisms such as a local user database, OpenID and LDAP.
• F6: The platform shall provide a level of security currently considered sufficient for protecting non-critical personal data.

Usability requirements are concerned with the user interface and user-related aspects in general. In our case, client application developers and Web 2.0 service integrators can be considered “users” of the platform and the following requirements, rather general at this stage, were formulated with respect to them:
• U1: The platform shall enable easy development of client applications, meaning that beyond mere technological feasibility, it should also be easy and straightforward for developers to develop a client application accessing the platform.
• U2: The platform shall enable easy integration of additional Web 2.0 services, meaning that beyond mere technological feasibility, it should also be easy and straightforward for developers to integrate additional Web 2.0 services into the platform (including both development and deployment phases).

Reliability requirements reflect the fact that the overall reliability of the platform is inevitably limited by the reliability of the backing Web 2.0 services:
• R1: The platform shall be fault tolerant with regard to Web 2.0 services’ APIs, meaning that the platform has to be designed so that it is capable of recovering from short-term temporary unavailability of a Web 2.0 service API.

Performance is not a crucial issue for the platform. Unless processing inside the platform were extremely inefficient, the interaction with Web 2.0 services’ APIs will remain the most limiting factor for the overall responsiveness of the platform. Large numbers of concurrent client application requests are also not expected at this stage. The performance requirements are therefore quite relaxed and tailored for satisfying dozens of concurrent users:
• P1: The platform shall be able to process 30 client application requests per second on a commodity desktop PC. Note that this limit does not cover processing the request by a remote Web 2.0 service, which is of course completely out of the control of the platform.

Supportability requirements imposed on the platform are concerned primarily with its extensibility and compatibility. For general-purpose middleware systems such as the platform, these are the fundamental requirements:
• S1: The platform API shall be consumable by a wide range of client applications, meaning that it should be technologically feasible to access the platform from various kinds of client applications, running on various kinds of devices (web, desktop, mobile).
• S2: The platform shall be able to communicate with a wide range of Web 2.0 services, meaning that it should be technologically feasible to integrate typical Web 2.0 services into the platform (based on the analysis in Chapter 3).

When following the UP methodology, functional requirements are captured and visualised using use-case diagrams. The use-case diagram in Figure 4.1 shows the above-mentioned high-level functional requirements. The only refinement is that it assumes organization of the platform-provided application logic into services and operations.

[Figure content: the Web 2.0 Platform boundary encloses the use cases Manage user accounts, Authenticate user, Manage user's credentials, Call service operation and Get information about available services and their operations; the actors are Client application, Web 2.0 service and Client application developer.]

Figure 4.1: Web 2.0 Platform use-case diagram

4.2 Analysis and Design

The purpose of the analysis and design discipline is to transform the system requirements into a design of the system-to-be, evolve a robust architecture of the system and adapt the design to match the implementation environment (Shuja, 2008).

When considering appropriate software architecture to satisfy the formulated functional requirements, the microkernel architectural pattern (Buschmann et al., 1996; Buschmann et al., 2007) was selected as an optimal basis to start from, primarily because of its pluggable nature, its separation of a minimal functional core from extended functionality and its adaptability to changing interfaces. The microkernel pattern is known and used especially in the field of operating systems (Hydra, Mach, L4,…), but it can be useful in other software systems as well. It defines three basic kinds of participating components (see Figure 4.2):
• Microkernel, the main component, implementing just a minimal functional core, primarily the communication facilities. It also serves as a socket for plugging in extensions and coordinating their collaboration.
• Internal servers, realizing the functionality of the system. They are not visible to clients and are only accessed through the microkernel.
• External servers, offering user interfaces and APIs to clients by using the microkernel. They are the only way for clients to access the microkernel architecture.

[Figure content: a User interacts with an External Server (API), which uses the Microkernel (route_request, register_svr, unregister_svr) to reach an Internal Server implementing the system functions.]

Figure 4.2: Microkernel architectural pattern (Buschmann et al., 2007)

For the Web 2.0 platform context, the original microkernel pattern was modified slightly. The platform was meant to provide neither GUIs nor multiple APIs, so there is no need to accommodate multiple external servers. Instead, the platform satisfies the requirements S1 and U1 by providing a single well-defined, generally usable API. As a result, the following architectural elements are defined for the platform:
• a core, covering the microkernel and the only external server of the platform. It is responsible for delivering the required functionality to client applications through an API as well as for delivering a suitable description of the published API to client application developers.
• a connector, acting in the role of an internal server. It offers a service-neutral API, whose operations are realized by accessing a particular Web 2.0 service API, much in the spirit of the Object adapter pattern (Buschmann et al., 2007).
• a component composing the functionality provided by one or more connectors into higher-level functional constructs. This is certainly a different functional construct than in the case of a connector as described in the previous point. However, in the end they both play the same role of an internal server, providing some functionality to client applications through the platform core. The way of doing so should be completely transparent and there should be no difference between calling a simple functionality, provided directly by a connector, and calling a complex composed functionality, provided by this type of component. For this reason, this type of component is called a connector too. To differentiate between the two connector types when necessary, the terms primary connector (the one providing its functionality by calling a particular Web 2.0 service API) and secondary connector (the one providing its functionality by calling other connectors) will be used.

Using this terminology, a meaningful Web 2.0 platform instance has to consist of a core, at least one primary connector and an arbitrary number of secondary connectors. Figure 4.3 shows an example of such a configuration.

[Figure content: a Client application communicates with the Core inside the Web 2.0 Platform; the Core delegates to two Secondary Connectors and three Primary Connectors, and each Primary Connector accesses the API of a Web 2.0 Service.]

Figure 4.3: Example Web 2.0 Platform architecture (logical component diagram)

4.2.1 Platform core

The core is a central part of the platform, although it provides no application logic on its own. It has to be designed to be generic and extensible so that it can be used with any connector set and client application without modifications. Connectors, although being developed and built independently, become parts of the platform once deployed and shall communicate with the platform core natively for performance reasons. The compatibility between the platform core and connectors shall be achieved by adhering to simple predefined interfaces on both sides. Client applications, on the other hand, are external to the platform and as little as possible should be assumed about them. As the platform is targeted primarily to enterprise environments, the Web services technological stack (HTTP/SOAP + WSDL) is a natural choice for achieving this. Employing the Web services technology for the client application API also naturally addresses the S1 and U1 non-functional requirements.

The platform core is responsible for ensuring all the non-functional requirements. This includes usability, reliability, performance and supportability as specified in Section 4.1. In addition, it is involved in addressing all the functional requirements as well. Specific use-cases for the platform core (see Figure 4.4) are derived from the overall Web 2.0 Platform use-case diagram (see Figure 4.1) by elaborating individual use-cases and excluding connectors, which, from the point of view of the platform core, can be considered external actors. To recapitulate, platform core use-cases cover the following functional areas:
• user account management (which may or may not be employed in the end, F4);
• management of users’ credentials for accessing Web 2.0 services (F3);
• user authentication (F5);
• mediation of service operation requests and responses between client applications and connectors (which together with the concept of connector addresses F1);
• publishing information about available services and their operations (F2).

[Figure content: the Web 2.0 Platform Core boundary encloses the use cases Authenticate user, Manage user accounts, Manage user's credentials (with Set user's credentials and Get the description of required credentials), Call service operation (with a Notify extension point), Get WSDL of a particular service and Get the list of available services, serving the Client application and Client application developer actors; Manage connectors serves the W2P Administrator; Get user's credentials, Announce user logout and Get the list of all users involve the W2P Connector.]

Figure 4.4: Web 2.0 Platform core use-case diagram

4.2.1.1 User account management

The platform core has to be able to provide both client applications and connectors with the list of its users to facilitate some specific application logic, e.g. team building. However, read-only access is sufficient to achieve this and user account management may be external to the platform as long as the platform core is able to get the information it needs. Still, to be able to work on its own, the platform also offers its own simple user account management facility. When it is enabled, the platform core allows users to manage their accounts, which is not possible when external user account management is in place.

Almost every serious multi-user application has to differentiate between multiple types of users. To facilitate this, the platform core takes user roles into account. A user can have multiple roles assigned and the assignment takes place during the user authentication procedure (see Section 4.2.1.3). Although user roles can be used for arbitrary purposes inside connectors, their primary purpose is authorization.

[Figure content: entities User (LOGIN, AUTH_TYPE, SECRET), Role (NAME), Connector (NAME), UserDataRequest (ID, TYPE, VALIDATION, PROMPT) and UserData (VALUE); a User has Roles and provides UserData, a Connector defines UserDataRequests, and a UserDataRequest is satisfied by UserData.]

Figure 4.5: Logical ERD for user account management and management of users’ credentials

4.2.1.2 Management of users’ credentials for accessing Web 2.0 services

Primary connectors need to obtain some data from each user to be able to access the particular Web 2.0 service API on their behalf. Typically, this includes a login and password for accessing the Web 2.0 service, but other or additional information may be required as well (e.g. the BibSonomy bookmarking service uses a generated API key instead of a password). Described from the point of view of the platform core, the management of users’ credentials for accessing Web 2.0 services involves several issues to be addressed.

First, the platform core has to be able to inform client applications about the kind of data each primary connector needs to obtain from a user in order to work for that user. Client applications will in turn be able to prompt users to provide such data. The platform core therefore needs to be aware of the needs of each primary connector. This is ensured by the connector registration procedure (see Section 4.2.1.7).

Second, users have to be able to provide and possibly update the data required by the primary connectors through a client application. This is not a big design issue, but it is a very delicate phase from the privacy point of view. It is mandatory that the transfer of credentials is secured and that platform users supply their credentials to the platform core only through trusted client applications. Otherwise, a malicious client application may disclose the credentials. To facilitate updating the credentials, client applications are allowed to partially retrieve a user’s credentials from the platform. For security reasons, only credentials not described as password-like items should be revealed to client applications. Typically, a login can be read by a client application, but not a password or an API key.
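The filtering rule can be sketched in a few lines of Python. The field names (`id`, `password_like`) are illustrative assumptions, not the platform's actual schema:

```python
# Sketch of the partial credential retrieval described above: items whose
# data-request description marks them as password-like are withheld from
# client applications. Field names are illustrative, not the real schema.

def visible_credentials(stored, data_requests):
    """Return only the credential items a client application may read."""
    hidden = {r["id"] for r in data_requests if r.get("password_like")}
    return {key: value for key, value in stored.items() if key not in hidden}

data_requests = [
    {"id": "login", "password_like": False},
    {"id": "api_key", "password_like": True},
]
stored = {"login": "alice", "api_key": "s3cr3t"}
print(visible_credentials(stored, data_requests))  # prints {'login': 'alice'}
```

Only the login is revealed; the API key, being password-like, never leaves the platform core.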

Third, while processing an operation request, primary connectors need to be able to obtain the credentials required to act on behalf of the user who invoked the operation.

4.2.1.3 User authentication

User authentication is a crucial factor for ensuring trust and privacy. With respect to security, client applications are considered untrustworthy. Under certain settings, one can limit access to the platform to known and trusted client applications, e.g. by setting up a network firewall. But generally, the platform is meant to be open and freely accessible, so it has to be designed so that one cannot break its security using a malicious client application. Connectors, on the other hand, are considered trusted because they are internal to the platform and added by the platform administrator only.

User authentication has two purposes. Firstly, the platform core serves as an authentication engine for multi-user client applications. When a user tries to log in to a multi-user client application, it should not accept or refuse the attempt on its own. Instead, the client application should initiate an authentication process involving the platform, and accept the authentication result, announced back by the platform core. This simplifies client applications, permits employing various user authentication mechanisms and even permits employing different authentication mechanisms for different users.

Secondly, it serves for authenticating incoming invocations of sensitive operations, i.e. service operation calls and calls related to credentials management. Each incoming sensitive operation request has to be authenticated to preclude impersonation and protect the platform against malicious client applications. Note that the above-mentioned user authentication service for client applications has in fact no effect on the security of the platform. If a malicious or incorrectly written client application allows a user to log in without, or despite, the decision of the platform, the user, although logged into the client application, will still not be able to invoke any sensitive operation in the platform.

Authenticating incoming invocations of sensitive operations has to be fast enough not to delay the operation processing and shall not include any additional user interactions. Basically, the platform core shall just verify the credentials attached to an incoming request, either on its own or by employing an external authentication service such as LDAP. On the other hand, the initial authentication process, performed during user login to a client application, can be more demanding. This is where remote, 3rd party authentication systems such as OpenID may come in. The only requirement is that the authentication process, although being initiated by the client application, allows the platform core to verify reliably whether the authentication succeeded or not.

Based on this rationale, the platform core shall support two distinct authentication methods. Note that in both cases, subsequent platform requests are authenticated uniformly by using a user login and a secret, be it a user-provided password or a generated token:
• Direct authentication, where a client application retrieves authentication data (login and password) from the user, verifies them with the platform and, when they are accepted, uses them to authenticate subsequent platform requests.
• External authentication, where a client application just initiates an authentication process involving an external authentication provider and the platform core. As a result of this process, the authentication either fails or, when successful, the platform core verifies the success and eventually reveals to the client application the user’s login and a generated token for authenticating their subsequent platform requests (see Figure 4.6 for a typical message flow).
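The uniformity of the two methods can be sketched as follows. This is a minimal illustration assuming a local user database; all names are hypothetical, and a real deployment would use salted password hashing and expiring tokens:

```python
import hashlib
import hmac
import secrets

# Illustrative sketch of the two authentication methods; not the
# platform's actual implementation.
USERS = {"alice": hashlib.sha256(b"wonderland").hexdigest()}
TOKENS = {}  # login -> token generated after successful external authentication

def verify_direct(login, password):
    """Direct authentication: check the user-provided password."""
    digest = hashlib.sha256(password.encode()).hexdigest()
    return hmac.compare_digest(USERS.get(login, ""), digest)

def issue_token(login):
    """Called by the core once an external (e.g. OpenID) authentication
    has been verified; the token is revealed to the client application."""
    token = secrets.token_hex(16)
    TOKENS[login] = token
    return token

def verify_request(login, secret):
    """Subsequent requests are authenticated uniformly: the secret may be
    either the password or a previously issued token."""
    return verify_direct(login, secret) or hmac.compare_digest(
        TOKENS.get(login, ""), secret)
```

Whichever way the session was established, `verify_request` is the single check applied to every sensitive operation request.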

Authentication is closely related to user account management. External authentication makes little sense when an external user account management mechanism that also provides authentication facilities, such as LDAP, is in place. However, both authentication methods make sense in combination with the local user database (see Section 4.2.1.1). In such a case, users can be authenticated either directly or using a 3rd party authentication provider such as OpenID.

[Figure content: sequence diagram of messages among the User, a Web-based client application, an OpenID provider and the W2P Core: 1 Login using OpenID; 1.1 Authentication request; optionally 2–3 user authentication at the OpenID provider (if not yet authenticated there); 4 Authentication assertion; on a positive assertion, 5–6 verification of the assertion; 6.1/6.1.1 Authentication result propagated back to the user.]

Figure 4.6: External user authentication (using OpenID)

A separate issue, still related to user authentication, is logging users out. It is natural that users should finish their session with a multi-user client application by logging out of it. But ideally, there should be no need to log users out of the platform. This holds for the platform core, which is completely stateless. Unfortunately, for connectors this is not the case. The reasons for this will be discussed later in Section 4.2.2.2, but in brief – primary connectors may want to keep some cached data. To prevent wasting memory, the cached data should be discarded as soon as they are no longer needed. Announcing user logout to all the interested connectors so that they can clean their caches is an important part of this cleanup process (together with time-based methods).

4.2.1.4 Request processing

Each incoming request specifies a target service name and a target operation name. First, it is processed by the platform core. In some specific cases, namely authentication, user account management and management of users’ credentials, the platform core processes and answers the request on its own. But for most requests, the platform core just takes the role of a router and forwards the request to an appropriate connector to be answered (see Figure 4.7).

Generally, each request processing may result either in a successfully formulated and sent out response or in a fault. Faults may be of various kinds, may have various causes and may be detected at various stages of request processing. To name just the most common pitfalls:
• badly formulated or logically incorrect request;
• erroneous request processing in the platform core or in a connector;
• missing or incorrect credentials for accessing a Web 2.0 service API;
• unreachable or unresponsive Web 2.0 service API.

Note that faults can also occur due to factors which are external to the platform and completely out of its control, such as the network connection with a remote Web 2.0 service API. For this reason, every operation without exception can possibly fail and this fact has to be clearly declared in the description of each operation.

When a fault occurs, it is up to the platform core to deal with it. In most cases, there is no reasonable recovery and the platform core can just deliver a notification back to the client application. However, there is one exception to this rule – when a Web 2.0 service API is unreachable or unresponsive, it may possibly help to retry the operation a while later.
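This one recoverable fault class can be handled by a simple retry helper, sketched below in Python. The exception type, parameters and retry policy are illustrative assumptions, not the platform's actual fault-handling code:

```python
import time

class ServiceUnavailable(Exception):
    """Raised when the remote Web 2.0 service API cannot be reached.
    Illustrative exception type, not part of the actual platform."""

def call_with_retry(operation, attempts=3, delay=0.5, sleep=time.sleep):
    """Invoke `operation`; on ServiceUnavailable, wait a while and retry.
    Any other fault is propagated immediately, since no reasonable
    recovery exists for it."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ServiceUnavailable:
            if attempt == attempts:
                raise  # still failing, report the fault to the client
            sleep(delay)
```

Passing a custom `sleep` makes the helper testable without real delays; in production the default `time.sleep` applies the back-off.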

4.2.1.5 Notifications

Under certain circumstances, an operation or a couple of operations need to be invoked in reaction to a successful invocation of another operation. Typically, this is the case with initialization and finalization procedures. Consider, for example, a project management application built using the Web 2.0 platform. In such an application, it may be necessary to perform some initializations in various connectors in reaction to the creation of a new project – a web site, calendar, shared file storage, discussion forum and possibly other items should be created together with the project. At first sight, this issue may seem to be addressed by the concept of a secondary connector, introduced in Section 4.2. Although it is certainly a possible solution, creating a secondary connector for founding and removing projects together with all the related items is not a good design. Such a secondary connector would have to be changed each time a new project-related functionality is added or removed. To address this issue, the concept of notifications was introduced to the platform. It allows the dependency to be configured the other way round by introducing a publish&subscribe communication model with permanent subscriptions. Each connector interested in creating or deleting a project can register itself as a listener to these events during the connector registration procedure and it will be notified by the platform core when they happen (see Figure 4.7).

[Figure content: sequence diagram in which a Client application sends a request (1) to the W2P Core, which forwards it (1.1) to the target Connector for processing (1.2 Process Operation); if the operation is processed successfully, the core invokes notify() (2) on each registered listener Connector (Process Notification) before the response (3) is returned.]

Figure 4.7: Processing a connector request with notifications
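The publish&subscribe model with permanent subscriptions can be illustrated by a minimal Python sketch. The class, event names and the project-management scenario are illustrative; in the platform itself, subscriptions are declared during connector registration:

```python
# Sketch of the notification mechanism: connectors subscribe to named
# events, and the core notifies them after a successful operation.
class NotificationHub:
    def __init__(self):
        self._listeners = {}  # event name -> list of listener callbacks

    def subscribe(self, event, listener):
        self._listeners.setdefault(event, []).append(listener)

    def publish(self, event, payload):
        """Notify every registered listener of the event."""
        for listener in self._listeners.get(event, []):
            listener(payload)

hub = NotificationHub()
log = []
hub.subscribe("project.created", lambda name: log.append("calendar for " + name))
hub.subscribe("project.created", lambda name: log.append("forum for " + name))
hub.publish("project.created", "Thesis")
print(log)  # prints ['calendar for Thesis', 'forum for Thesis']
```

Adding a new project-related connector then means adding one subscription, with no change to the publisher side.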

4.2.1.6 Publishing service descriptions

The Web 2.0 platform is meant to be open in the sense that anyone can build a client application communicating with it. Client application developers are therefore not supposed to be able to access the platform in any way other than through its client interface. This makes publishing a list of available services and descriptions of their operations a crucial part of the platform interface, enabling development of client applications.

In Section 4.1, organization of the platform-provided application logic into services and operations was presumed. This makes sense from a client application point of view. But inside the platform, application logic is delivered by connectors. Therefore, a natural mapping between services and connectors is used – a set of operations provided by a single connector will be published as a set of operations of a single service for client applications. In addition, there are also a few services provided by the platform core itself (authentication, user account management and management of users’ credentials).

To achieve this, the platform core needs to be aware of all the connectors as well as of the details of the operations provided by each connector. This is accomplished through the connector registration procedure (see Section 4.2.1.7). For the reasons mentioned in Section 4.2.1, WSDL is used for describing the services and their operations. An advantage of WSDL is the clear division of a service description into abstract and concrete definitions. This allows connectors to remain deployment-agnostic by providing just the abstract definitions of their interface (types, messages and a portType). The platform core completes the description by adding the concrete definitions, reflecting the deployment of the platform, before it is published.
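The completion step can be sketched with standard XML tooling. In this illustrative fragment, the element names follow WSDL 1.1, but the service name and endpoint URL are invented, and the SOAP binding that a real completion would also add is omitted for brevity:

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

# Minimal abstract WSDL as a connector might provide it (illustrative).
ABSTRACT = (
    '<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" '
    'targetNamespace="urn:w2p:example" name="Example">'
    '<portType name="ExamplePortType"/></definitions>'
)

def complete_wsdl(abstract_wsdl, service_name, endpoint_url):
    """Append concrete service/port/address elements reflecting the
    deployment; a real completion would also add a binding."""
    root = ET.fromstring(abstract_wsdl)
    service = ET.SubElement(root, f"{{{WSDL_NS}}}service", {"name": service_name})
    port = ET.SubElement(service, f"{{{WSDL_NS}}}port",
                         {"name": service_name + "Port"})
    ET.SubElement(port, f"{{{WSDL_NS}}}address", {"location": endpoint_url})
    return ET.tostring(root, encoding="unicode")

concrete = complete_wsdl(ABSTRACT, "ExampleService",
                         "http://localhost:8080/w2p/example")
```

The connector thus stays deployment-agnostic, while the published description carries the concrete endpoint address.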

By convention, the WSDL description of a remote web service should be published at a specific URL and accessible by an HTTP GET request. This facilitates the creation of client applications in common IDEs, which can generate the service client code automatically in such a case.

4.2.1.7 Connector management

The platform administrator should be able to add and eventually remove connectors as needed. The platform core needs to be aware of the current connector set for several reasons, most of which were already mentioned:
• to be able to provide client application developers with the list of all available services together with the description of their interfaces;
• to be able to provide client applications with the descriptions of the credentials required by current connectors;
• to be able to announce user logout to the interested connectors;
• to be able to realize connector notifications.

To achieve this, any connector has to be registered with the platform core before it is put into operation and the registration has to be cancelled before the connector is removed. During the registration procedure, the following information has to be provided to the platform core:
• a connector identification;
• abstract WSDL;
• requirements for user-provided credentials;
• required notifications;
• whether it is interested in user logouts or not.
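The registration payload above can be expressed as a simple data structure. The field names and the example values are illustrative; the platform exchanges this information in its own format:

```python
from dataclasses import dataclass, field

# Sketch of the connector registration payload; field names are
# illustrative, not the platform's actual schema.
@dataclass
class ConnectorRegistration:
    connector_id: str
    abstract_wsdl: str                # types, messages and portType only
    credential_requests: list = field(default_factory=list)
    required_notifications: list = field(default_factory=list)
    wants_logout_events: bool = False

registration = ConnectorRegistration(
    connector_id="bookmarks",
    abstract_wsdl="<definitions>...</definitions>",
    credential_requests=[{"id": "login"}, {"id": "api_key"}],
    required_notifications=["project.created", "project.deleted"],
    wants_logout_events=True,
)
```

From this single record, the core can derive everything listed above: the service description to publish, the credential prompts, the notification subscriptions and the logout announcements.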

4.2.2 Connectors

As already mentioned, a connector is a pluggable component making it possible to satisfy the F1 requirement by providing a specific piece of application logic to the platform and subsequently to its client applications. The overall scope of the functionality provided by a platform instance is determined by its connectors. This makes the Web 2.0 platform generic in the sense that its instances can provide virtually any kind of service-level functionality to their client applications.

The original idea behind connectors is that they are just proxies for accessing external Web 2.0 services. Although they can do some necessary pre- or post-processing, they deliver their functionality primarily by calling external services. But for the platform to be useful as a part of a service-oriented environment, it shall also be able to offer higher-level, coarse-grained functional constructs. To address this issue, the concept of a secondary connector was introduced and the original connectors were given the “primary” label. Secondary connectors are higher-level components delivering their functionality by calling another connector or even multiple connectors and processing the results using built-in orchestration logic.

The division of connectors into primary and secondary ones is of no importance to either the platform core or client applications. The interface between a connector and the platform core remains the same, so as not to introduce any difference between calling an operation provided by a primary connector and one provided by a secondary connector. Another benefit of this uniformity is that secondary connectors need not rely only on primary connectors: since primary and secondary connectors are invoked in exactly the same way, nothing prevents a secondary connector from calling other secondary connectors.

Figure 4.8: Web 2.0 Platform primary connector use-case diagram (use cases: Process operation, Process notification, Announce user logout; actors: W2P Core and a Web 2.0 service)

To simplify connector development and address the U2 requirement, the functional requirements imposed on connectors are kept at the necessary minimum. Figure 4.8 shows the use cases for a primary connector. For secondary connectors, the only difference is in the actor on the right-hand side of the diagram: another connector is there instead of a Web 2.0 service.

The only hard requirement imposed on a connector is that it has to be able to process the operations declared in the description of its interface. For primary connectors, this includes interacting with a remote Web 2.0 service API. Depending on the design of the Web 2.0 service and connector APIs, processing a single operation in the connector can result even in a whole series of Web 2.0 service API invocations. Typically, the originating user's credentials are required for accessing a Web 2.0 service API. Provided that they were entered by the user, primary connectors can retrieve the credentials from the platform core (see Section 4.2.1.2). A primary connector shall also be prepared for missing or invalid credentials and fail gracefully in such a case.
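The credential handling described above can be sketched as follows. This is a minimal illustration, not the platform's actual code: the `CredentialStore` interface stands in for the platform-core lookup of Section 4.2.1.2, and the request/response strings stand in for normalized JBI messages:

```java
// Illustrative sketch of a primary connector processing one operation:
// fetch the user's credentials from the platform core and fail
// gracefully when they are missing. All names are hypothetical.
public class PrimaryConnectorSketch {

    /** Stand-in for the platform-core credential lookup (Section 4.2.1.2). */
    interface CredentialStore {
        String credentialsFor(String user); // null when the user entered none
    }

    /** Processes one operation; returns a fault message instead of throwing. */
    public static String processOperation(CredentialStore core, String user, String request) {
        String credentials = core.credentialsFor(user);
        if (credentials == null) {
            // graceful failure: report missing credentials to the caller
            return "<fault>credentials missing for user " + user + "</fault>";
        }
        // here the connector would issue one or more Web 2.0 service API
        // calls on behalf of the user, possibly a whole series of them
        return "<response user='" + user + "'>" + request + " done</response>";
    }
}
```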

Figure 4.9: Processing an operation in a primary connector (sequence: the W2P Core invokes the connector; if the credentials are not already known, the connector requests them from the core; a loop of service API requests and responses to the Web 2.0 service follows)

So far, this describes a primary connector; for secondary connectors, notifications come into play. When a connector is registered with the platform core as a notification target, it has to be able to process incoming notifications (see Section 4.2.1.5). After identifying the event that caused a notification to be sent, processing the notification typically means just invoking one or more of the published operations with specific parameters. Although in theory both primary and secondary connectors can be registered as notification targets, this is much more typical for secondary connectors: primary connectors should be kept as generic as possible, with application-specific functionality realized in secondary connectors.
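The "identify the event, then invoke an operation" step can be reduced to a simple lookup. The following sketch uses hypothetical event and operation names; in the platform, the invocation would go through the bus rather than return a string:

```java
import java.util.Map;

// Minimal sketch of how a secondary connector might process an incoming
// notification: identify the event and map it to one of its published
// operations. Event and operation names are illustrative.
public class NotificationSketch {

    /** Maps event identifiers to the operation that should be invoked. */
    static final Map<String, String> EVENT_TO_OPERATION = Map.of(
            "document.created", "indexDocument",
            "document.deleted", "removeFromIndex");

    /** Returns the operation to invoke for the event, or null to ignore it. */
    public static String resolveOperation(String event) {
        return EVENT_TO_OPERATION.get(event);
    }
}
```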

Lastly, when a connector has indicated that it is interested in user logouts, it has to be able to accept announcements thereof (see Section 4.2.1.3). Whether user logouts are of interest depends on the design of the connector. Typically, this is the case for primary connectors, which may want to employ caching techniques to limit the number of interactions with the remote Web 2.0 service, either for performance reasons or because of the usage limitations imposed by some Web 2.0 service APIs. To avoid wasting memory, cached data should be discarded when no longer needed, and a user's logout is a good indicator of such a situation.

4.2.2.1 Impersonation

Applications built upon the platform typically store most of their data in remote Web 2.0 services. This raises a fundamental question – what service account should be used for storing a particular piece of data? Answering this question is a crucial step in designing any application backed by Web 2.0 services. Unfortunately, there is no single solution, optimal for all cases. The right solution depends primarily on the level to which the capabilities of the underlying Web 2.0 services meet the needs of the implemented application logic.

Theoretically, using just the account of the current user, i.e. of the user who initiated the operation, could be enough. In reality, however, one can easily end up in a situation where the underlying Web 2.0 service does not fully satisfy the needs of the implemented application. Access rights are a typical example. In many present-day applications it is quite usual for multiple users to have full rights to some object. However, this is quite rare in the services considered during this work. Typically, although an object owner can grant write access to other users, the rights of those users are still somewhat limited compared with the rights of the original object owner (e.g. they cannot delete the object, manage its sharing settings or un-share it with the original owner). Data access settings are also quite coarse-grained in current Web 2.0 services. Typically, just two pre-defined levels with service-dependent detailed semantics are available (reading and writing), which may not fit the needs of the particular application.

For these reasons, secondary connectors may need to be able to invoke an operation of another connector under an identity which differs from that of the current user. Although this may seem controversial from the security point of view, note that there is no way to trigger the impersonation from a client application and that connectors are considered trusted parts of the platform (see Section 4.2.1.3).
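The essence of impersonation is that the effective identity is chosen by the trusted connector, never by the client. A sketch, with a hypothetical `ConnectorInvoker` interface standing in for the bus:

```java
// Sketch of the impersonation facility: a secondary connector invokes an
// operation of another connector under an identity other than the current
// user's. Names ("storage", "system-account") are illustrative.
public class ImpersonationSketch {

    /** Stand-in for invoking a connector operation over the bus. */
    interface ConnectorInvoker {
        String invoke(String connector, String operation, String asUser);
    }

    /** Stores a shared item in the system-wide account instead of the caller's. */
    public static String storeShared(ConnectorInvoker bus, String currentUser, String item) {
        // the current user triggered the call, but the data lands in a
        // designated system account (see Section 4.2.2.1); the originating
        // user is only recorded for auditing
        return bus.invoke("storage", "put:" + item + ";onBehalfOf=" + currentUser,
                          "system-account");
    }
}
```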

4.2.2.2 Design considerations

The design of individual connectors cannot be enforced in any way; it is completely up to their authors. As long as a connector satisfies the functional requirements and is registered with the platform core, it can be used in the platform. However, there are several general points worth noting.

Utilising a remote Web 2.0 service API inevitably impacts the performance of any primary connector. Predictably, interaction with a remote service over the network will be by far the slowest part of the processing. In addition, many Web 2.0 services oblige their client applications not to overburden the service, or even impose hard limits on the usage of their API (see Section 3.4.1). The number of interactions with Web 2.0 services should therefore be kept as low as possible. Generally, the number of necessary interactions with a Web 2.0 service API can be lowered by caching, which makes it possible to process some requests locally. For true REST-based APIs, simple transparent network-level caching can be employed without any impact on the connector. But more often than not, the caching logic has to be based on a deep understanding of the semantics of individual service API requests, and it has to be implemented by the connectors themselves. Connectors that perform any kind of caching should be interested in user logout events, because these allow them to safely clear the cached data associated with a given user.
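A per-user cache with logout-triggered eviction, as discussed above, can be sketched as follows. This is a simplified illustration (no expiry, no invalidation on writes), not the caching logic of any real connector:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of a per-user cache a primary connector might keep to reduce
// Web 2.0 service API traffic, discarded on the user's logout.
public class ConnectorCacheSketch {
    private final Map<String, Map<String, String>> perUser = new HashMap<>();

    /** Answers from the cache when possible, otherwise calls the remote API. */
    public String get(String user, String key, Function<String, String> remoteApi) {
        return perUser.computeIfAbsent(user, u -> new HashMap<>())
                      .computeIfAbsent(key, remoteApi);
    }

    /** Invoked when the platform core announces the user's logout. */
    public void onUserLogout(String user) {
        perUser.remove(user); // the cached data will not be needed any more
    }

    public int cachedEntries(String user) {
        return perUser.getOrDefault(user, Map.of()).size();
    }
}
```

With this shape, repeated requests for the same key cost a single remote invocation per login session, which also helps stay within the API usage limits mentioned above.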

When designing a connector interface, one should consider the optimal granularity of its operations. Generally, fine-grained operations can always be composed into higher-level, coarse-grained operations using a secondary connector. The opposite process is much harder, since it requires redesigning an existing connector and possibly splitting it into two separate connectors. Therefore, it is recommended to design the operations of primary connectors to be relatively fine-grained. This yields better reusability of the primary connector, and higher-level constructs can be added by means of secondary connectors as needed. In addition, it is highly recommended to design the interfaces of primary connectors so that they consist of generic operations arising from the nature of the application domain, instead of simply mirroring the API of the underlying Web 2.0 service. Keeping the connector interface generic makes a future replacement of the underlying Web 2.0 service a local change that does not affect the rest of the system.

As already mentioned, secondary connectors can make use of other secondary connectors. However, building overly complex connector dependency structures this way is not recommended. A strictly hierarchical structure of secondary connector dependencies is preferable; it should be sufficient in most cases and precludes introducing unexpected, hard-to-manage or even cyclic dependencies.

4.2.3 Design

So far, the description of the platform has been kept almost technology neutral. The only technology-related decision made up until now referred to the external servers proposed by the microkernel architectural pattern. It was decided that instead of having multiple external servers, the platform will offer just a single external server and that it will use the Web Services technological stack to ensure a high level of compatibility and easy development of client applications. This section introduces the technological basis the platform is built on.

Employing an Enterprise Service Bus (ESB, see Section 2.3.2.3 for details) was already presumed in the research question formulated at the very beginning of this thesis, but it proves to be a good choice in its own right. The platform, and especially its core, can benefit from the features ESB implementations usually provide, namely transport protocol conversion, message routing, message transformation, security, monitoring and management capabilities, as well as their pluggable nature. Employing an ESB facilitates fulfilling some functional requirements (e.g. F5 and F6). In addition, some non-functional requirements can be satisfied simply by configuring the ESB environment (e.g. R1).

Figure 4.10: ESB-based design of the Web 2.0 platform (client applications #1…#i access the ESB through a security layer; the core, secondary connectors #1…#n and primary connectors #1…#m sit on the bus, with the primary connectors calling Web 2.0 services #1…#m)

As already mentioned in Section 2.3.2.3, ESB is just an abstract concept and various implementations have been developed to date, mostly as proprietary, closed-source, commercially licensed software. For realizing an ESB inside the Web 2.0 Platform, the Java Business Integration technology (JBI, see Section 2.3.2.3 for details) was selected. The selection was motivated by two main reasons – firstly, JBI is a stable, standardized technology and secondly, mature open-source implementations of the specification exist. JBI itself provides a standardized message format, message routing facilities and hot-deployability of individual components. In addition, it provides message-level security through the Java Authentication and Authorization Service (JAAS), a generic pluggable Java security framework (Oracle, 2010).

Basic building blocks of any JBI-based application are service units (SU). Each service unit is targeted at a specific JBI component, which provides a host environment for the SU and determines its realization. JBI differentiates two kinds of JBI components – binding components and service engines. Binding components are gateways for communicating outside the ESB. Their task is to transform protocol-specific communication (e.g. HTTP, FTP, JMS, JDBC, CORBA, …) into XML-based JBI messages and vice versa. Service units targeted at binding components usually just configure the built-in functionality of the binding component. Service engines, on the other hand, are used for providing application logic inside the ESB, and service units deliver that logic in a component-specific way (e.g. Java bean, XSLT, BPEL, rules, …). JBI adopts the service model defined in WSDL 2.0: each SU defines one or more services, each implementing one or more operations.

The platform core consists of several services, each implementing a specific subset of the functionality described throughout Section 4.2.1. There are services formulating answers to incoming requests on their own, as well as services just modifying the requests and/or responses before passing them on, in the spirit of the “Pipes and Filters” architectural pattern (Buschmann, 2007). Namely:
• UserDataManager, responsible for managing user credentials;
• UserAccountManager, responsible for managing user accounts;
• ConnectorManager, responsible for providing the information obtained during connector registration;
• Router, responsible for routing incoming messages and notifications to appropriate connectors and applying an appropriate redelivery strategy to satisfy the R1 requirement;
• Notifier, responsible for invoking notifications;
• WSDLCompleter, responsible for completing abstract WSDL descriptions provided by connectors before they are published.

Figure 4.11: Web 2.0 platform core services – ConnectorManager, UserDataManager, UserAccountsManager, WSDLCompleter, Notifier and Router (operations shown include getConnectorList, getWSDL, getUserDataRequests, getNotificationSubscribers, getLogoutConnectors, getUserDataForConnector, getUserData, updateUserData, createAccount, getAllUsers, login and logout)

In addition, there are services responsible for exposing the published API to client applications and their developers. As already mentioned, communication with client applications is realized by the SOAP protocol over HTTP. In fact, the encrypted HTTPS protocol is used to provide the desired level of confidentiality (F6), as advocated by WS-I. Information for the development of client applications is available through plain HTTP GET requests (see Figure 4.12).

Figure 4.12: Retrieving the description of a connector interface (a client application developer issues an HTTP GET to the WSDLProvider, which calls getWSDL(connectorName) on the ConnectorManager; the returned abstract WSDL is completed into a full WSDL by the WSDLCompleter and returned to the developer)

The most frequent, most important and most complicated message flow is naturally that of calling a connector operation. Parts of it were already described in Figure 4.7, emphasising notifications, and Figure 4.9, emphasising communication with a remote Web 2.0 service. Now, the processing of connector requests inside the platform core can be described in more detail, as shown in Figure 4.13.

Figure 4.13: Processing a connector request inside the platform core (a client application sends a SOAP/HTTPS request; the Router passes it towards the connector, optionally obtaining the user's credentials via UserDataManager.getUserDataForConnector(); when the operation is processed successfully, the Notifier retrieves the subscribers via ConnectorManager.getNotificationSubscribers() and notifies each of them in a loop)

Individual connectors have to be designed so that each connector exposes a single JBI service implementing all the operations of its declared interface and, optionally, also the operations for processing notifications and user logout announcements. Every connector is identified uniquely by its name, which also has to be the name of the associated JBI service. The JBI services of all connectors share the same namespace, identified by the URI “http://www.fi.muni.cz/web2platform/connectors”. This prevents the existence of two connectors with the same name within a single Web 2.0 Platform instance and facilitates routing messages to connectors, because the full JBI service identification can be derived from a connector name.

Note that user authentication and authorization are not mentioned in this section. The reason is that neither of these is realized as a service in the JBI environment and their realization is specific to every JBI container. Connector registration and deregistration are missing for a similar reason: these operations are triggered by connector deployment and undeployment events, which are outside the scope of the JBI specification and therefore also container-specific.

4.3 Implementation

At the time of writing this thesis, there are three stable open-source products implementing the JBI specification – OpenESB, Petals ESB and Apache ServiceMix. For a detailed comparison of these three JBI implementations, see (Svoboda, 2009). In short, Petals ESB is clearly lagging behind the others in documentation, number of available components and activity of the community. OpenESB and Apache ServiceMix take rather different approaches and a simple general verdict can hardly be made. Probably the biggest advantage of OpenESB over ServiceMix is the tooling support integrated into the popular NetBeans IDE, contrasting with the Maven-based development used by ServiceMix. On the other hand, the biggest disadvantage of OpenESB compared to ServiceMix is the lack of message routing facilities. This was the primary motivation for selecting the Apache ServiceMix JBI container for the prototype implementation of the Web 2.0 Platform. Later on, this decision proved to be right also because of the stagnation in OpenESB development after its acquisition by Oracle in January 2010.

As mentioned in the previous section, not all the issues raised during the Web 2.0 platform analysis and design are addressed by the JBI specification – especially some security aspects and connector (de)registration. This section describes how these issues are dealt with in the Apache ServiceMix environment.

Connector registration and deregistration can be handled in ServiceMix by listening to lifecycle events; more specifically, service unit deployment and undeployment events are of interest here. During the deployment of each service unit, the listener checks its content and registers all connectors found. Similarly, the listener deregisters the connectors included in a service unit during its undeployment. Connector registrations are stored persistently so that they are not lost when the JBI container is stopped.
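The listener logic can be sketched as follows. Note that this is a deliberate simplification: the method names below are hypothetical stand-ins for ServiceMix's actual deployment-event callbacks, and the in-memory list stands in for the persistent registry:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the deployment-event handling described above;
// the event methods and the registry are illustrative, not ServiceMix API.
public class RegistrationListenerSketch {

    /** Connector registry; in the platform this is stored persistently. */
    public final List<String> registry = new ArrayList<>();

    /** Called when a service unit is deployed: register all connectors found. */
    public void onServiceUnitDeployed(List<String> connectorsInUnit) {
        registry.addAll(connectorsInUnit);
    }

    /** Called when a service unit is undeployed: deregister its connectors. */
    public void onServiceUnitUndeployed(List<String> connectorsInUnit) {
        registry.removeAll(connectorsInUnit);
    }
}
```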

When implementing a connector, one has to include the connector registration information in the connectors. file, stored in the service unit root directory. Otherwise, there are no limitations on the internal structure or the technology used, which addresses the S2 requirement indirectly. Out of the box, ServiceMix is able to run code written in plain Java, deployed to the servicemix-bean JBI component, and in Java-enabled scripting languages (JSR 223), deployed to the servicemix-scripting JBI component. In secondary connectors, services can be orchestrated using the declarative notation of BPEL. In cases where the descriptive power of BPEL is not sufficient, the full strength of the Java language can be used again.

4.3.1 Security

JBI relies on JAAS definitions for security features. Binding components are expected to transform protocol-specific credentials into a JAAS security subject and inject it into the normalized message. But this alone is of course not enough for securing a JBI-based application. Specifically for the Web 2.0 platform, the security aspects include:
• securing the communication between the platform and client applications;
• user authentication;
• user authorization;
• securing the stored credentials.

Communication between the platform and client applications is secured using the HTTPS protocol, which is a standard approach for securing web services, recommended by the WS-I initiative. Technically, this is achieved by configuring the services deployed to the servicemix-http binding component.

User authentication takes place in the services communicating with client applications and includes several techniques, as described in Section 4.2.1.3. From a client application point of view, both the direct authentication and the authentication of further requests are realized by the HTTP basic authentication scheme as described in RFC 2617. In ServiceMix, this includes configuring the services deployed to the servicemix-http binding component, providing a custom JAAS login module and updating the JAAS login configuration file. Custom JAAS login modules are the way to satisfy the F5 requirement, as they allow one to easily employ LDAP, Kerberos or other common authentication methods. Support for external authentication cannot be elaborated in general terms, because it is determined completely by the authentication mechanism in question. Currently, support for the OpenID authentication scheme is implemented using a specific HTTP endpoint connected to a specific JBI service.
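A custom JAAS login module follows the standard `javax.security.auth.spi.LoginModule` lifecycle. The skeleton below is a rough sketch, not the platform's actual module: the credential check is a placeholder where a real module would consult LDAP, Kerberos, a database, or another user store:

```java
import java.util.Map;
import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.login.LoginException;
import javax.security.auth.spi.LoginModule;

// Skeleton of a custom JAAS login module; the actual credential
// verification is only indicated by a placeholder check.
public class W2PLoginModule implements LoginModule {
    private CallbackHandler handler;
    private boolean succeeded;

    @Override
    public void initialize(Subject subject, CallbackHandler handler,
                           Map<String, ?> sharedState, Map<String, ?> options) {
        this.handler = handler;
    }

    @Override
    public boolean login() throws LoginException {
        NameCallback name = new NameCallback("user: ");
        PasswordCallback password = new PasswordCallback("password: ", false);
        try {
            handler.handle(new Callback[] { name, password });
        } catch (Exception e) {
            throw new LoginException("cannot obtain credentials: " + e);
        }
        // placeholder check; a real module would verify against a user store
        succeeded = name.getName() != null && password.getPassword() != null;
        if (!succeeded) throw new LoginException("authentication failed");
        return true;
    }

    @Override public boolean commit() { return succeeded; }
    @Override public boolean abort()  { succeeded = false; return true; }
    @Override public boolean logout() { succeeded = false; return true; }
}
```

Registering such a module in the JAAS login configuration file is what lets the servicemix-http endpoints authenticate incoming requests against it.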

Regarding authorization, ServiceMix is flexible enough to support custom authorization strategies. An authorization strategy defines which users or user groups can access which services or which operations on a service. This way, one can easily restrict access to a particular service or service operation when needed. In the Web 2.0 platform, authorization rules are to be configured manually by the platform administrator. Another possibility would be to include the authorization settings in connectors and load them during connector registration, but that would seriously affect the reusability of individual connectors.

Securing the credentials provided by the platform users is the key security issue and deserves special attention. Disclosing the credentials would not only compromise the security of the particular Web 2.0 platform instance and the application(s) it is running, but could also discourage users from using and supporting Web 2.0-enabled systems in general. Therefore, the user-provided credentials are stored using the “Derby” Java-based database engine, bundled with ServiceMix. The Derby database engine has several uncommon features increasing the security of the data stored therein. Firstly, when run in the embedded mode, which is the case in ServiceMix, only code running inside the same JVM can connect to the database. This practically eliminates the possibility of compromising the running database, because malicious code would have to be run inside ServiceMix. Secondly, to protect the data stored on disk, Derby supports encrypting the data using a variety of reliable algorithms (e.g. 3DES, AES, Blowfish).
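Derby's on-disk encryption is enabled through connection URL attributes. The helper below only assembles such a URL; the database name and boot password are placeholders, and actually opening the connection would of course require the Derby embedded driver on the classpath:

```java
// Sketch of how the credential store could be opened: an embedded,
// encrypted Derby database. The database name and boot password here
// are placeholders, not the platform's real configuration.
public class CredentialDbSketch {

    /** Builds a JDBC URL for an embedded Derby database encrypted on disk. */
    public static String encryptedDerbyUrl(String dbName, String bootPassword) {
        return "jdbc:derby:" + dbName
             + ";create=true"
             + ";dataEncryption=true"                    // encrypt the on-disk data
             + ";encryptionAlgorithm=AES/CBC/NoPadding"  // one of the supported algorithms
             + ";bootPassword=" + bootPassword;          // key for booting the database
    }
}
```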

4.4 Deployment

Deployment of the Web 2.0 platform is determined by the selected technological basis. As a JBI-based application, it is deployed to the JBI container as a set of service units packaged in service assemblies. Service assemblies are just logical containers for the deployment of service units; the platform core is packaged in a single service assembly. Once the platform core is deployed successfully, connectors can be added or removed as needed. Whether connectors are deployed using a single large service assembly or split into multiple service assemblies is of no importance to the platform core.

The platform core needs to store some data persistently (see Figure 4.5 for a logical ERD). As already mentioned, the ServiceMix JBI container is distributed together with the “Derby” relational database engine, so it is used for this purpose. However, any JDBC-enabled relational database could be used as well.

Optionally, reliability and performance of the platform can be improved by using multiple cooperating instances of the JBI container. ServiceMix supports both high availability, i.e. automatic switching to another identical container instance in case of an abnormal termination, and clustering, i.e. utilising multiple interconnected physical computers for running a single logical ServiceMix instance to share and balance the computational load. These two features are completely independent so they can be used either separately or together as needed.

By default, ServiceMix uses an in-memory flow based on the Staged Event-Driven Architecture (SEDA), as introduced by Welsh (2002), to convey messages on the bus. The SEDA flow provides maximum performance, but ServiceMix also supports JMS- and JCA-based flows, each providing some specific additional features. In an enterprise environment, the JMS flow, where a persistent message queue is used for each JBI endpoint, may be an optimal choice. It increases robustness, as it allows ServiceMix to continue its work from the point where it was interrupted in case of an unexpected failure. This is especially important for platform instances involved in asynchronous communication with other enterprise systems.

Figure 4.14: Basic Web 2.0 platform deployment (a web client talks to a client application hosted on an application server; the client application calls the Web 2.0 Platform core and connectors running inside Apache ServiceMix, which in turn access the remote Web 2.0 services; a relational database on a separate database server provides persistent storage)

4.5 Choosing services for integration

Previous sections described the proposed framework itself. But no framework is useful on its own: every software framework is just a reusable abstraction providing generic functionality, which needs to be extended with specific functionality to be useful for end users. In the Web 2.0 platform, this is the purpose of connectors.

Although multiple connectors can be deployed in a single package and dependencies between connectors may exist, every connector is primarily a self-contained binary component with its own analysis, design, implementation and deployment phases. The interface of every connector has to meet a few requirements for the platform core to be able to work with it. These requirements stem from the design of the framework and were already mentioned in the previous sections, particularly in Section 4.2.2. On the other hand, there are no restrictions imposed on the way a connector implements its interface.

As already mentioned, connectors need not necessarily employ a remote service to deliver their functionality. Still, such a case is the primary concern of this dissertation and of the proposed framework. When developing a new connector built upon a Web 2.0 service, the selection of the underlying service is by far the most important design decision of the whole development process. Selecting an unsuitable service can even cause the overall failure of the implementation. This section aims to reduce the number of such failures by providing some general notes and guidelines for developers in charge of selecting the Web 2.0 service to be used as the basis of a new connector that should extend a platform instance with a given functionality.

4.5.1 Functional criteria

Connector development should always start with a requirements gathering phase in which the functional requirements for the connector are clarified. Based on these requirements, one can draw up an initial list of candidate Web 2.0 services. Various ways can be used for finding the candidate services, but the ProgrammableWeb7 service directory has proven to be a very useful starting point. It eliminates going through services without any public API and provides a high-level categorization of the services according to the provided functionality. In addition, each service record contains direct links to important resources such as the API documentation, which saves a lot of time compared with browsing the services' web sites manually.

After finding the candidate services, one should investigate them one after another and check whether the functionality of the service really matches the needs of the connector or, more precisely, whether the provided functionality makes it possible, and simple enough, to implement the functionality of the connector. Even if this is the case, one should not forget to also check the service API documentation for the list of available operations. More often than not, the functionality available through the API is just a subset of the overall functionality of a service; e.g. Facebook has no API method for creating a user group, and Slideshare does not allow editing previously specified tags via its API, although both these operations are available in the services' GUIs. Occasionally, one can also come across the opposite – Blogger allows submitting more sophisticated search queries via its API than through its web interface. In addition to the basic functionality, one should pay special attention to specific features such as data organization and data sharing. Many different approaches are used in these areas, which may or may not fit the needs of the connector and the application the connector is to be part of.

7 http://www.programmableweb.com

4.5.1.1 Data organization

Various data organization schemes are used by Web 2.0 services. In some cases, the organization is almost dictated by the nature of the problem or by the way a specific kind of data is usually perceived (e.g. files are traditionally organized into folders, which can be nested to create a tree-like structure). But in most cases, several reasonable approaches to organizing data items are possible:
• by putting them into containers which are not subject to further organization (e.g. calendar events and calendars in the Google Calendar service);
• by putting them into containers which can be nested (typical for files and folders);
• by labelling them with tags (e.g. bookmarks in the BibSonomy service);
• by both putting them into containers and labelling them with tags (e.g. files and folders in the DigitalBucket service).

This variety is not an issue when employing a single Web 2.0 service for a specific purpose. However, in more complex scenarios where multiple Web 2.0 services are used together, the combination of various approaches in one system can be confusing for both application developers and end users. Adhering to an unusual data organization scheme can also make it difficult to replace the backing service with another one. This adds the data organization scheme to the list of aspects one should consider when selecting the right Web 2.0 service. Obviously, one can always decide not to make use of all the possibilities a particular Web 2.0 service API provides. But one can hardly simulate tagging in a service supporting just container-based data organization.

4.5.1.2 Data sharing

Various data sharing models can be seen in current Web 2.0 services. This is of importance especially for collaborative environments, because any collaborative software needs to implement some kind of data sharing among the individual users of the system. The problem of data sharing is not new and has been elaborated thoroughly as a part of the Computer Supported Cooperative Work (CSCW) research field, for example in (Greif, 1987). Approaches to data sharing can be classified according to many criteria:
• whether sharing applies to individual data items, containers or both;
• whether data can be shared with individual users, named groups of users or both;
• whether a data item can be shared with other users for reading, writing or both;
• whether a data item's ownership can be transferred to another user;
• whether a data item can be fully managed by multiple users or there is just a single owner at a time;
• whether shared objects are distinguishable from private objects or not;
• whether users can see who shared a particular data item with them or not.

Surely, this list is not complete. But even the criteria mentioned result in a surprisingly large number of data-sharing settings, each slightly different from any other, making it hard or even impossible to build one approach on top of another. This is why these criteria are important for evaluating the suitability of a particular data sharing model for a particular purpose. What is more, different data sharing approaches result in different APIs and, for the same reasons as with data organization schemes, combining multiple approaches in a single environment is not advisable.

It is important to note that data sharing can in fact be realized even with services that do not support it on their own. Of course, when the Web 2.0 service used for storing the data offers some kind of data sharing through its API, the easiest way is to make use of this functionality. But even if this is not the case, each piece of shared data can still be stored in some service user account and the platform can mimic the sharing, i.e. let the authorized platform users access it.

No matter whether an underlying service supports data sharing or not, there are four generally reasonable solutions for storing the data shared by a group of platform users in remote service accounts. The data can be:
• spread among the accounts of group members (e.g. each data item in the account of its creator);
• stored in the account of a single designated group member (e.g. the group founder);
• stored in a group-specific account, i.e. in an account with multiple owners;
• stored in a global, system-wide account.

All these possibilities are perfectly feasible from the technological point of view. However, when an underlying service does not support data sharing, one may easily come into conflict with the service’s ToS, which becomes a major limiting factor here. In all the options except the group-specific account, users need to be able to access and modify data stored in an account they do not own. And using a group-specific account suffers from other problems – it requires the service to permit an account to have multiple owners and, conversely, a person to own multiple service accounts. In addition, using group-specific accounts is very inconvenient if the service does not allow creating new accounts programmatically, because a new service account has to be created for each user group. For all these reasons, implementing data sharing on top of a Web 2.0 service which does not provide such functionality on its own is problematic and more often than not virtually impossible.

Data sharing is inevitably related to data access authorization. It is important to note that data access restrictions cannot be enforced by the platform alone. The underlying service has to be able to apply the same, or possibly more restrictive, data access rules too. Otherwise, users would be able to bypass the restrictions applied by the platform by accessing the restricted data directly through the underlying service’s GUI. What is more, the reliability of a platform-based system would be seriously jeopardized if the shared data were stored in accounts users have full access to. In such a case, a user could easily break the functionality of the system by modifying the data structures and configuration managed and expected by the particular connector – either by mistake or intentionally. For this reason, storing the shared data in a global, system-wide account and granting users just the rights they really need is the most robust data-sharing setting, recommended whenever possible.

4.5.2 Non-functional criteria

There is a whole range of non-functional criteria one should consider when choosing a proper Web 2.0 service for implementing a connector, ranging from service usage limitations over API design and service level offerings to the pricing model. The following list is based primarily on the findings of the analysis of current Web 2.0 services, whose detailed results were given in Chapter 3.

First of all, one should consult the service’s terms of usage to ensure that both the service and its API can be used for the intended purpose and in the intended way. More specific guidelines can hardly be provided here because of the unlimited variety of both the intended usages and the limitations imposed by the services. Still, Section 3.4.1 provides a good overview of the points one should pay special attention to.

Privacy is a crucial aspect when a connector deals with sensitive data. One should read the service’s ToS carefully, especially the statements about the ownership of user-provided data and its licensing to the service provider. Otherwise, one could easily see their private data used in the service provider’s advertisements.

The service level agreement is another important aspect, as there are significant differences between services. Typically, it covers data security, durability and availability. Web 2.0 services provided free of charge usually give no such guarantees. However, if this is a concern, there are also Web 2.0 services that pledge to keep a high standard of the provided service. For example, Amazon states that its S3 service is designed to provide 99.999999999% durability of objects over a given year and to sustain the concurrent loss of data in two facilities. In addition, Amazon guarantees 99.9% service availability under financial penalties.

Not surprisingly, the service level agreement is closely related to the pricing model. Typically, several subscriptions with different costs are offered for a service, differentiated by service level, available functionality, volume limits (e.g. on data space) or any combination thereof. In addition, one can find services with fixed regular payments as well as services paid for on a pay-per-use basis. One should consider carefully which of these pricing models suits the intended purpose best.

The service’s application interface should be assessed from several viewpoints. More often than not, one will face some API usage limitations. Such limitations are usually not an issue for single-user applications, such as clients running on various kinds of mobile devices. But for multi-user applications, they can effectively disqualify a particular service from the competition.

Another aspect affecting the practical usability of a particular service for implementing a connector is its API design. Not all Web 2.0 service APIs we have met so far are designed for secure, seamless and effective communication. For example, in Digitalbucket, a storage-oriented Web 2.0 service, one may need to go through many request-response interactions to convert a known file name and path to the system id required for any further operation with the file. There are also APIs that communicate and authenticate users over the plain unencrypted HTTP protocol (e.g. BibSonomy), which is a significant security issue for most serious applications. Authentication is another potential source of insoluble problems. The proposed framework is designed for APIs where user authentication does not require direct interaction with the user, for two good reasons. First, it would be extremely uncomfortable for users to authenticate with every single underlying Web 2.0 service. Second, it would be up to client applications to handle the different authentication processes of the various Web 2.0 services properly, which is very demanding and completely out of the control of the framework itself. A typical example of an authentication scheme which is hardly implementable in the context of the Web 2.0 platform is OAuth (see Section 3.2.4 for details).
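To illustrate the cost of such path-to-id conversions, the following minimal sketch shows how a client must walk a path one segment at a time when an API only offers per-folder listings, paying one remote round trip per segment. The `StorageApi` interface and all names are illustrative stand-ins, not Digitalbucket’s actual API:

```java
import java.util.Map;

public class PathResolver {

    /** Minimal stand-in for a storage API that can only list one folder per call. */
    interface StorageApi {
        /** Returns name -> id mappings of the entries in the given folder (one HTTP round trip). */
        Map<String, String> listFolder(String folderId);
    }

    /** Walks the path segment by segment; each step costs a remote request-response cycle. */
    static String resolve(StorageApi api, String rootId, String path) {
        String currentId = rootId;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) continue;
            String next = api.listFolder(currentId).get(segment);
            if (next == null) {
                throw new IllegalArgumentException("Not found: " + segment);
            }
            currentId = next;
        }
        return currentId;
    }
}
```

Resolving `a/b/c.txt` thus needs three listing calls; an API accepting full paths directly would need none, which is exactly the kind of design difference worth assessing.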

From my personal experience, it is not very common to find multiple Web 2.0 services satisfying all the criteria just mentioned; more likely, some compromises have to be accepted. But when it does happen, the quality of the API documentation or the existence of a high-level library for accessing the API from the particular programming language can serve as complementary criteria. These are not hard requirements, but they may simplify and speed up connector development significantly.

5. Framework in action – Case study

To evaluate the proposed framework in nearly realistic conditions, a case study was conducted. Results of the case study cannot, of course, be overrated or generalised, but they can still reveal the strengths and weaknesses of the framework. The application domain of the case study was inspired by related research, as described in the following section.

5.1 Motivation

In 2007, Derntl and Motschnig (2007) introduced Inclusive universal access (IUA), an educational philosophy based on the technology-centric concept of Universal access, coined by Stephanidis and Savidis (2001), concerned with making IT products and services accessible to all potential users. IUA adds non-technological, human aspects that can contribute to facilitating social and personal growth of students in learning and knowledge-sharing settings. It aims to actively involve learners in all aspects of learning and assessment, to primarily address them on all levels of learning including intellect, skills, and personality, and to employ universally accessible tools to support the educational activities.

Soon afterwards, Pitner et al. (2007) showed that present Web 2.0 services can be used to support the IUA philosophy in real settings, as they cover the necessary functionality. Several specific IUA scenarios were proposed, based on the repository of learning patterns for person-centered technology-enhanced learning (Derntl, 2005), and collections of Web 2.0 services suitable for supporting each scenario were identified. The Web 2.0 Platform allows us to put these ideas into practice, i.e. to create an integrated environment for IUA in technology-enhanced learning that implements such scenarios and makes use of Web 2.0 services.

5.2 Use-cases

As in (Pitner et al., 2007), the selection of IUA scenarios to be implemented in the case study is based on the learning patterns repository compiled by Derntl (2005). The repository includes a total of 52 learning patterns of various kinds, each with a set of descriptive attributes assigned:
• from purely abstract to very specific (e.g. Interactive element vs. Questionnaire, classified by the “Level of abstraction” pattern attribute);
• from face-to-face to completely computer-based (e.g. Meeting vs. Online discussion, classified by the “Primary presence type” pattern attribute);
• from long-running processes to short activities (e.g. Project-based learning course vs. Initial meeting, classified by the “Scope” pattern attribute);
• both simple and composite, i.e. composed of other patterns.

These descriptive attributes facilitate categorization of the patterns, but they are not very helpful when implementing a supportive environment. Although patterns attributed with the “online” primary presence type are natural candidates for being supported by the platform, or IT in general, this is not a strict criterion. Despite IT not being their primary environment, patterns of the “blended” or even “present” primary presence types may still benefit from IT support (e.g. Team building, which has a blended primary presence type assigned). Fortunately, the repository itself directly denotes the patterns suitable for IT support. Twelve of the 52 learning patterns have a “web template” attached that specifies how the particular learning pattern could be supported in a web-based learning environment. Further investigation of those 12 patterns revealed that there are in fact seven basic or primitive learning patterns, and the remaining five are just compositions or specific modifications of the basic ones. The seven identified primitive learning patterns with a web template specified are therefore the final candidates for inclusion in the case study. Namely:
• Diary;
• Market;
• Online discussion;
• Questionnaire;
• Reaction sheets;
• Team building;
• Team workspaces.

In the end, the Questionnaire and Online discussion patterns were excluded because no suitable Web 2.0 service was found to support them. This is not because such services do not exist at all. For example, one can easily find many services for running online surveys and/or polls that could potentially support the Questionnaire pattern (eSurveysPro, FormSite, FreeOnlineSurveys, PollDaddy, SurveyGizmo, SurveyMonkey, Survs, Zoomerang,…). However, of all these services, only three have a published API and only one allows creating new questionnaires through the API. Unfortunately, this single service (SurveyGizmo) is paid and provides no free subscription.

A similar issue had to be solved with the Team building pattern. Although many Web 2.0 services deal with user groups in some way, managing group memberships is rarely included in the published API, probably because it is not the primary concern of any Web 2.0 service. What is more, teams and their memberships are the basis of many IUA scenarios and the respective learning patterns. Finding all members of a given team or discovering all teams a given user is a member of are therefore very frequent operations which have to be processed quickly so as not to impede the performance of other operations. For these reasons, support for the Team building pattern will be implemented locally instead of employing a remote service.

To sum up, the final set of learning patterns to be implemented in the case study includes five patterns – Diary, Market, Reaction sheets, Team building and Team workspaces. The Team building pattern will be supported by locally running code, whereas the support for the remaining patterns will be implemented using a set of selected Web 2.0 services.

5.3 Design and implementation

This section describes the realization of the individual learning patterns selected in the previous section, using the proposed framework and selected Web 2.0 service APIs. A brief description and a sequence diagram are provided for each of the implemented learning patterns. A detailed description of each pattern can be found in the pattern repository (Derntl, 2005).

The overall design is influenced by the fact that the case study is conducted to evaluate the proposed concept, not to create a full-fledged learning environment. Connectors are therefore designed to be used in a single run of a single course, and high-level organisational concepts such as a course, a semester or even a curriculum are missing. Their inclusion would bring additional complexity to connector APIs and client applications, but it would not contribute to the evaluation of the proposed framework.

5.3.1 Team building

The IUA educational philosophy promotes collaborative teamwork as it more adequately resembles real-life situations in most of today’s businesses. Building teams is therefore a necessary prerequisite for many subsequent learning activities; conversely, many learning activities rely on pre-established teams. The Team building pattern facilitates establishing team compositions by letting participants join any of the existing teams, leave their current team or found a new one (see Figure 5.1).

[Activity diagram: the instructor publishes team building requirements; participants create a team, join a team, view teams and their memberships, or leave their current team.]

Figure 5.1: Sequence of the “Team building” learning pattern, adapted from (Derntl, 2005)

The interface of the respective connector includes both the aforementioned operations for modifying the current team settings and the querying operations (see Figure 5.2). For the sake of simplicity, just a single instance of this pattern is assumed, which makes the initiating activity of publishing the requirements unnecessary. When needed, it could be realized using the Messaging connector.

«interface» TeamBuilding
+createTeam(teamName : String) : String
+joinTeam(teamId : String) : void
+leaveCurrentTeam() : void
+getAllTeams() : TeamInfo []
+getTeamMembers(teamId : String) : UserInfo []
+getCurrentTeam() : TeamInfo

TeamInfo: -id : String; -name : String
UserInfo: -login : String; -name : String

Figure 5.2: Interface of the TeamBuilding connector

As already justified, the TeamBuilding connector is implemented locally instead of employing a remote Web 2.0 service API, which makes it somewhat uninteresting for the purposes of this thesis, and we will not elaborate on it further. Data is stored in a simple relational database.
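Although the connector itself is not elaborated further here, its core invariant – each user is a member of at most one team at a time, which later connectors rely on – can be sketched as follows. This in-memory sketch merely illustrates the invariant; the actual connector stores its data in a relational database, and all names are illustrative:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative in-memory stand-in for the TeamBuilding connector's storage. */
public class TeamBuildingStore {
    private final Map<String, String> teamNames = new HashMap<>();  // teamId -> team name
    private final Map<String, String> membership = new HashMap<>(); // login -> teamId
    private int nextId = 1;

    public String createTeam(String teamName) {
        String id = "t" + nextId++;
        teamNames.put(id, teamName);
        return id;
    }

    public void joinTeam(String login, String teamId) {
        if (!teamNames.containsKey(teamId)) throw new IllegalArgumentException("Unknown team");
        membership.put(login, teamId); // replaces any previous membership: one team per user
    }

    public void leaveCurrentTeam(String login) {
        membership.remove(login);
    }

    public Optional<String> getCurrentTeam(String login) {
        return Optional.ofNullable(membership.get(login));
    }

    public List<String> getTeamMembers(String teamId) {
        List<String> members = new ArrayList<>();
        for (Map.Entry<String, String> e : membership.entrySet())
            if (e.getValue().equals(teamId)) members.add(e.getKey());
        return members;
    }
}
```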

5.3.2 Reaction sheets

Reaction sheets are a form of collecting feedback from participants. A feedback collector, typically an instructor, solicits reaction sheets and feedback providers, typically participants, are expected to respond by providing unstructured, written feedback. Subsequently, the feedback collector can review the collected feedback (see Figure 5.3).

[Activity diagram: the collector solicits reaction sheets; providers provide reaction sheets; the collector reviews the collected sheets.]

Figure 5.3: Sequence of the “Reaction sheets” learning pattern, adapted from (Derntl, 2005)

The interface of the respective connector provides the operations necessary for both feedback collectors and feedback providers (see Figure 5.4). In contrast to the realization of the Team building pattern, multiple independent instances of this pattern can potentially coexist, each having a specific purpose and targeting specific users.

«interface» ReactionSheets
+createRequest(description : String, targetUsers : String []) : String
+getAllRequests() : ReactionSheetInfo []
+getAllResponses(requestId : String) : ResponseInfo []
+getMissingResponders(requestId : String) : String []
+getPendingRequests() : ReactionSheetInfo []
+sendResponse(requestId : String, content : String) : void

ReactionSheetInfo: -id : String; -description : String
ResponseInfo: -author : String; -content : String

Figure 5.4: Interface of the ReactionSheets connector

For implementing such functionality, a Web 2.0 service capable of running online questionnaires or surveys would come in handy. Unfortunately, as already mentioned in Section 5.2, we did not succeed in finding a Web 2.0 service offering the necessary functionality through an API and providing a free subscription at the same time. The ReactionSheets connector was therefore implemented in a workable though not very elegant way, on top of the FileStorage primary connector, backed by the Digitalbucket Web 2.0 service. A specific hierarchy of folders, created in a system account and shared with individual responders for reading or writing, is used to enable all the operations while not allowing individual participants to break the expected structure of files and folders representing the requests and responses.
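One possible folder layout implementing this scheme can be sketched as follows. The `FileStorage` operations mirror the primary connector’s interface shown in Figure 5.11, but the specific layout (one writable subfolder per responder) is an illustrative assumption, not necessarily the thesis’s exact structure:

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: mapping a reaction-sheet request onto a FileStorage folder layout. */
public class ReactionSheetsLayout {

    /** Subset of the FileStorage primary connector's interface. */
    interface FileStorage {
        String createFolder(String parentFolderId, String folderName);
        void shareFolder(String folderId, String[] users, char mode);
        String createFile(String parentFolderId, String fileName, byte[] data);
    }

    /**
     * Creates requests/<requestId>/<responder>/ in the system account and shares
     * each responder's subfolder with that responder for writing only, so a
     * responder cannot touch anyone else's folder or the surrounding structure.
     */
    static Map<String, String> createRequest(FileStorage fs, String requestsRootId,
                                             String requestId, String[] responders) {
        String requestFolder = fs.createFolder(requestsRootId, requestId);
        Map<String, String> responderFolders = new HashMap<>();
        for (String responder : responders) {
            String folder = fs.createFolder(requestFolder, responder);
            fs.shareFolder(folder, new String[]{responder}, 'w');
            responderFolders.put(responder, folder);
        }
        return responderFolders;
    }
}
```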

5.3.3 Diary

According to IUA, making participants keep a diary during their learning activities initiates self-reflective thinking. In addition, it provides the instructor with an insight into the participants’ activities and allows monitoring the distribution of activities among team members. After some initial steps, participants update their diaries periodically and the instructor reviews them (see Figure 5.5).

[Activity diagram: the instructor publishes diary requirements and initializes empty diaries; participants update their diaries; the instructor reviews the updated diaries.]

Figure 5.5: Sequence of the “Diary” learning pattern, adapted from (Derntl, 2005)

The Diary pattern potentially covers both individual diaries, i.e. each participant has their own, and team diaries, shared by team members. In the end, the realization of both variants is very similar, so just the team-based scenario was implemented in the case study, managing a single diary for each team. The interface of the respective connector is very simple as there are just two operations needed – one for listing the entries of the specified team’s diary and another one for adding a new entry to the specified diary. In fact, the diary need not be specified when adding a new entry, because our implementation of the Team building pattern does not allow any user to be a member of more than one team at a time. Therefore, a user always works with their current team’s diary. If the Team building pattern was approached differently, team identification might be needed. Just for convenience, a third method was added for listing the entries of a user’s current team’s diary (see Figure 5.6).

«interface» TeamDiaries
+addEntry(date : Date, subject : String, content : String) : void
+getEntries() : EntryInfo []
+getEntries(teamId : String) : EntryInfo []

EntryInfo: -author : String; -date : Date; -subject : String; -content : String

Figure 5.6: Interface of the TeamDiaries connector

Diary initialization is implemented in the connector too, although it is not part of its published interface. Instead of being called directly, it is triggered by a notification. Notifications allow us to synchronize diary settings with team settings easily – diaries are created, deleted and reconfigured together with their respective teams.

The connector is implemented on top of the Calendar primary connector, backed by the Google Calendar Web 2.0 service. Although the primary intention of calendar services is noting and reminding of future events, they can be used for recording past ones as well. For each diary, a separate calendar is created in a system account and shared with team members for reading, which allows us to synchronize its sharing settings with team memberships. Participants can thus review or publish their team’s diary freely, but adding new entries has to be accomplished through the platform so as not to break the expected format of calendar entries.
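The delegation just described might look roughly as follows. The `Calendar` methods are taken from the primary connector’s interface (Figure 5.11), simplified here to a single start date; the wiring and all concrete names are illustrative:

```java
import java.util.Date;

/** Sketch of the TeamDiaries connector delegating to the Calendar primary connector. */
public class TeamDiaries {

    /** Simplified subset of the Calendar primary connector's interface. */
    interface Calendar {
        String createCalendar(String title, String description);
        void shareCalendar(String calendarId, String[] users, char mode);
        void createEvent(String calendarId, String title, String description, Date start);
    }

    private final Calendar calendar;

    public TeamDiaries(Calendar calendar) {
        this.calendar = calendar;
    }

    /** Triggered by a team-created notification, not called by clients directly. */
    public String initializeDiary(String teamId, String[] members) {
        String calId = calendar.createCalendar("Diary of team " + teamId, "Team diary");
        // Members may read the diary through the service's own GUI,
        // but can write to it only through the platform.
        calendar.shareCalendar(calId, members, 'r');
        return calId;
    }

    /** A diary entry becomes a (past) calendar event. */
    public void addEntry(String calendarId, Date date, String subject, String content) {
        calendar.createEvent(calendarId, subject, content, date);
    }
}
```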

5.3.4 Team workspaces

Team workspaces facilitate online collaboration among team members. Each team gets a dedicated, separate space for managing their documents. This includes creating, uploading, downloading, deleting, organizing, working on and sharing the documents. Optionally, peers’ workspaces can be accessible for viewing (see Figure 5.7).

[Activity diagram: the administrator creates workspaces; participants and the instructor manage documents and optionally view peers’ workspaces.]

Figure 5.7: Sequence of the “Team workspaces” learning pattern, adapted from (Derntl, 2005)

For the purposes of the case study, the interface of the TeamWorkspaces connector contains just a minimal set of operations necessary for managing a file-based storage by its users (see Figure 5.8). In real settings, other operations should be added for comfortable management of the storage – renaming, copying, moving, publishing, tagging etc. The overall conception of the TeamWorkspaces connector has many things in common with the TeamDiaries connector. Workspace settings are synchronized with team settings using notifications, and the fact that a user cannot be a member of multiple teams at the same time allows us to do without explicit workspace identification.

«interface» TeamWorkspaces
+createFile(parentFolderId : String, fileName : String, fileData : byte []) : String
+deleteFile(fileId : String) : void
+getFile(fileId : String) : byte []
+createFolder(parentFolderId : String, folderName : String) : String
+deleteFolder(folderId : String) : void
+listFolder(folderId : String) : FolderContent

FolderContent: -folders : FileInfo[]; -files : FileInfo[]
FileInfo: -id : String; -name : String

Figure 5.8: Interface of the TeamWorkspaces connector

The TeamWorkspaces connector is implemented on top of the FileStorage connector, backed by the Digitalbucket Web 2.0 service. Not surprisingly, the interfaces of the two connectors are very similar. Both deal with file storage management, and the difference is purely logical – FileStorage manages file storage for a single platform user, whereas TeamWorkspaces manages file storage shared by a group of platform users. The root folder of each workspace is created in a system account and shared with team members for writing. This allows participants to manage their workspaces not only through the platform, but possibly also using the web-based user interface of the Digitalbucket service.
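The purely logical difference between the two connectors can be sketched as a thin wrapper: TeamWorkspaces first resolves the caller’s (single) team to its workspace root folder and then delegates to FileStorage. The interfaces below follow the thesis; the lookup wiring and all names are illustrative:

```java
import java.util.Map;

/** Sketch of TeamWorkspaces as a thin wrapper over the FileStorage primary connector. */
public class TeamWorkspacesSketch {

    /** Subset of the FileStorage primary connector's interface. */
    interface FileStorage {
        String createFile(String parentFolderId, String fileName, byte[] data);
        byte[] getFile(String fileId);
    }

    /** Stand-in for looking up the caller's current team via the TeamBuilding connector. */
    interface TeamLookup {
        String currentTeamOf(String login);
    }

    private final FileStorage fs;
    private final TeamLookup teams;
    private final Map<String, String> workspaceRoots; // teamId -> root folder id

    public TeamWorkspacesSketch(FileStorage fs, TeamLookup teams,
                                Map<String, String> workspaceRoots) {
        this.fs = fs;
        this.teams = teams;
        this.workspaceRoots = workspaceRoots;
    }

    /** No explicit workspace id is needed: one team per user implies one workspace. */
    public String createFile(String login, String fileName, byte[] data) {
        String root = workspaceRoots.get(teams.currentTeamOf(login));
        return fs.createFile(root, fileName, data);
    }
}
```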

5.3.5 Market

Market is a simple generic scenario for exchanging and sharing information items among participants. Basically, it includes uploading, viewing and downloading the items (see Figure 5.9). Optionally, a Market could also provide support for requesting non-existent items. For better orientation, items can be organized into sections.

[Activity diagram: the administrator creates the market space; participants upload or update items, view shared items, download items, and optionally request missing items.]

Figure 5.9: Sequence of the “Market” learning pattern, adapted from (Derntl, 2005)

For the purposes of the case study, a single system-wide Market instance is assumed, divided into one or more sections. In accordance with the web template suggested for the pattern, the collected information items are limited to uploaded documents. The interface of the respective connector covers uploading, downloading and browsing market items (see Figure 5.10).

«interface» Market
+createSection(sectionName : String, description : String) : void
+getSections() : SectionInfo []
+listSection(sectionName : String) : MarketItemInfo []
+putItem(sectionName : String, itemName : String, description : String, data : byte []) : void
+getItem(sectionName : String, itemName : String) : byte []

SectionInfo: -name : String; -description : String
MarketItemInfo: -author : String; -name : String; -description : String

Figure 5.10: Interface of the Market connector

The connector is implemented on top of the FileStorage connector, backed by the Digitalbucket Web 2.0 service. Uploaded files are stored and organized in a specific folder in a system account, shared with all participants for reading. Users can therefore browse and download market content not only through the platform, but also directly through Digitalbucket’s user interface. On the other hand, uploading new documents can be done through the platform only, to preserve the storage structure.

5.4 Overall schema

With the exception of the Team building pattern, all the learning patterns are implemented as secondary connectors employing the functionality made available by a pre-established set of primary connectors. In fact, just two primary connectors were used in the end – FileStorage, backed by the Digitalbucket service, and Calendar, backed by the Google Calendar service. The interfaces of these two connectors are shown in Figure 5.11, together with the overall schema of implementation dependencies between connectors.

«interface» TeamDiaries, TeamBuilding, TeamWorkspaces, ReactionSheets, Market
(implemented by TeamDiariesImpl, TeamBuildingImpl, TeamWorkspacesImpl, ReactionSheetsImpl, MarketImpl)

«interface» Calendar
+createCalendar(title : String, description : String) : String
+deleteCalendar(calendarId : String) : void
+shareCalendar(calendarId : String, users : String [], mode : char) : void
+unshareCalendar(calendarId : String, users : String []) : void
+getAllCalendars() : CalendarInfo []
+createEvent(calendarId : String, title : String, description : String, start : DateTime, ...) : void
+deleteEvent(eventId : String) : void
+getAllEvents(calendarId : String) : EventInfo []
+rangeEvents(calendarId : String, start : DateTime, end : DateTime) : EventInfo []
+queryEvents(calendarId : String, eventTitle : String) : EventInfo []

«interface» FileStorage
+createFile(parentFolderId : String, fileName : String, fileData : byte []) : String
+deleteFile(fileId : String) : void
+getFile(fileId : String) : byte []
+createFolder(parentFolderId : String, folderName : String) : String
+deleteFolder(folderId : String) : void
+listFolder(folderId : String) : FolderContent
+shareFolder(folderId : String, users : String [], mode : char) : void
+unshareFolder(folderId : String, users : String []) : void
+getSharedFolders() : FileInfo []

CalendarInfo: -id : String; -title : String; -description : String
EventInfo: -id : String; -title : String; -description : String; -start : DateTime; -end : DateTime
FolderContent: -folders : FileInfo[]; -files : FileInfo[]
FileInfo: -id : String; -name : String

Figure 5.11: Implementation dependencies among connectors

Although they were not eventually employed for realizing the selected learning patterns, three more primary connectors were implemented while experimenting with the framework and can be accessed from client applications:
• Bookmarks, backed by the BibSonomy service, useful for managing and sharing web links;
• Tasks, backed by the Toodledo service, for assigning and managing tasks;
• Messaging, for sending emails using the SMTP protocol.

5.5 Client application

To be of any use, a learning environment needs a user interface. For the purposes of the case study, a web-based client application was developed, transforming user actions into platform invocations and presenting users with the data retrieved from the platform.

There are currently many technologies and frameworks for implementing web-based applications, but not every one of them is well suited for implementing a thin web-based user interface for a system accessed through Web services. In the end, the JavaServer Faces (JSF) technology was selected for implementing the client application for two main reasons. First, because transforming user actions into invocations of Web services is easy in managed beans. Second, because there are dozens of AJAX-enabled graphical components instantly available in component libraries like PrimeFaces, ICEfaces or JBoss RichFaces.
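A backing bean of the kind described might be sketched as below. JSF annotations (such as `@ManagedBean`) and the generated JAX-WS client stub are omitted so the sketch stays self-contained and compilable; `TeamDiariesPort`, the property names and the page bindings mentioned in the comments are all illustrative assumptions:

```java
import java.util.Date;

/** Sketch of a JSF backing bean delegating user actions to a platform Web service. */
public class DiaryBean {

    /** Stand-in for the JAX-WS client stub of the TeamDiaries connector. */
    interface TeamDiariesPort {
        String[] getEntries();
        void addEntry(Date date, String subject, String content);
    }

    private final TeamDiariesPort port;
    private String subject; // would be bound to an <h:inputText>
    private String content; // would be bound to an <h:inputTextarea>

    public DiaryBean(TeamDiariesPort port) {
        this.port = port;
    }

    public void setSubject(String subject) { this.subject = subject; }
    public void setContent(String content) { this.content = content; }

    /** Action method, referenced from the page e.g. as action="#{diaryBean.save}". */
    public String save() {
        port.addEntry(new Date(), subject, content); // one Web service invocation
        return null; // stay on the same view
    }

    /** Getter backing a data table that renders the diary entries. */
    public String[] getEntries() {
        return port.getEntries();
    }
}
```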

The most important lesson learned during the development of the client application was that any client application should be designed with relatively slow platform response times in mind. In the services we used, response times over 1 second were not exceptional (see Section 6.1 for specific numbers). Non-blocking asynchronous communication is therefore necessary for achieving an acceptable user experience. Specifically for web-based client applications, this means that AJAX should be used whenever possible.

6. Assessment of the framework

This chapter reviews the fundamental characteristics of the proposed framework. Inevitably, it has all the advantages and disadvantages inherent in any service-oriented architecture. But some features are also determined by the way the framework was designed and implemented:

Usability of the framework:
• Any Web 2.0 service whose API is accessible from the Java programming language can be integrated with the framework with more or less programming effort. Given the universality of the Java programming language, this is the case for virtually any conceivable API. Platform adopters are therefore not limited in choosing services with convenient functional and non-functional characteristics.
• The framework facilitates integration of virtually any software services, whether they have a Web 2.0 flavour or not. Therefore, it can also be used for integrating Web 2.0 services with legacy applications. The only requirement for integrating legacy systems is that they have a published API or allow integration on some lower layer (e.g. by accessing remote objects or data storage).
• Just a few requirements are imposed on client applications and the devices they run on, which allows accessing the functionality provided by a platform instance from all sorts of devices.

Versatility of primary connectors:
• Functionality offered by an external service can be modified or enhanced to some extent by the respective primary connector. But this should not be overused: choosing another service or using a fully home-grown implementation may be preferable to implementing major modifications of the functionality offered by a particular external service.
• Primary connectors need not necessarily wrap an external service. They can also answer incoming requests on their own, which can be of benefit especially for delivering very specific or critical functionality.
• Primary connectors can improve the performance of their operations by caching, although no specific support for it is provided by the framework.

Support for continuous development:
• As long as the required functional base is covered by existing primary connectors, new functionality can be added by means of secondary connectors either programmatically, using Java or some JSR 223-enabled scripting language, or in a more declarative manner using BPEL. Both methods can benefit from extensive support in current IDEs.
• External services used to deliver the functionality can be replaced relatively easily as long as the interface of the respective primary connector is designed carefully, i.e. to be application-neutral. This is very important because Web 2.0 services are not everlasting and we have already witnessed shutdowns of some of them. For example Xdrive, a storage service run by AOL, was closed in 2008 after almost one decade of existence. When switching an underlying service, a framework equipped with connectors for both services can even help with migrating the data.
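The replaceability and data-migration points above can be illustrated with a small sketch: a migration routine written purely against an application-neutral FileStorage interface works for any pair of connectors implementing it. The in-memory implementation below stands in for real service-backed connectors, and all names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: data migration between two backing services via a shared connector interface. */
public class StorageMigration {

    /** Application-neutral subset of the FileStorage primary connector's interface. */
    interface FileStorage {
        String createFile(String parentFolderId, String fileName, byte[] data);
        byte[] getFile(String fileId);
    }

    /** In-memory stand-in for a service-backed connector. */
    static class InMemoryStorage implements FileStorage {
        final Map<String, byte[]> files = new HashMap<>();
        private int next = 0;
        public String createFile(String parent, String name, byte[] data) {
            String id = "f" + next++;
            files.put(id, data);
            return id;
        }
        public byte[] getFile(String id) { return files.get(id); }
    }

    /** Copies the listed files from one backing service to the other. */
    static void migrate(FileStorage from, FileStorage to, Map<String, String> fileNamesById) {
        for (Map.Entry<String, String> e : fileNamesById.entrySet()) {
            to.createFile("root", e.getValue(), from.getFile(e.getKey()));
        }
    }
}
```

Because `migrate` depends only on the interface, the same routine would serve when retiring one storage service in favour of another, provided connectors for both exist.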

Deployment and runtime characteristics of the framework:
• New connectors can be added to a platform instance at runtime.
• Robustness, reliability and performance of the framework itself are easily addressed, as it takes advantage of the high standard of enterprise technologies, tools and environments.

Limitations of the framework:
• The framework, and especially the way it communicates with user-facing client applications, is not well suited for highly interactive applications (e.g. instant messaging).
• The framework is completely API-oriented and provides neither a user interface nor support for creating one. This is left entirely to client applications, which has some undesirable consequences. For example, providing users’ credentials to the platform through client applications is a potential source of security issues.

6.1 Benchmarking

This section deals with measuring the performance of the proposed framework and deciding whether it satisfies the P1 requirement imposed on it: 30 requests per second on a commodity desktop PC. Unfortunately, this is not a straightforward task, firstly because of the vague specification of the hardware configuration to be used, and secondly because multiple parties are involved in processing any operation realized using the proposed framework (see Figures 4.7 and 4.9). If we consider a typical synchronous operation involving interaction with a remote Web 2.0 service API, and omit optional actions such as notifications or retrieving the user’s credentials from the platform core, the following activities have to be undertaken by the respective parties:
• client application:
»» creating and sending out a request;
»» receiving and processing a response.
• platform core:
»» accepting the request from the client application and routing it to the appropriate connector;
»» accepting the response from the connector and sending it back to the client application.
• connector:
»» accepting the request from the platform core;
»» processing the request by interacting with an underlying Web 2.0 service through its API;
»» formulating an answer and sending it back to the platform core.
• Web 2.0 service:
»» accepting, processing and answering the request sent by the connector.
• network:
»» transporting requests and responses between the above-mentioned parties, possibly with the exception of the communication between the platform core and the connector, which can be realized in a more efficient, native way.

The overall processing time of a typical operation is therefore the sum of the processing times of all these activities. To measure the performance of the framework itself, we need to eliminate external influences, especially processing in client applications, processing in connectors, processing in remote Web 2.0 services, and the qualities of the network connection. At the same time, the testing scenario should not omit any activity involved in real operation invocations, so as to stay as close to them as possible. Taking these criteria into account, two testing scenarios were developed:
• In the first, a special connector does no request processing and answers all incoming requests immediately with a short fixed response.
• In the second, another special connector still does no request processing, but waits for a simple HTTP request to a remote server to be answered before sending back a short fixed response. The HTTP server runs on a different machine, interconnected by a 100 Mbps Ethernet connection.

The first scenario simulates the situation when a connector is able to answer a request on its own, either because its cached data are sufficient for answering or because it implements the functionality itself. The second scenario extends the first one by communication with an ideal remote service, connected through a very fast network and responding very quickly. In both cases, the client application was simulated by Apache JMeter, a specialised open-source tool for load testing and measuring the performance of web applications, running on the same machine as the platform so as not to introduce any other network communication. It was configured to send short fixed requests targeting the appropriate connector and to measure the time elapsed between sending a request and receiving a response. Thirty threads were used to simulate the load originated by multiple concurrent users, each generating a continuous sequence of 50 requests. Raising either of these numbers did not change the results.
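The load pattern just described (a fixed number of concurrent clients, each issuing a fixed sequence of requests, with the elapsed time measured overall) can be sketched as follows. The request itself is stubbed out and all names are illustrative; the real measurements were taken with JMeter, so this only illustrates the scheme:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch of the benchmarking load pattern: N concurrent
 *  clients, each issuing M synchronous requests in sequence. */
public class LoadDriver {
    /** Sends threads * requestsPerThread requests and returns throughput (req/s). */
    public static double measureThroughput(int threads, int requestsPerThread,
                                           Runnable request) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    request.run();               // one synchronous request/response
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        return completed.get() / seconds;        // requests per second
    }
}
```

With `request` replaced by an actual HTTP call to the platform, this reproduces the thirty-threads-times-fifty-requests setup used in the measurements.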

To deal with the “commodity desktop PC” assumption correctly, the platform was installed on three desktop PCs with different hardware characteristics and the measurements were performed ten times on each of them. The overall results, presented in Table 6.1, are calculated as a truncated arithmetic mean of all the results measured on the particular configuration (the two highest and the two lowest results were discarded).

Configuration of the PC running the platform                    | Throughput in first scenario (requests/s) | Throughput in second scenario (requests/s)
Intel Pentium 4 3GHz (single core), 2GB DDR, Win 7              | 37                                        | 22
AMD Athlon II X2 240 2.8GHz (dual core), 1.75GB DDR2, Win XP    | 99                                        | 36
AMD Athlon II X3 455 3.3GHz (triple core), 4GB DDR2, Win 7 x64  | 193                                       | 40

Table 6.1: Attainable throughput of the framework on various hardware configurations
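As a worked example, the truncated arithmetic mean used for Table 6.1 (ten runs, the two highest and the two lowest discarded) can be computed as follows; the class name is illustrative:

```java
import java.util.Arrays;

/** Truncated arithmetic mean: sort the measurements and discard a fixed
 *  number of the lowest and highest values before averaging. */
public class TruncatedMean {
    public static double truncatedMean(double[] samples, int trimEachSide) {
        if (samples.length <= 2 * trimEachSide) {
            throw new IllegalArgumentException("Not enough samples to trim");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);                       // order the measurements
        double sum = 0;
        for (int i = trimEachSide; i < sorted.length - trimEachSide; i++) {
            sum += sorted[i];                      // keep only the middle values
        }
        return sum / (sorted.length - 2 * trimEachSide);
    }
}
```

For ten measurements with `trimEachSide = 2`, this averages the middle six values, which makes the reported figures robust against occasional outlier runs.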

The results obtained using the first testing scenario reveal that the performance of the framework itself is sufficient for the purpose it was designed for, i.e. serving dozens of concurrent users, without any special hardware requirements or performance tuning. Even a six-year-old PC with a single-core microprocessor is able to beat the limit of 30 requests per second. The second testing scenario clearly shows that using even a very fast remote HTTP-based service seriously affects the overall throughput. This of course gets even worse when a connector communicates with a real remote Web 2.0 service API over the Internet. To illustrate this, the performance of three Web 2.0 service APIs was monitored from two platform instances with fast, low-latency Internet connections over a period of one month. The results, summarized in Table 6.2, definitely cannot be generalized or applied to other service APIs, but they still give a rough idea of what performance level can be expected when using Web 2.0 service APIs.

Service         | Operation         | Average duration (ms)
BibSonomy       | CreatePost        | 1055
                | GetPosts          | 644
                | DeletePost        | 523
Google Calendar | CreateCalendar    | 6760
                | GetAllCalendars   | 1704
                | DeleteCalendar    | 1156
                | CreateEvent       | 1816
                | GetAllEvents      | 510
                | QueryEvents       | 608
                | DeleteEvent       | 610
Digitalbucket   | CreateFolder      | 425
                | GetFolder         | 535
                | GetRootFolder     | 681
                | ShareFolder       | 498
                | DeleteFolder      | 686
                | PutFile           | 1096
                | GetFile           | 779
                | DeleteFile        | 898
                | AddPermission     | 458
                | RemovePermission  | 388

Table 6.2: Performance of selected Web 2.0 service APIs

7. Conclusions

The research question driving this thesis was whether it is feasible to apply enterprise-level SOA-related technologies such as SOAP, ESB and BPEL to Web 2.0 service integration and, if so, what the potential benefits and limitations of this approach are.

To build a theoretical background for answering the research question, the fields of service orientation, SOA and Web 2.0 were studied thoroughly. The theoretical statements regarding Web 2.0 principles were further supplemented with findings obtained by analysing real Web 2.0 services. In answering the research question, the main objective of this thesis was to propose and evaluate a service-integration framework that would allow us to assess the approach in question. Such a framework was designed, implemented using SOA-related technologies, used in a case study and assessed. This allows us to answer the research question competently.

Most importantly, but not very surprisingly, the answer to the first part of the research question is YES. There are no unsolvable principal or technological problems involved in connecting Web 2.0 services to an ESB using their APIs. And once they are connected, they can be treated like any other service available on the bus.

The answer to the second part of the research question is a more complicated, but probably also more interesting, outcome of this thesis. Some of the benefits are generalizations of the benefits of the proposed framework enumerated in the previous chapter, because the framework allows their potential to be exploited. In contrast, the list of limitations has nothing in common with the limitations of the framework. The general limitations will, of course, constrain any application built upon the framework, but this cannot be considered a limitation of the framework itself.

Potential benefits of using SOA-related enterprise technologies for integrating Web 2.0 services on the API level:
• It allows modifying or enhancing the functionality offered by a Web 2.0 service.
• It allows running whole series of service operations on behalf of a user.
• It allows combining the data stored in multiple Web 2.0 services to create new value.
• It allows orchestrating the functionality offered by multiple Web 2.0 services to create new value.
• It allows using Web 2.0 services to provide data or functionality in enterprise software environments.
• As a special case worth noting, it allows Web 2.0 services to take part in (possibly long-running) business processes.
• Enterprise technologies, tools and environments typically provide a high level of robustness, reliability and performance.

Design issues one has to deal with when using the approach in question:
• Technology is not an obstacle to integrating Web 2.0 services this way. Limitations often stem instead from the services’ Terms of Service, which may or may not allow using the particular service for the intended purpose and in the intended way.
• There is no way to assure true transactionality, namely the atomicity, consistency and isolation properties. Without the support of all parties participating in a distributed transaction, the best achievable approximation is the use of compensating actions, undoing the original actions when necessary.

• One has to accept the fact that users can bypass the service integration platform and access the data through the underlying service’s GUI. This makes restrictive actions in particular harder to realize, as they have to be enforced directly by the underlying service, not as an extension provided by the platform.
• The graphical user interface, a key aspect of many Web 2.0 services, is not utilised. User-facing parts of the application, if there are any, have to provide their own. However, there are also Web 2.0 services that have no user interface at all, and, what is more, nothing prevents users from also using the services’ web interfaces when beneficial.
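The compensating-action technique mentioned among the design issues can be sketched as follows: every successfully completed service operation registers an undo step, and a failure triggers the registered steps in reverse order. All names here are hypothetical; real connectors would substitute actual API calls:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative sketch of compensation-based "transactions" across Web 2.0
 *  services: completed actions register undo steps, run in reverse on failure. */
public class CompensationScope {
    private final Deque<Runnable> compensations = new ArrayDeque<>();

    /** Runs an action and remembers how to undo it if a later step fails. */
    public void perform(Runnable action, Runnable compensation) {
        try {
            action.run();
            compensations.push(compensation);    // registered only on success
        } catch (RuntimeException e) {
            compensateAll();                     // undo earlier steps, then re-throw
            throw e;
        }
    }

    /** Executes all registered compensations, newest first. */
    public void compensateAll() {
        while (!compensations.isEmpty()) {
            compensations.pop().run();           // best effort: a real implementation
                                                 // would also handle failing undo steps
        }
    }
}
```

With connectors for, say, a bookmarking and a calendar service, the action/compensation pairs would be the services’ create and delete operations. Note that this only approximates atomicity: other users may observe the intermediate state before compensation runs, which is exactly the missing isolation discussed above.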

Runtime issues common to the approach in question:
• Uncertain reliability. The reliability of any service-based system is inevitably limited by the reliability of the participating services and their interconnections. In our case, the overall reliability depends on the reliability of the Web 2.0 services and of the Internet connection. Although these aspects are in most cases beyond our direct control, redundant Internet connections and Web 2.0 services with guaranteed QoS can be employed to tackle this issue.
• Uncertain performance. Such an observation definitely cannot be generalized, but the services used in the case study provided no explicit QoS obligations and the performance of their APIs was not good. This is important especially for the total processing time of high-level operations whose realization includes orchestrated invocations of multiple API calls. What is more, not all Web 2.0 service APIs are designed for efficient communication, and many fine-grained API calls may be necessary even to realize straightforward operations, which seriously affects the performance of such operations.

Limitations of the approach in question for end users:
• End users have to create accounts in all the Web 2.0 services in use and provide their credentials to the platform, allowing it to act on their behalf. This can be mitigated to some extent by using services run by the same provider, which can often be accessed using the same credentials, and possibly also by using services that allow one account to be shared by multiple users. As an extreme case, a single enterprise-wide service account could be shared by all platform users, but this would seriously limit the potential benefits of the service for individual users.
• The impression of being a member of a community, a key feature of many Web 2.0 services, is suppressed. However, community-related functionality, such as content searching based on users’ tags, can still be available, and users can still access the services through their web interfaces if they want to.

To conclude, because of the limitations described here, most notably the lack of transactionality and the uncertain performance and reliability, current Web 2.0 services and their APIs cannot generally be recommended as part of the basic infrastructure of an enterprise system. Rather, they can complement the basic infrastructure to extend its functionality. This rule, too, has its exceptions: specific enterprise-targeted Web 2.0 services exist that are recommendable even for critical parts of an enterprise system. Typical examples of such reliable services are payment services and infrastructural services.

This thesis examined the integration of Web 2.0 services using their APIs. At the same time, enterprise mashups can be used for integrating services on the presentation layer. Both approaches make sense and can provide value, each being suitable for specific purposes and conditions, and each having its specific benefits and limitations. Probably the most challenging direction in which the work described in this thesis could be taken forward is the seamless combination and cooperation of these two approaches, to get the best of both.

Bibliography

Adobe Systems, Inc. 2010. Flash Player penetration [online], March 2010 [cit. 2010-06-02]. Available from .
ALLEN, Paul. 2006. Service orientation: winning strategies and best practices. Cambridge: Cambridge University Press, 2006. 336p. ISBN 0-521-84336-7.
ANDERSON, Paul. 2007. What is Web 2.0? Ideas, technologies and implications for education [online]. Bristol: JISC, February 2007 [cit. 2010-05-31]. 64p. JISC Technology and Standards Watch report. Available from .
ARLOW, Jim – NEUSTADT, Ila. 2005. UML 2 and the Unified Process: Practical Object-Oriented Analysis and Design. 2nd ed. Upper Saddle River: Addison-Wesley, 2005. 624p. ISBN 0-321-32127-8.
ARSANJANI, Ali. 2004. Service-oriented modeling and architecture: How to identify, specify, and realize services for your SOA [online]. IBM: IBM developerWorks, 2004-11-09 [cit. 2010-05-10]. Available from .
BAKKER, Loek. 2005. Goodbye Hub-and-Spoke, Hello ESB?: Integration Architecture With BizTalk 2004. In .NET Developer’s Journal [online]. SYS-CON Media, 2005-09-12 [cit. 2010-04-12]. Available from .
BARROS, Alistair – DUMAS, Marlon – OAKS, Phillipa. 2006. Standards for Web Service Choreography and Orchestration: Status and Perspectives. In BUSSLER, Christoph and HALLER, Armin (eds.). Business Process Management Workshops. Berlin: Springer, 2006, p. 61–74. Lecture notes in computer science, 3812/2006, ISSN 0302-9743. ISBN 978-3-540-32595-6.
BEISEIGEL, Michael – BOOZ, Dave – EDWARDS, Mike et al. 2007. Software components: Coarse-grained versus fine-grained [online]. IBM: IBM developerWorks, 2007-12-06 [cit. 2010-03-15]. Available from .
BELL, Michael. 2008. Service-Oriented Modeling: Service Analysis, Design, and Architecture. USA: Wiley, 2008. 384p. ISBN 978-0-470-14111-3.
BERNERS-LEE, Tim – CAILLIAU, Robert. 1990. WorldWideWeb: Proposal for a HyperText Project [online], November 1990 [cit. 2009-09-30]. Available from .
BERNSTEIN, Philip A. – HADZILACOS, Vassos – GOODMAN, Nathan. 1987. Concurrency control and recovery in database systems. Reading: Addison-Wesley, 1987. 370p. ISBN 0-201-10715-5.
BHIRI, Sami – GAALOUL, Walid – ROUACHED, Mohsen et al. 2009. Semantic Web Services for Satisfying SOA Requirements. In DILLON, Tharam S. et al. (eds.). Advances in Web Semantics I: Ontologies, Web Services and Applied Semantic Web. Berlin: Springer, 2009, p. 374–395. Lecture notes in computer science, 4891/2009, ISSN 0302-9743. ISBN 978-3-540-89783-5.
BOOCH, Grady. 2004. IBM’s Grady Booch on solving complexity: Big Blue Fellow talks about grid computing, the future of middleware, and working with IBM Research. InfoWorld [online]. 2004 [cit. 2010-03-01]. Available from . ISSN 0199-6649.

BPMI. 2004. Business Process Modeling Notation (BPMN) [online]. Version 1.0. 2004-05-03 [cit. 2010-05-12]. 296p. Available from .
BRADLEY, Neil. 2006. The pros and cons of XML: When and how should organisations adopt XML as a document storage and publishing format? Enterprise Information magazine, 2006, vol. 2, no. 10. ISSN 1746-5362.
BRAY, Tim. 2006. OSCON – Open Data [online]. 2006-07-28 [cit. 2010-01-11]. Available from .
BRODSKY, Stephen A. 2006. SDO: Service Data Objects: An Overview [online]. SDO Collaboration, 2006 [cit. 2010-04-08]. Available from .
BROWN, Alan – JOHNSTON, Simon – KELLY, Kevin. 2002. Using service-oriented architecture and component-based development to build web service applications [online]. 2002 [cit. 2011-01-06]. Rational Software. 16p. Available from .
BUGHIN, Jacques – CHUI, Michael. 2010. The rise of the networked enterprise: Web 2.0 finds its payday [online]. McKinsey: McKinsey Quarterly, December 2010 [cit. 2011-01-17]. Available from .
BUSCHMANN, Frank – MEUNIER, Regine – ROHNERT, Hans et al. 1996. Pattern-Oriented Software Architecture Volume 1: A System of Patterns. New York: Wiley, 1996. 476p. ISBN 0-471-95869-7.
BUSCHMANN, Frank – HENNEY, Kevlin – SCHMIDT, Douglas C. 2007. Pattern-Oriented Software Architecture Volume 4: A Pattern Language for Distributed Computing. Chichester: Wiley, 2007. 636p. ISBN 978-0-470-05902-9.
BUXTON, John N. – RANDELL, Brian. 1969. Software Engineering Techniques: Report of a conference sponsored by the NATO Science Committee, Rome, Italy, 27–31 Oct. 1969. Brussels: NATO Science Committee, 1970. 164p.
CARDOSO, Jorge – VOIGT, Konrad – WINKLER, Matthias. 2009. Service Engineering for the Internet of Services. In FILIPE, Joaquim and CORDEIRO, José (eds.). Enterprise Information Systems, 10th International Conference ICEIS 2008, Revised Selected Papers. Berlin: Springer, 2009, p. 15–27. Lecture Notes in Business Information, vol. 19, ISSN 1865-1348. ISBN 978-3-642-00669-2.
CASTLE, Bryan. 2005. Introduction to Web Services for Remote Portlets: Use WSRP in a Service-Oriented Architecture [online]. IBM: IBM developerWorks, 2005-04-15 [cit. 2010-05-05]. Available from .
CHAPPELL, David A. 2004. Enterprise Service Bus: Theory in Practice. Sebastopol: O’Reilly Media, 2004. 352p. ISBN 978-0-596-00675-4.
CHEESMAN, John – DANIELS, John. 2001. UML Components: A Simple Process for Specifying Component-Based Software. Boston: Addison-Wesley, 2001. 208p. ISBN 0-201-70851-5.
CLEMENTS, Paul – BACHMANN, Felix – BASS, Len et al. 2002. Documenting Software Architectures: Views and Beyond. Boston: Addison-Wesley, 2002. 560p. ISBN 0-201-70372-6.
Creative Commons. 2007. Creative Commons [online]. 2007 [cit. 2011-01-03]. Creative Commons Licenses. Available from .

CUMMINS, Fred A. 2002. Enterprise Integration: An Architecture for Enterprise Application and Systems Integration. USA: Wiley, 2002. 496p. ISBN 0-471-40010-6.
Dante Consulting, Inc. 2005. SOA, Web Services and Enterprise Application Integration [online]. June 2005 [cit. 2010-04-14]. 5p. Available from .
DeMARCO, Tom. 1979. Structured Analysis and System Specification. Upper Saddle River: Prentice Hall, 1979. 352p. ISBN 0-13-854380-1.
DERNTL, Michael. 2005. Patterns for Person-Centered e-Learning: Dissertation. Wien: Universität Wien, 2005. 493p. Available from .
DERNTL, Michael – MOTSCHNIG-PITRIK, Renate. 2007. Inclusive Universal Access in Engineering Education. In Proceedings of the 37th ASEE/IEEE Frontiers in Education Conference. Milwaukee: IEEE, 2007. p. F4C-1–F4C-6. ISBN 978-1-4244-1084-3.
DiNUCCI, Darcy. 1999. Fragmented Future. Print, 1999, vol. 53, no. 4. Available from .
DRÁŠIL, Pavel – PITNER, Tomáš – HAMPEL, Thorsten et al. 2008. Get ready for mashability!: Concepts for Web 2.0 service integration. In Proceedings of the 10th International Conference on Enterprise Information Systems, Vol. SAIC. Barcelona: INSTICC Press, 2008. p. 160–167. ISBN 978-989-8111-39-5.
DRÁŠIL, Pavel – PITNER, Tomáš. 2009. Building complex systems on top of Web 2.0: Integration of Web 2.0 services using Enterprise Service Bus. In Proceedings of the 4th International Conference on Software and Data Technologies, Vol. 2. Sofia: INSTICC Press, 2009. p. 179–182. ISBN 978-989-674-010-8.
ELFATATRY, Ahmed. 2007. Dealing with Change: Components versus Services. Communications of the ACM, August 2007, vol. 50, no. 8, p. 35–39. ISSN 0001-0782.
ELSSAMADISY, Amr. 2007. SOA and Agile: Friends or Foes? [online]. InfoQ, 2007-04-14 [cit. 2011-01-08]. Available from .
ERL, Thomas. 2005. Service-Oriented Architecture (SOA): Concepts, Technology, and Design. Upper Saddle River: Prentice Hall, 2005. 792p. ISBN 0-13-185858-0.
ERL, Thomas. 2007. SOA: Principles of Service Design. Upper Saddle River: Prentice Hall, 2007. 608p. ISBN 0-13-234482-3.
ERL, Thomas. 2009. SOA Design Patterns. Upper Saddle River: Prentice Hall, 2009. 800p. ISBN 0-13-613516-1.
FARLEY, Jim. 1998. Java Distributed Computing. USA: O’Reilly Media, 1998. 392p. ISBN 1-56592-206-9.
FEUERLICHT, George – LOZINA, Josip. 2007. Understanding Service Reusability. In Proceedings of the 15th International Conference on Systems Integration. Praha: VŠE, June 2007. p. 144–150. ISBN 978-80-245-1196-2.
FIELDING, Roy Thomas. 2000. Architectural Styles and the Design of Network-based Software Architectures: Dissertation. Irvine: University of California, 2000. 162p. Available from .

First Monday. 2008. Vol. 13, no. 3 (March 2008). Special issue: Critical Perspectives on Web 2.0, edited by Michael Zimmer. ISSN 1396-0466.
FOWLER, Martin. 2003. Who Needs an Architect? IEEE Software, September–October 2003, vol. 20, no. 5, p. 11–13. ISSN 0740-7459.
GALL, Nicholas – SHOLLER, Daniel – BRADLEY, Anthony. 2008. Tutorial: Web-Oriented Architecture: Putting the Web Back in Web Services, ID Number G00162022, November 2008. Gartner Research. 13p.
GARLAN, David – SHAW, Mary. 1993. An Introduction to Software Architecture. In AMBRIOLA, V. and TORTORA, G. (eds.). Advances in Software Engineering and Knowledge Engineering, Vol. 2. New Jersey: World Scientific Publishing, 1993, p. 1–39. ISBN 981-02-1594-0.
Gartner, Inc. 2008. Gartner Says the Number of Organizations Planning to Adopt SOA for the First Time Is Falling Dramatically [online]. November 2008 [cit. 2009-10-02]. Available from .
Gartner, Inc. 2009. Gartner Says SOA Is Evolving Beyond Its Traditional Roots [online]. April 2009 [cit. 2009-10-02]. Available from .
GEELAN, Jeremy. 2007. i-Technology Predictions for 2007: Where’s It All Headed? In Web 2.0 Journal [online]. SYS-CON Media, 2007-01-01 [cit. 2011-01-13]. Available from .
Google, Inc. 2010. Google Data Protocol [online]. c2010 [cit. 2010-06-14]. Available from .
Google, Inc. 2011. Google Maps API Family [online]. c2011 [cit. 2011-01-15]. Available from .
GORTON, Ian. 2006. Essential Software Architecture. Berlin: Springer, 2006. 283p. ISBN 3-540-28713-2.
GRADY, Robert B. 1992. Practical Software Metrics for Project Management and Process Improvement. Englewood Cliffs, NJ: Prentice-Hall, 1992. 282p. ISBN 0-13-720384-5.
GREIF, Irene – SARIN, Sunil. 1987. Data sharing in group work. ACM Transactions on Information Systems (TOIS), April 1987, vol. 5, no. 2, p. 187–211. ISSN 1046-8188.
HINCHCLIFFE, Dion. 2005. Is Web 2.0 The Global SOA? [online]. October 2005 [cit. 2009-09-30]. Available from .
HINCHCLIFFE, Dion. 2009a. Using standards to drive success with enterprise mashups [online]. May 2009 [cit. 2011-01-14]. White paper, sponsored by IBM. Available from .
HINCHCLIFFE, Dion. 2009b. Unboxing Web-Oriented Architecture: The 6 Aspects Of An Emergent Architectural Style [online]. 2009-06-06 [cit. 2010-06-03]. Available from .
HOHPE, Gregor – WOOLF, Bobby. 2004. Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions. Boston: Addison-Wesley, 2004. 736p. ISBN 0-321-20068-3.
HOYER, Volker – FISCHER, Marco. 2008a. Market Overview of Enterprise Mashup Tools. In Proceedings of the 6th International Conference on Service-Oriented Computing. Berlin: Springer, 2008, p. 708–721. Lecture notes in computer science, vol. 5364, ISSN 0302-9743. ISBN 978-3-540-89647-0.

HOYER, Volker – STANOEVSKA-SLABEVA, Katarina – JANNER, Till et al. 2008b. Enterprise Mashups: Design Principles towards the Long Tail of User Needs. In Proceedings of the 2008 IEEE International Conference on Services Computing, Vol. 2. Honolulu: IEEE Computer Society, 2008. p. 601–602. ISBN 978-0-7695-3283-7.
HUBERT, Richard. 2001. Convergent Architecture: Building Model Driven J2EE Systems with UML. USA: Wiley, 2001. 304p. ISBN 0-471-10560-0.
IBM Corp. and BEA Systems. 2003. Service Data Objects [online]. BEATTY, John – BRODSKY, Stephen – ELLERSICK, Raymond et al. Version 1.0, November 2003 [cit. 2010-04-08]. Available from .
IEEE Std 1471-2000. IEEE Recommended Practice for Architectural Description of Software-Intensive Systems. New York, USA: IEEE, 2000. 29p. ISBN 0-7381-2518-0.
innoQ. 2007. Web Services Standards Overview [online]. Version 3.0, February 2007 [cit. 2010-04-08]. Available from .
ISO/IEC 15948:2004. Information technology — Computer graphics and image processing — Portable Network Graphics (PNG): Functional specification. Geneva: ISO, 2004. 80p.
JANNER, Till – SIEBECK, Robert – SCHROTH, Christoph et al. 2009. Patterns for Enterprise Mashups in B2B Collaborations to foster Lightweight Composition and End User Development. In Proceedings of the 2009 IEEE International Conference on Web Services. IEEE Computer Society, 2009, p. 976–983. ISBN 978-0-7695-3709-2.
JOHANNESSON, Paul – PERJONS, Erik. 2000. Design principles of application integration. In WANGLER, B. and BERGMAN, L. (eds.). Proceedings of the 12th International Conference on Advanced Information Systems Engineering. Berlin: Springer, 2000, p. 212–231. Lecture notes in computer science, 1789/2000, ISSN 0302-9743. ISBN 3-540-67630-9.
JOHNSON, Bobbie. 2008. Cloud computing is a trap, warns GNU founder Richard Stallman. The Guardian [online]. 2008-09-29 [cit. 2011-01-12]. Available from .
JSON-RPC 1.0 Specification [online]. 2005 [cit. 2010-03-03]. Available from .
JSR 168. 2003. Java Portlet Specification. Version 1.0, October 2003. Available from .
JSR 208. 2005. Java Business Integration (JBI) 1.0. 2005. Available from .
JSR 223. 2006. Scripting for the Java Platform. Version 1.0, 2006. Available from .
JSR 286. 2008. Java Portlet Specification. Version 2.0, January 2008. Available from .
JSR 311. 2009. Java API for RESTful Web Services. Maintenance release, November 2009. Available from .

JU, Jungmin. 2006. State-of-the-art of Standards in Business Process Modeling and Execution. Pohang, 2006. M.Sc. thesis. Pohang University of Science and Technology, Division of Mechanical and Industrial Engineering. Available from .
KOGUT, Paul – CLEMENTS, Paul. 1994. The software architecture renaissance [online]. The Software Engineering Institute, Carnegie Mellon University, November 1994 [cit. 2011-01-05]. Available from .
KOTLER, Philip. 1988. Marketing Management: Analysis, Planning, Implementation and Control. 6th ed. Englewood Cliffs, NJ: Prentice Hall, 1988. 776p. ISBN 0-13-556267-8.
KRAKOWIAK, Sacha. 2009. Middleware Architecture with Patterns and Frameworks [online]. Updated 2009-02-27 [cit. 2010-03-09]. 427p. Available from .
KRAUNELIS, Leo – SCHMELZER, Ronald – BLOOMBERG, Jason. 2002. Using Web Services for Integration: A Darwin Partners and ZapThink Insight [online]. 2002 [cit. 2010-04-14]. 4p. Available from .
KRÁL, Jaroslav – ŽEMLIČKA, Michal. 2000. System integration with autonomous components. In POUR, J. and VAŠÍČEK, J. (eds.). Proceedings of the conference Systems Integration, Prague, June 12–13, 2000. Prague: Prague University of Economics, 2000. p. 499–506. ISBN 80-245-0041-8.
KRÁL, Jaroslav – ŽEMLIČKA, Michal. 2005. Architecture, specification and design of service-oriented systems. In STOJANOVIC, Z. and DAHANAYAKE, A. (eds.). Service-Oriented Software System Engineering: Challenges and Practices. Hershey, USA: Idea Group Publishing, 2005. p. 182–200. ISBN 1-59140-426-6.
KRILL, Paul. 2005. Microsoft, IBM, SAP discontinue UDDI registry effort. InfoWorld [online]. 2005-12-16 [cit. 2010-04-06]. Available from .
KRUCHTEN, Philippe. 1995. The 4+1 View Model of Architecture. IEEE Software, 1995, vol. 12, no. 6, p. 42–50. ISSN 0740-7459.
KRUCHTEN, Philippe – OBBINK, Henk – STAFFORD, Judith. 2006. The Past, Present, and Future of Software Architecture. IEEE Software, March–April 2006, vol. 23, no. 2, p. 22–30. ISSN 0740-7459.
LANINGHAM, Scott. 2006. developerWorks Interviews: Tim Berners-Lee [online]. IBM: IBM developerWorks, 2006-08-22 [cit. 2010-05-24]. Available from .
LAWSON, Loraine. 2008. Why WOA vs. SOA Doesn’t Matter [online]. IT Business Edge, 2008-09-10 [cit. 2010-06-02]. Interview with Nicholas Gall. Available from .
LENNON, Joe. 2009. Implementing Enterprise 2.0 [online]. IBM: IBM developerWorks, 2009-02-17 [cit. 2011-01-14]. Available from .
LINFO. 2004. The Linux Information Project [online]. 2004, last modif. 2006-08-23 [cit. 2010-03-15]. Pipes: A Brief Introduction. Available from .
LINTHICUM, David S. 2000. Enterprise Application Integration. Reading: Addison-Wesley, 2000. 400p. ISBN 0-201-61583-5.

LUBLINSKY, Boris. 2007. Defining SOA as an architectural style: Align your business model with technology [online]. IBM: IBM developerWorks, 2007-01-09 [cit. 2010-04-14]. Available from .
MAJITHIA, Shalil – SHIELDS, Matthew – TAYLOR, Ian et al. 2004. Triana: A Graphical Web Service Composition and Execution Toolkit. In Proceedings of the IEEE International Conference on Web Services (ICWS’04). San Diego: IEEE Computer Society, 2004. p. 514–521. ISBN 0-7695-2167-3.
McAFEE, Andrew P. 2006. Enterprise 2.0: The Dawn of Emergent Collaboration. MIT Sloan Management Review, Spring 2006, vol. 47, no. 3, p. 21–28. ISSN 1532-9194.
McAfee. 2010. Web 2.0: A Complex Balancing Act [online]. c2010. Available from .
McILROY, Douglas M. 1968. Mass produced software components. In NAUR, P. and RANDELL, B. (eds.). Proceedings of the 1st International Conference on Software Engineering. Garmisch: NATO Science Committee, 1968. p. 138–155.
McKENDRICK, Joe. 2005. Here’s an SOA definition we can live with [online]. December 2005 [cit. 2010-03-22]. Available from .
MENDLING, Jan – HAFNER, Michael. 2008. From WS-CDL Choreography to BPEL Process Orchestration. Journal of Enterprise Information Management, 2008, vol. 21, no. 5, p. 525–542. ISSN 1741-0398.
MERRILL, Duane. 2006. Mashups: The new breed of Web app [online]. IBM: IBM developerWorks, 2006-08-08 [cit. 2010-06-02]. Available from .
MITRA, Tilak. 2005. Business-driven development [online]. IBM: IBM developerWorks, 2005-12-09 [cit. 2011-01-08]. Available from .
MORGENTHAL, J.P. – LA FORGE, Bill. 2001. Enterprise Application Integration with XML and Java. Upper Saddle River: Prentice Hall, 2001. 504p. ISBN 0-13-085135-3.
MUSSER, John – O’REILLY, Tim. 2006. Web 2.0 Principles and Best Practices. Sebastopol: O’Reilly Media, 2006. 101p. O’Reilly Radar Report. ISBN 0-596-52769-1.
MUSSER, John. 2007. 12 Ways to limit an API [online]. 2007-04-02 [cit. 2010-12-23]. Available from .
NATIS, Yefim V. 2003. Service-Oriented Architecture Scenario, ID Number AV-19-6751, April 2003. Gartner Research. 6p.
NIELSEN, Jakob. 2009. Social Networking on Intranets. Alertbox, 2009-08-03. ISSN 1548-5552.
NOVAK, Jasminko – VOIGT, Benjamin J.J. 2007. Mashups: Strukturelle Eigenschaften und Herausforderungen von End-User Development im Web 2.0. i-com, May 2007, vol. 6, no. 1, p. 19–24. ISSN 1618-162X.
OASIS. 2003. Web Services for Remote Portlets Specification [online]. Version 1.0, August 2003 [cit. 2010-05-03]. 86p. Available from .
OASIS. 2004. UDDI Version 3.0.2 [online]. October 2004 [cit. 2010-04-06]. Available from .

OASIS. 2006. Reference Model for Service Oriented Architecture 1.0 [online]. October 2006 [cit. 2010-03-03]. 31p. Available from .
OASIS. 2007. Web Services Business Process Execution Language Version 2.0 [online]. April 2007 [cit. 2011-01-08]. 246p. Available from .
OGC. 2008. KML 2.2.0 [online]. April 2008 [cit. 2010-06-14]. 251p. Available from .
OMG. 2009. Service oriented architecture Modeling Language (SoaML) – Specification for the UML Profile and Metamodel for Services (UPMS) [online]. Version 1.0 - Beta 2. 2009-12-10 [cit. 2010-05-12]. 169p. Available from .
Open SOA Collaboration. 2007. Service Component Architecture: Assembly Model Specification. Version 1.0. March 2007 [cit. 2010-04-29]. 91p. Available from .
OpenID Foundation. 2007. OpenID Authentication 2.0 Final [online]. 2007-12-05 [cit. 2011-01-03]. Available from .
Oracle. 2010. Java SE Documentation [online]. c2010 [cit. 2010-12-06]. Java™ Authentication and Authorization Service (JAAS) Reference Guide. Available from .
O’REILLY, Tim. 2005. What is Web 2.0: Design Patterns and Business Models for the Next Generation of Software [online]. September 2005 [cit. 2009-09-30]. Available from .
O’REILLY, Tim. 2006. Web 2.0 Compact Definition: Trying Again [online]. December 2006 [cit. 2010-05-24]. Available from .
ORT, Ed – MEHTA, Bhakti. 2003. Java Architecture for XML Binding (JAXB) [online]. Oracle: Sun Developer Network, March 2003 [cit. 2010-04-07]. Available from .
ORT, Ed – BRYDON, Sean – BASLER, Mark. 2007. Mashup Styles, Part 1: Server-Side Mashups [online]. Oracle: Sun Developer Network, May 2007 [cit. 2010-06-02]. Available from .
PAPAZOGLOU, Michael P. – TRAVERSO, Paolo – DUSTDAR, Schahram et al. 2007. Service-Oriented Computing: State of the Art and Research Challenges. IEEE Computer, November 2007, vol. 40, no. 11, p.64–71. ISSN 0018-9162.
PAPAZOGLOU, Michael P. 2008. Web services: Principles and technology. Harlow: Prentice Hall, 2008. 784p. ISBN 978-0-321-15555-9.
PERRY, Dewayne E. – WOLF, Alexander L. 1992. Foundations for the Study of Software Architecture. ACM SIGSOFT Software Engineering Notes, October 1992, vol. 17, no. 4, p.40–52. ISSN 0163-5948.
PITNER, Tomáš – DERNTL, Michael – HAMPEL, Thorsten et al. 2007. Web 2.0 as Platform for Inclusive Universal Access in Cooperative Learning and Knowledge Sharing. Journal of Universal Computer Science, 2007, vol. 13, no. 1, p.49–56. ISSN 0948-695X.
PITNER, Tomáš – DRÁŠIL, Pavel – HINCA, Martin. 2009. Web 2.0. In Proceedings of the Annual Database Conference DATAKON 2009. Praha: Oeconomica, 2009. p.195–218. ISBN 978-80-245-1568-7.

PIZETTE, Larry – SEMY, Salim – RAINES, Geoffrey et al. 2009. A Perspective on Emerging Industry SOA Best Practices [online]. October 2009 [cit. 2010-05-12]. Technical paper. The MITRE Corporation. 32p. Available from .
POSHYVANYK, Denys – MARCUS, Andrian. 2006. The Conceptual Coupling Metrics for Object-Oriented Systems. In Proceedings of the 22nd IEEE International Conference on Software Maintenance. Philadelphia: IEEE Computer Society, 2006. p.469–478. ISBN 0-7695-2354-4.
PRESSMAN, Roger S. 2001. Software Engineering: A practitioner’s approach. Boston: McGraw Hill, 2001. 840p. Fifth edition. ISBN 0-07-365578-3.
RADEMAKERS, Tijs – DIRKSEN, Jos. 2009. Open Source ESBs in Action: Example implementations in Mule and ServiceMix. Greenwich: Manning Publications, 2009. ISBN 1-933988-21-5.
RFC 707. 1976. A High-Level Framework for Network-Based Resource Sharing [online]. James E. White. 1976 [cit. 2010-03-02]. 30p. Available from . ISSN 2070-1721.
RFC 1939. 1996. Post Office Protocol – Version 3 [online]. J. Myers, M. Rose. 1996 [cit. 2011-01-05]. 23p. Available from . ISSN 2070-1721.
RFC 2616. 1999. Hypertext Transfer Protocol – HTTP/1.1 [online]. R. Fielding, J. Gettys, J. Mogul et al. 1999 [cit. 2011-01-05]. 176p. Available from . ISSN 2070-1721.
RFC 2617. 1999. HTTP Authentication: Basic and Digest Access Authentication [online]. J. Franks, P. Hallam-Baker, J. Hostetler et al. 1999 [cit. 2011-01-05]. 34p. Available from . ISSN 2070-1721.
RFC 2821. 2001. Simple Mail Transfer Protocol [online]. J. Klensin (ed.). 2001 [cit. 2011-01-05]. 79p. Available from . ISSN 2070-1721.
RFC 3501. 2003. Internet Message Access Protocol, version 4rev1 [online]. M. Crispin. 2003 [cit. 2011-01-05]. 108p. Available from . ISSN 2070-1721.
RFC 3920. 2004. Extensible Messaging and Presence Protocol (XMPP): Core [online]. P. Saint-Andre (ed.). 2004 [cit. 2011-01-05]. 90p. Available from . ISSN 2070-1721.
RFC 4287. 2005. The Atom Syndication Format [online]. M. Nottingham, R. Sayre (eds.). 2005 [cit. 2011-01-05]. 43p. Available from . ISSN 2070-1721.
RFC 4627. 2006. The application/json Media Type for JavaScript Object Notation (JSON) [online]. D. Crockford. 2006 [cit. 2011-01-05]. 10p. Available from . ISSN 2070-1721.
RFC 5023. 2007. The Atom Publishing Protocol [online]. J. Gregorio, B. de hOra (eds.). 2007 [cit. 2011-01-05]. 53p. Available from . ISSN 2070-1721.
RFC 5849. 2010. The OAuth 1.0 Protocol [online]. Eran Hammer-Lahav (ed.), 2010 [cit. 2010-07-21]. 39p. Available from . ISSN 2070-1721.

ROBINSON, Rick. 2008. Enterprise Web 2.0, Part 2: Enterprise Web 2.0 solution patterns [online]. IBM: IBM developerWorks, 2008-02-07 [cit. 2011-01-14]. Available from .
ROSEN, Michael – LUBLINSKY, Boris – SMITH, Kevin T. et al. 2008. Applied SOA: Service-Oriented Architecture and Design Strategies. Indianapolis: Wiley, 2008. 696p. ISBN 978-0-470-22365-9.
ROSENBERG, Florian – DUSTDAR, Schahram. 2005. Business Rules Integration in BPEL – A Service-Oriented Approach. In Proceedings of the 7th International IEEE Conference on E-Commerce Technology (CEC’05). Munich: IEEE Computer Society, 2005. p.476–479. ISBN 0-7695-2277-7.
ROZANSKI, Nick – WOODS, Eóin. 2005. Software Systems Architecture: Working With Stakeholders Using Viewpoints and Perspectives. Upper Saddle River: Addison-Wesley, 2005. 546p. ISBN 0-321-11229-6.
RSS Advisory Board. 2009. RSS 2.0 specification [online]. Version 2.0.11. 2009-03-30 [cit. 2011-01-03]. Available from .
RUMBAUGH, James – JACOBSON, Ivar – BOOCH, Grady. 2004. The Unified Modeling Language Reference Manual, 2nd edition. Boston: Addison-Wesley, 2004. 752p. ISBN 0-321-24562-8.
SCHROTH, Christoph – JANNER, Till. 2007. Web 2.0 and SOA: Converging Concepts Enabling the Internet of Services. IT Professional, May–June 2007, vol. 9, no. 3, p.36–41. ISSN 1520-9202.
SCHULTE, Roy W. – NATIS, Yefim V. 1996. SSA Research Note SPA-401-068, “Service Oriented” Architectures, Part 1, April 1996. Technical report. The Gartner Group.
SHAW, Mary – CLEMENTS, Paul. 2006. The Golden Age of Software Architecture. IEEE Software, March–April 2006, vol. 23, no. 2, p.31–39. ISSN 0740-7459.
SHUJA, Ahmad K. – KREBS, Jochen. 2008. IBM Rational Unified Process Reference and Certification Guide. Upper Saddle River: IBM Press, 2008. 336p. ISBN 0-13-156292-4.
SOLEY, Richard. 2008. SOA – A Business Agility Strategy. In SOA World Magazine [online]. SYS-CON Media, 2008-01-17 [cit. 2010-03-22]. Available from .
SONI, Dilip – NORD, Robert L. – HOFMEISTER, Christine. 1995. Software Architecture in Industrial Applications. In Proceedings of the 17th International Conference on Software Engineering. Seattle: ACM Press, 1995. p.196–207. ISBN 0-89791-708-1.
STAMPER, Jason. 2005. “ESB Inventor” Riddle Solved? In Computer Business Review magazine [online]. 2005-08-04 [cit. 2010-04-20]. Available from .
STEPHANIDIS, Constantine – SAVIDIS, Anthony. 2001. Universal Access in the Information Society: Methods, Tools, and Interaction Technologies. Universal Access in the Information Society, 2001, vol. 1, no. 1. p.40–55. ISSN 1615-5289 (print), 1615-5297 (online).
SVOBODA, Radek. 2009. Srovnání dostupných implementací specifikace JBI [Comparison of available JBI implementations]. Brno, 2009. Diploma thesis (Mgr.). 74p. Masaryk University, Faculty of Informatics. Available from .
SZYPERSKI, Clemens – PFISTER, Cuno. 1997. Workshop on Component-Oriented Programming, Summary. In MÜHLHÄUSER, Max (ed.): Special Issues in Object-Oriented Programming: Workshop reader of the 10th European Conference on Object-oriented Programming, ECOOP ‘96, Linz, July 1996. Heidelberg: Dpunkt Verlag, 1997. p.127–130. ISBN 3-920993-67-5.

SZYPERSKI, Clemens. 1999. Component Software: Beyond Object-Oriented Programming. Harlow: Addison-Wesley, 1999. ISBN 0-201-17888-5.
TAYLOR, Richard N. – MEDVIDOVIC, Nenad – DASHOFY, Eric M. 2009. Software Architecture: Foundations, Theory, and Practice. USA: Wiley, 2009. 712p. ISBN 978-0-470-16774-8.
TERLOUW, Joeri – TERLOUW, Linda – JANSEN, Slinger. 2009. An Assessment Method for Selecting an SOA Delivery Strategy: Determining Influencing Factors and Their Value Weights. In Proceedings of the 4th International Workshop on Business/IT Alignment and Interoperability, Amsterdam, June 2009. CEUR Workshop Proceedings, vol. 456. ISSN 1613-0073.
The Open Group. 2006. Definition of SOA [online]. Version 1.1. June 2006 [cit. 2010-03-22]. Available from .
The Open Group. 2009. SOA Reference Architecture [online]. Draft 10. April 2009 [cit. 2010-05-12]. Available from .
VAN DER VLIST, Eric – AYERS, Danny – BRUCHEZ, Erik et al. 2007. Professional Web 2.0 Programming. USA: Wrox, 2007. 552p. ISBN 0-470-08788-9.
W3C. 1998. Extensible Markup Language (XML) 1.0 [online]. February 1998 [cit. 2010-04-15]. Available from .
W3C. 2003. SOAP Version 1.2 Part 0: Primer [online]. June 2003 [cit. 2010-04-06]. Available from .
W3C. 2004a. Web Services Architecture [online]. February 2004 [cit. 2010-04-06]. Available from .
W3C. 2004b. Web Services Glossary [online]. February 2004 [cit. 2010-03-22]. Available from .
W3C. 2007. Web Services Description Language (WSDL) Version 2.0 Part 0: Primer [online]. June 2007 [cit. 2010-04-06]. Available from .
W3C. 2009. Web Application Description Language [online]. August 2009 [cit. 2011-01-11]. W3C Member Submission. Available from .
WELSH, Matt. 2002. An Architecture for Highly Concurrent, Well-Conditioned Internet Services. Berkeley, 2002. Dissertation (Ph.D.). 211p. University of California at Berkeley. Available from .
XML-RPC Specification [online]. 1999. Dave Winer, 1999 [cit. 2010-03-03]. Available from .
ZIMMERMANN, Olaf – KROGDAHL, Pal – GEE, Clive. 2004. Elements of Service-Oriented Analysis and Design: An interdisciplinary modeling approach for SOA projects [online]. IBM: IBM developerWorks, 2004‑06‑02 [cit. 2010-05-10]. Available from .

Glossary

Enterprise 2.0 A concept of using tools and services that employ Web 2.0 techniques such as tagging, ratings, networking, RSS, and sharing in the context of the enterprise.

Enterprise mashup A mashup used in an enterprise environment, possibly aggregating resources both internal and external to the enterprise, with enhanced security, availability and quality features. Some authors use the term business mashup instead.

Mashup A web page or application that uses and combines data, presentation or functionality from two or more sources to create new services.

Software framework A reusable design for a software system (or subsystem).

Software service A mechanism to enable access to one or more capabilities, where the access is provided using a prescribed interface and is exercised consistent with constraints and policies as specified by the service description.

Service integration A process of allowing individual services to cooperate by invoking each other’s operations.

Service choreography A style of service composition, associated with the public (globally visible) message exchanges, rules of interaction, and agreements that occur between multiple business-process endpoints. UML activity diagrams or Petri nets can be used to describe the choreography.

Service composition A process of combining and linking multiple services to form a new, aggregate service that provides previously non-existing functionality.

Service orchestration A style of service composition, describing both the communication actions and the internal actions of a single orchestration controller (e.g. by BPEL). Internal actions may include data transformations and invocations to internal software modules (e.g., legacy applications that are not exposed as services). Orchestrations are executable processes, intended to be executed by an orchestration engine.

Web 2.0 A business revolution in the computer industry caused by the move to the Internet as platform, and an attempt to understand the rules for success on that new platform. Chief among those rules is this: Build applications that harness network effects to get better the more people use them.

Widget A standalone web application that can be embedded into third-party sites by the authors of those sites. Some authors use the term gadget instead.

World Wide Web A system of interlinked hypertext documents accessed via the Internet.
