Syndicate Framework 55 Table of Contents Ix

Individual Service Composition in the Web-age Inauguraldissertation zur Erlangung der Wurde¨ eines Doktors der Philosophie vorgelegt der Philosophisch-Naturwissenschaftlichen Fakultat¨ der Universitat¨ Basel von Sven Rizzotti aus Basel Basel, 2008 Genehmigt von der Philosophisch-Naturwissenschaftlichen Fakultätauf Antrag von Herrn Prof. Dr. Helmar Burkhart, UniversitätBasel und Herrn Prof. Dr. Gustavo Alonso, ETH Zürich, Korreferent. Basel, den 11.12.2007 Prof. Dr. Hans-Peter Hauri, Dekan To my parents, Heide and Fritz Rizzotti and my brother Jörg. On the Internet, nobody knows you're a dog. Peter Steiner cartoon in The New Yorker (5 July 1993) page 61 Abstract Nowadays, for a web site to reach peak popularity it must present the latest information, combined from various sources, to give an interactive, customizable impression. Embedded content and functionality from a range of specialist fields has led to a significant improvement in web site quality. However, until now the capacity of a web site has been defined at the time of creation; extension of this capacity has only been possible with considerable additional effort. The aim of this thesis is to present a software architecture that allows users to personalize a web site themselves, with capabilities taken from the immense resources of the World Wide Web. Recent web sites are analyzed and categorized according to their customization potential. The results of this analysis are then related to patterns in the field of software engineering and from these results, a general conclusion is drawn about the requirements of an application architecture to support these patterns. A theoretical concept of such an architecture is proposed and described in detail. The empirical part of the study includes an implementation of the proposal and a demonstration of the assembly of capabilities found in the World Wide Web. This implementation is based on established technologies but applies them to a new, specially-designed structure. It allows users to add selected facilities to arbitrary web sites simply by calling a specific web address. This gives the user the potential to adapt the appearance and function of web sites to his or her personal needs. An in-depth analysis of the challenges and restrictions of the software design completes the proposed architecture. Practical examples of behavior patterns show possible implementations in a range of fields. Finally, a vision developed from the results presented in this thesis is outlined and subjects for future research are examined. Keywords: collaborative software, mashup, rich internet applications, semantic web, service-oriented architecture, web services, web 2.0, web 3.0 iii Acknowledgements I would like to express my gratitude to people involved in my research process. Firstly my supervisors deserve my acknowledgements. I would like to thank Professor Helmar Burkhart and Professor Gustavo Alonso for encouraging me to commence the dissertation research process and also for making it possible financially. Professor Helmar Burkhart has provided me with invaluable guidance during the research and I am very grateful for all his efforts and for the time he has given me. I would like to thank all the co-workers at the Department of Computer Science and Pharmacy, with whom I have discussed the various aspects of conducting research for dissertation. The discussions with Christina Weber, Alexander Vögtli,Martin Guggisberg, and Tibor Gyalog have particularly influenced the progress of my research and I would like to thank them for patiently listening to my torrent of words from time to time. I would also like to express my warmest thanks to Robert Frank, Beat Ernst, Pascal Betz and Adrian Ensner, with whom I have had many fruitful discussions. I am also very grateful to Andrew Brown, Stefanie Mahrer and Eric Fischer for their valuable time reading and discussing my writings. The support provided by my family and closest friends has been truly indispensable throughout the project. Urs and Vera Schlittler made the writing process a delightful experience by hosting me in the wonderful Swiss alps and especially Ruth Schlittler deserves my warmest thanks for her endless support and understanding. Financial support for the printing of this thesis was provided by the dissertation fund of the University of Basel. Basel, June 10, 2008 v Table of Contents Abstract iii Introduction1 I Fundamentals7 1 Evolution in Usage of the Web9 1.1 Before the World Wide Web.................9 1.2 From information silos to rich interfaces.......... 10 1.3 Rich web applications.................... 11 1.4 Web applications vs. desktop applications......... 11 1.5 Summary........................... 13 2 Distinguishing Elements in the Web Environment 15 2.1 Overview........................... 15 2.2 Monolithic software architecture.............. 15 2.3 Service-oriented architectures................ 16 2.3.1 Approach to service-oriented architecture..... 17 2.3.2 Approach to service-oriented architecture..... 18 2.4 Web-based services...................... 19 2.4.1 Business models................... 19 2.4.2 Summary: The community effect.......... 22 2.4.3 Web 2.0........................ 22 2.4.4 Mashups........................ 23 2.5 Browser technologies..................... 26 2.5.1 Browser extensions.................. 26 vii viii Table of Contents 2.5.2 Plugins........................ 26 2.5.3 Themes........................ 27 3 Web Technologies 29 3.1 Overview........................... 29 3.2 Server-side........................... 29 3.2.1 Server-side scripts.................. 29 3.2.2 Server pages..................... 30 3.3 Client-side........................... 32 3.3.1 JavaScript....................... 32 3.3.2 Flash......................... 33 3.3.3 Laszlo, OpenLaszlo, DHTML............ 34 3.3.4 Java Applets..................... 34 3.4 Communication........................ 35 3.4.1 AJAX......................... 35 3.4.2 SOAP and related standards............ 36 3.4.3 REST......................... 39 3.4.4 Alternatives to SOAP and REST.......... 40 4 Problem Description 41 4.1 Categories of individualization............... 42 4.2 Software architecture to support category-based personal- ization............................. 43 4.3 Challenge of this work.................... 44 5 Related Projects and Assessment 47 5.1 Projects based on Firefox Extensions............ 47 5.2 Proxy-based projects..................... 51 5.3 Other related projects.................... 52 5.4 Summary of evaluated projects............... 53 II Syndicate Framework 55 Table of Contents ix 6 Concepts and Terminology 57 6.1 Pattern-based individualization............... 57 6.1.1 Content pattern................... 58 6.1.2 Presentation pattern................. 58 6.1.3 Integration pattern.................. 59 6.2 Manifold { The subsequent intention............ 60 6.2.1 Advantages...................... 62 6.2.2 Requirements on the applications.......... 63 6.3 Use cases and basic architecture............... 63 6.4 Terminology.......................... 64 6.5 Involved servers........................ 66 6.6 Architecture principles.................... 66 6.7 Summary........................... 69 7 Syndicate Units 71 7.1 Syndicate components.................... 71 7.1.1 Syndicator and page states............. 72 7.1.2 Syndicate server.................... 72 7.1.3 Mission........................ 75 7.1.4 Mission Patterns................... 75 7.1.5 Mission triggers.................... 76 7.1.6 Properties....................... 77 7.2 Service unit.......................... 78 7.2.1 Service......................... 78 7.2.2 Server detour..................... 78 7.3 Transformer unit....................... 80 7.3.1 Adapter........................ 80 7.3.2 Transformer...................... 80 7.4 Exploration unit....................... 80 7.4.1 Scout......................... 80 7.5 Data access unit....................... 81 7.5.1 Source selector.................... 81 7.5.2 Connector....................... 82 x Table of Contents 7.6 Controlling unit........................ 82 7.7 Output viewer........................ 83 7.7.1 Renderer....................... 83 7.7.2 Injector........................ 83 7.7.3 DOM......................... 83 7.7.4 External programs.................. 84 8 Syndicate Control Flow 85 8.1 Syndication of a page.................... 86 8.2 Mission triggering...................... 87 8.3 Mission activation...................... 88 8.3.1 Local Consigliere................... 88 8.3.2 Remote Consigliere via server detour........ 89 8.3.3 Information gathering with the Scout....... 89 8.3.4 Transformation and Rendering........... 89 8.3.5 Injection........................ 90 III Syndicate as seen by the User 91 9 Mission Publisher 93 9.1 Mission components..................... 93 9.1.1 Summary....................... 93 9.1.2 Implementation.................... 94 9.1.3 Mission Properties.................. 95 9.2 Mission Unpublishing.................... 97 9.3 Publishing workflow..................... 98 10 Mission Subscriber 99 10.1 Mission subscription..................... 99 10.2 Mission selection....................... 99 10.3 Mission unsubscription.................... 100 11 Examples 101 Table of Contents xi 11.1 Presentation Mission examples............... 103 11.1.1 AdRemover...................... 103 11.1.2 TextFieldSize..................... 104 11.1.3 TableSorter...................... 105 11.1.4 FileExtension..................... 105 11.1.5 iTunes......................... 105 11.1.6

Syndicate Framework 55 Table of Contents Ix

An Overview of the 50 Most Common Web Scraping Tools

Web Data Extraction

Rich Internet Applications

Open Source Web GUI Toolkits

Design of a Service-Oriented Dashboard

Detecting Conflicts Among Declarative UI Extensions

Leukemia Medical Application with Security Features

Open Source Bridge Touchscreen Presentation

Design of Imacros-Based Data Crawler and the Behavioral Analysis of Facebook Users

Learning PHP 5 by David Sklar

Enea® Linux Open Source Report

Vector Graphics Computer Science Masters Dissertation