CAS Worldwide Homepage

Overview Technology Showcase Technical Papers Workshops Schedule at a Glance Speakers by Day

CASCON technology showcase
An amazing range of technology innovations as well as a showcase of research results. Come to browse the displays and talk to developers and researchers from industry and academia.

CASCON technical papers
The technical papers program features experience reports and original research papers. Best Paper and Best Student Paper awards will be presented at Tuesday morning's keynote session.

Registration is now closed. CASCON 2009 will take place on November 2-5.

CASCON speakers
Keynote and Frontiers of Software Practice presentations by innovators and thought-leaders on topics of world-changing impact. Don't miss them!

CASCON workshops
From hands-on learning to tutorials to panel discussions, CASCON workshops give you a great opportunity to acquire new skills and participate in lively discussions.

CASCON 2008 Highlights
Highlights and video on ITWorldCanada.

CASCON Proceedings
CASCON Proceedings are available on the ACM Digital Library.

CASCON Technology Showcase

Database Technologies, Education and Information, Middleware Technologies, Services Science and SOA, User Technologies

Database technologies

An XML Index Advisor for DB2 XML indexes can significantly improve workload execution performance. We have developed an XML Index Advisor for DB2 that helps in choosing the best index configuration for a given workload. In this demonstration, we showcase two new DB2 optimizer modes, the index recommendation process, and the effectiveness of the recommended indexes.

[email protected]

http://www.cs.uwaterloo.ca/~ielghand/

Cost-Aware Dynamic Provisioning for Performance and Power Management We introduce a novel cost-aware dynamic provisioning approach for the database tier of a dynamic content site. Our approach employs SVM regression for learning an adaptive system model. We leverage this model for a cost-aware provisioning technique using a utility function expressing monetary costs for both performance and power.

madalin@cs..edu

DescribeX: Pattern Discovery in Large XML Collections Where an XML collection's structure is not represented by its schema, structural summaries are coarse partitions that can be refined to reveal substructures. DescribeX is an interactive Eclipse-based tool for creating and refining structural summaries formulated using "axis path regular expressions" (AxPREs), and it is capable of handling multi-gigabyte XML collections.

[email protected]

Dynamic Load Balancing using DB2 9.5 for Linux, UNIX, Windows (LUW) Data Servers and WebSphere Edge Components Edge components, part of the WAS Network Deployment (ND) offering, are typically used to control client access to Web servers. In this exhibit, we demonstrate how Edge components can be used for availability, to load balance DB2 Clients among multiple DB2 Servers based on performance metrics, or using access control rules.

adesilva@ca..com

Enforcement of Performance Objectives for Database Workload Classes Competing workloads in a database management system need differentiated levels of service for better performance. We present an external mechanism, which, in conjunction with DB2's comprehensive workload management features (WLM), will control the throttling provided by WLM to achieve customer-specified performance objectives for different classes of work.

[email protected]

Index Structures for XML Databases A survey of the available XML data indexing schemes and their advantages and disadvantages in terms of size and scalability, query answering power, and updatability. We present areas of potential future research in this technology, as well as the challenges of indexing XML data.

[email protected]

Lightweight Problem Determination for DB2 In our project, we explore the feasibility and performance of lightweight problem determination based on stream mining techniques. We have developed a problem determination tool based on these techniques to minimize the expensive consumption of system resources and frequent interference with observed systems.

[email protected]

Meta-Heuristic Based Tests of DB2 Optimizer This project aims to define and implement the process of SQL test case generation by using meta-heuristics. Different evolutionary approaches such as ant colony or genetic algorithms will also be considered. The selection and design of test cases to be generated will be addressed in the context of static analysis of source languages, grammar analysis, and impact analysis.

[email protected]

Education and Information

IBM Academic Initiative The IBM Academic Initiative is an innovative program to partner with colleges and universities worldwide to better educate students for a more competitive information technology workforce. The Academic Initiative provides a broad range of offerings and benefits to professors and students, including the latest technologies in IBM software, hardware, course materials, training, technical support, and other resources.

[email protected]

www.ibm.com/university/academicinitiative/

IBM High School Programming Contest Central Student-run initiative to promote computer science and encourage high school students to get involved in programming contests.

[email protected]

Lessons Learned: Database Applications as Learning Vehicles for College Students Seneca College students in a Computer Engineering database course have built simple applications in small teams to enhance and demonstrate their SQL, process, I/F and team work skills. About fifty such projects have been reviewed and some insights have been gained into how these experiential learning experiences contribute towards graduates' IT careers.

[email protected]

www.senecac.on.ca/cms/

Leveraging Diversity: A Solution to the Labour and Skills Shortages in the ICT & IT Sectors The exhibit is based on a report that offers solutions to the labour shortage and critical skills shortage problems that the IT & ICT sectors face. This is important for CASCON audience members who are interested in their companies' long-term survival in the 'competition for global talent'.

[email protected]

Managing DITA Relationship Tables DITA relationship tables are one of the clearest strengths of authoring in DITA and also one of the most common points of frustration. A new tool being designed and developed by the WebSphere Business Modeler Information Development team aims to simplify the creation and maintenance of relationship table information for technical writers and other people using DITA to store information.

[email protected]

Scientific Software, Silent Errors, and Mutation Testing Scientific software needs to be carefully tested for accuracy. Assessing accuracy is a complex problem, but one of the components of highly accurate software is highly correct code. We have applied mutation testing techniques to scientific codes to help us understand how to approach testing of these codes.

[email protected]

Middleware Technologies

A Fast, Scalable Optimization Procedure for Configuring Service Systems with Cost and Quality of Service Constraints A scalable hybrid optimization procedure for deploying services on nodes in large-scale service systems. The approach can be used in three ways: (1) to search for a feasible solution which meets both cost and performance constraints (related to contracts for QoS), (2) to minimize cost subject to meeting a set of performance constraints, or (3) to optimize some combined measure of quality of service subject to cost constraints.

[email protected]

A Modular Event-Based Architecture for Workflow Systems Workflow is considered an essential technique for integrating distributed applications. Two conflicting goals are important in workflow systems: flexibility and simplicity. We address these two issues by introducing a novel architecture for workflow systems. Modules and templates provide simplicity while events and event handlers make workflow systems flexible.

[email protected]

www.cas.mcmaster.ca/~najafm/

A Policy-Based Negotiation Broker Middleware for Automated SLA Negotiation for Web Services Negotiation of SLAs is very important for maintaining QoS of composite Web services-based business processes. We present a Negotiation Broker (NB) middleware framework to facilitate automated negotiations of SLAs for Web services as a trusted broker using time-based cost-benefit negotiation strategy models.

[email protected]

research.cs.queensu.ca/home/farhana

ACC: www.AspeCtC.net AspeCt-oriented C (ACC) is a research project at the University of Toronto that enables aspect-oriented software development with the C programming language. The ACC compiler translates ACC code into standard ANSI-C code which can then be compiled by any ANSI-C compiler such as gcc. ACC lets developers modularize crosscutting concerns in C programs.

[email protected]

Architecture Related Defects Analysis in a Complex Software System This poster concerns defects originating in architecture. Initial empirical studies will be illustrated, and may help researchers to better understand the role of architecture in maintenance, as well as help maintainers and developers to make defect fixing and module repairing more targeted.

[email protected]

Automating SLA Modeling A configurable, reusable, extensible and inheritable SLA model that provides great flexibility in constructing complex SLAs. As well, artifacts can be automatically generated to monitor an executing business process at runtime against an SLA. The approach is designed to require minimal human intervention.

[email protected]

Autonomic Computing System in Virtual Environment This project applies the IBM Autonomic Computing architecture to system resource management in a virtual environment. A web-based service system has been built from virtual images of IBM HTTP Server, WebSphere Application Server (WAS), DB2 Server and Tivoli Provisioning Manager (TPM) Server. The system is able to grow and shrink dynamically according to the service level agreements (SLA) and the current workloads. If the resources (i.e. WAS in this case) are shared by different applications, they will be reallocated properly as well. The autonomic manager consists of a queuing performance model and a tracking filter. It is able to update the system status and predict the future behavior more precisely.

[email protected]

Coconut Coconut (COde CONstructing User Tool) is a platform for experimenting with novel ideas in reliable and high performance code generation currently targeting the Broadband Engine. Both SIMD level and multicore level parallelism are targeted for optimization and the latest performance results from each are presented.

[email protected]

www.cas.mcmaster.ca/~anand

Comparing Policy-Based Management Approaches for Distributed Systems Policy-Based Management aims to simplify the management of systems by specifying rules that govern the system. Approaches can be classified according to the framework with which they are built. We compare the Policy-Based Management approaches for Distributed Systems according to an evaluation that we define. We also talk about our future work in the area of Policy-Based Management for Distributed Systems.

[email protected]

research.cs.queensu.ca/home/cords

Computer Assisted Root Cause Analysis This project helps to discover the root cause of failures in complex enterprise application software. One issue is balancing information overload against tunnel vision while investigating alerts. We present a search-based workflow based on several analysis techniques, including goal models, trace analysis, fault trees, and fish-bone diagrams.

[email protected]

www.cs.ualberta.ca/~kenw/

Discovering Faulty Function from Field Traces Corrective maintenance takes up approximately 20 to 36% of software maintenance activities. Identifying the origin of a defect (the faulty component or function) takes approximately 20 to 40% of the time spent on corrective maintenance. We demonstrate a tool and technique to automatically identify the faulty function from method-level traces captured in the field.

[email protected]

Gutsy: A Service-Oriented Open Source Project Monitoring Application Tool/Device Gutsy (Guide Tool for System) is a service-oriented remote, continuous analysis, quality assurance system designed to help developers effectively cope with software evolution. Gutsy is currently monitoring the open source software application GRASS geographical information system. We are also beta testing the C++ GDAL library.

[email protected]

web.soccerlab.polymtl.ca

IBM Desktop Management Toolkit The Desktop Management Toolkit is being researched and developed by IBM to help our enterprise customers to manage the installed software on their systems.

[email protected]

Identifying Failures in a Large-Scale Software System using Pattern-Discovery Methods and other Machine Learning Techniques It is critical to quickly locate and solve the cause of failures and performance-defects in an enterprise software system to avoid significant losses. We use pattern-discovery methods to locate anomalies by comparing transaction data obtained from a fault-free state and an abnormal state of the system, and then apply other machine-learning techniques to learn about and identify recurrent problems. We evaluate this approach using ARM traces obtained from clustered IBM WebSphere instances.

[email protected]

Intelligent Monitoring of Large-Scale Software Systems Software systems are critical to business success, and thus require continuous monitoring. However, detailed monitoring hurts system performance. We have developed adaptive approaches to monitoring and root cause analysis to ensure that complex software systems remain healthy and available at minimal cost. We leverage log files, management metrics (e.g., JMX), and trace data (e.g., ARM).

[email protected]

J2C Connector Tools, Rational® Application Developer A new technology for how J2EE applications can communicate with back-end enterprise systems such as SAP, Siebel, PeopleSoft, and JD Edwards via inbound and outbound processing. We will also present the differences between inbound and outbound processing.

[email protected]

JSCOOP: A Framework Toward Fair Concurrency in Java SCOOP is a concurrent language with the idea of extending O-O in a minimal way to add concurrency support. We have developed JSCOOP, an equivalent solution for Java. JSCOOP is an Eclipse plug-in that introduces new annotations (modeled after SCOOP keywords) and a core library providing support for SCOOP semantics.

[email protected]

www.cse.yorku.ca/~faraz

Monitoring and Diagnosing Software Requirements A framework for monitoring and diagnosing the satisfaction of software requirements. The monitoring component generates log data based on the requirement model. The diagnostic component analyzes the log data and diagnoses denials of software requirements. Monitoring and diagnostic capabilities are essential to the design of self-adaptive, autonomic systems.

[email protected]

Mutation-Based Testing of Buffer Overflows, SQL Injections, and Format String Bugs The application of the idea of mutation-based adequate testing to perform vulnerability testing of buffer overflows, SQL injections, and format string bugs. This approach complements traditional techniques for fixing software vulnerabilities early and enhances software quality.

[email protected]

research.cs.queensu.ca/~qrst/

Performance Management for IT Infrastructure Automated resource provisioning for clusters under a mixed workload consisting of interactive and batch jobs. We present algorithms to dynamically allocate computing resources to each cluster such that SLAs of all job classes are met, and the cost is kept to a minimum.

[email protected]

Quality Engineering for Large Scale Enterprise Systems Large scale enterprise systems must support a large number of users. However, ensuring the scalability of these systems is not easy, due to limited research efforts in the following areas: (pre-release) load testing, (field-deployment) capacity planning, and (post-release) customer issue resolution. In this poster, we showcase our research efforts in the aforementioned areas.

[email protected]

Rational Enterprise Reporting A new reporting solution from IBM Rational that pulls data from disparate systems, normalizes it, and allows for reports to be generated from it. This solution targets software development projects.

[email protected]

Runtime Monitoring of Web Service Conversations This CAS project involves the development of a monitoring framework for web applications, where errors are reported as they are detected. However, at runtime, is it satisfactory to just report the error? In this poster, we discuss how to add recovery and compensation mechanisms to our framework.

[email protected]

www.cs.toronto.edu/~jsimmond

SE-ADVISOR The SE-ADVISOR tool presents a novel approach to support software evolution, by integrating maintenance relevant knowledge resources, processes, and their constituents. We demonstrate how our SE-ADVISOR environment can provide contextual guidance during typical maintenance tasks through the use of ontological queries and reasoning services.

[email protected]

www.cse.concordia.ca/~rilling

Servus: Model-Based Generation and Evolution of Web Services Servus is proposed as a platform to provide support for modeling, generation and evolution of Web Services. It defines a profile for Eclipse Modeling Framework's Ecore and implements bi-directional mappings between such models and Web Services artifacts, such as WSDL documents and Java code. Furthermore, a library allows services described by Servus models to be dynamically deployed and accessed. Servus is part of a CAS PhD Fellowship, developed under EMFT.

[email protected]

Supporting Dynamic Publish and Subscribe in a Service Oriented Architecture A framework for supporting dynamic publish/subscribe in a service oriented architecture using a policy-based message broker. We applied this model to a palliative health care scenario to provide a dynamic data sharing platform for real-time monitoring of a patient's health status with health care providers operating in various domains. Declarative policies provide the flexibility for data sharing, transformation, reporting, and collaboration.

[email protected]

The Application of Data Mining Techniques in Software Fault Localization Automated fault localization is the process of automatically inferring likely code locations of faults from data about test case failures. We exhibit the results of research that explores the possibility of doing fault localization by using data mining techniques to mine program execution data.

[email protected]

Toward an Interoperable e-Health Environment We are currently involved in a real-world project to integrate two legacy healthcare systems using HL7 v3. Our poster will outline the various challenges faced during the project, the processes, frameworks and formalizations we developed to address them, and work in progress on a user-friendly tool to support the HL7 integration process.

[email protected]

Using Umple for Rapid Prototyping The Umple languages allow rapid textual entry of models. Jumple allows rapid entry and modification of UML class diagrams with integrated Java code, while Bumple allows textual entry of BPEL. We will demonstrate the use of these on substantial problems. In particular, we will show how easy it is to textually edit the models and see the changed diagrams in IBM Rational and WebSphere tools.

[email protected]

www.site.uottawa.ca/~tcl

Variability Pre-Processing in Model-Driven Testing for a Domain Proposed is a framework to build variability knowledge for a domain, and then to extract and merge specific variabilities from it into the domain's testable model, where the latter one leads to generate tests for the domain. The objective is to enable the ongoing research on model-driven testing for a domain: to apply requirements' changes on the relevant parts of the model, and to reuse variability knowledge for other new testable models of the domain.

[email protected]

www.scs.carleton.ca/~sbtajali

Workload Consolidation in Virtualized Environments A novel technique for modeling virtual machine resource demands in a shared data center. By leveraging expert knowledge and analytical optimization techniques, we are able to train the model with a small number of samples. Thus, our model can adapt swiftly to unexpected changes in workloads.

[email protected]

Services Science and SOA

A Peer-to-Peer Desktop Cloud The goal of this research is to create a stable, predictable, peer-to-peer cloud infrastructure which leverages idle CPU cycles. This system aims to provide a business model for desktop users who volunteer spare CPU cycles and for organizations which are in need of massive computing power.

[email protected]

www.csd.uwo.ca/~kramach

Beacon Cloud: Mobile Blood Donation Registration Service The Mobile Blood Donation Registration Service (MBDRS) is part of a cloud computing technology designed to support blood transfusion services. The system is built on the mobile model of services oriented architecture (SOA) and XML security technologies.

[email protected]

www.hrl.uoit.ca/~ckphung/

BPM Repository: Bring People Together, Propel Business Forward This exhibit will show integration of BPM tools such as WebSphere Business Modeler, WebSphere Integration Developer and WebSphere Business Monitor on Rational Asset Manager (RAM). The BPM repository promotes the reuse of assets while supporting governance policies of an organization. Attendees will gain valuable insight into an SOA management solution.

[email protected]

Business Rules Recovery from Legacy Applications The understanding and discovery of business rules play a major role in the maintenance and modernization of legacy software systems. According to a recent survey, about half of the companies who reported difficulties in modernizing their legacy systems said that "a major issue was the fact that hard-coded and closed business rules" make it difficult to adapt their systems to new requirements and migrate to more modern environments.

[email protected]

Enabling Best Practices for IBM Global Services (GBS) A successful GBS initiative has been enabling practitioners who support multiple lines of business. We enhance practitioners' tool kits with modern practices for solution delivery. View a showcase of success stories and services which have been developed, including best practices for Agile methods, Project, Requirements, Architecture, Change and Configuration Management.

[email protected]

Green Transformation Workbench for Data Centers Green transformation is a key management initiative that attempts to align the people, process and technology of an enterprise more closely with its strategy and vision for green business. The Green Transformation Workbench is a practitioner's tool which implements a methodical approach devised to analyze green transformation opportunities and make business cases for transformation initiatives.

[email protected]

domino.research.ibm.com/comm/rese

HR in Virtual Worlds: A Real Corporate App for the 3D Internet? In this exhibit, we will present and discuss results of studies on speed mentoring, where, a la speed dating, mentors provide advice to rotating groups of mentees. The speed mentoring occurred both face-to-face and also in Second Life. This exhibit will analyze one very promising corporate use of Virtual Worlds.

[email protected]

www.yorku.ca/hmkim

Incremental Change Propagation across Web Services Artifacts A novel methodology to incrementally synchronize generated software artifacts with their sources and vice versa, without the need to re-engineer the transformation into an incremental one. We utilized our proposed technique in Eclipse WTP for efficient synchronization of Java source code with web services artifacts in both directions.

[email protected]

Intelligent Service Selection and Composition Service composition selects and weaves services to achieve a goal under the guidance of business processes. Existing service composition methods are inefficient and ineffective due to the complexity of workflows and the large number of services available. In this work, we demonstrate an approach for intelligently searching and invoking services to achieve a user's goal.

[email protected]

OpenOME - A Tool for Early Requirements Modeling and Analysis OpenOME is a GMF-based requirements-modeling environment that supports several requirements-modeling notations, requirements management, and model analysis. The tool demonstrates some potential uses of the Eclipse GMF framework, as well as ongoing research with early requirements analysis.

[email protected]

se.cs.toronto.edu/trac/ome

Software Defect Rediscoveries: Causes and Their Significance Software defect rediscoveries account for 50-90% of the total failures of a software product. We believe it would be a great initiative for software providers to understand the phenomenon of rediscoveries to ultimately reduce the cost due to rediscoveries and thus the cost, by a significant degree, of software product maintenance.

[email protected]

User Technologies

3-D Internet: Capturing Visitor Insights with Best Practices in Virtual Worlds This demonstration will provide an overview of the ibm.com 3D Internet, the techniques used to capture visitor insights and the strengths, weaknesses, and opportunities for this virtual channel. Viewers will be able to apply the key points to their work in virtual or real worlds.

[email protected]

A Business-Process-Driven Approach for Generating User Interfaces User interfaces (UIs) of business applications and business processes rarely evolve consistently due to fast market changes. Moreover, studies show that the majority of the UIs of business applications suffer from usability problems. In this work, we demonstrate an approach for automatically generating UIs with satisfactory usability from business processes.

[email protected]

CnP: Supporting Copy-and-Paste Programming in Modern IDEs The CnP project, short for 'Copy and Paste', aims at providing tool support for copy-and-paste programming. Three Eclipse plugins will be demonstrated: (1) CReN: Consistent ReNaming of identifiers in code clones, (2) CSeR: Support for Class SEgment Reuse via copy-paste, (3) Clone-Importer: Importing clone information from a clone detection tool.

[email protected]

www.clarkson.edu/~dhou

Designing for the Mobile Experience Mobile devices are now commonplace for business and technical end users. But what are the best ways to design for these diverse end users? How does the mobile experience link into their computer software easily and what are the differences from one mobile device to another? With the help of emulators and simulators, testing and prototyping, we can ensure a seamless user experience from desktop to mobile.

[email protected]

Development of Visual Interfaces for Mashups Visualization is a key element of information mashups - rapidly-developed applications that remix data and services. Limited resources and a lack of information visualization knowledge challenge the developers in choosing and configuring the best visualization for the available data. Providing guidance and visual interfaces addresses that challenge.

[email protected]

Drupal: A Platform for Effective Team Collaboration Looking to deploy a flexible collaborative solution that is part wiki, part blog and is fully customizable? Drupal is an open source content management solution that's deployed on a LAMP infrastructure. It is fully functional out of the box, and is customizable using community-contributed modules or through custom PHP code.

[email protected]

Model-Driven Content Connectors and Web Intelligence: Consider the Source A first-of-a-kind approach to linking of content types via meta-tags within unified modeling (UML) and applying this approach when building large-scale global web experiences for clients. Architects, designers, and developers working on transformational processes, modeling, and traditional and intelligent web experience will find this poster of interest.

[email protected]

PUMP (Partner Usability Milestone Program) A new development-driven process that has been created for the Rational Application Developer product. The purpose is to get more feedback from customers early and often in the product cycle. This will engage the customer, while giving development teams more time to incorporate user feedback into the product at different stages.


radical.rtp.raleigh.ibm.com/rad

Tag Clouds for Semi-Structured Documents TagSync is a novel tool based on multiple tag clouds to explore semi-structured documents. Features include support for query refinement and filtering, and summarization of search results. Two implementations will be exhibited: the first to explore medical publications, and the second to explore Jazz work items.


www.thechiselgroup.org/

The Way to the Holographic Desktop Printer New developments in 3D hardcopy have made possible bright, clear, full colour images. They may be tiled to form large murals. Future development of this technology will bring full-parallax deep-scene holographic images that are virtually indistinguishable from real world objects. However, this new generation of auto-stereoscopic printers will present several computational challenges, which will be discussed.


www.photoniximaging.com

Adaptive Context-Aware Social Networks This project introduces a generic context-aware development framework for analysis and visualization of mobile and spontaneous social networks. An embedded inference engine is capable of scoring and analyzing social nodes and provides decision support to users in the form of semantics-based social graphs. The same framework has been used for large-scale context-aware deployments at airports and railway stations to help passengers navigate unfamiliar environments during transit time.


Tuesday, October 28 paper presentations

Session 1: Databases Richmond Ballroom A/B

DBMS Workload Control Using Throttling: Experimental Insights 10:00am Best Paper Wendy Powley and Pat Martin, Queen's University; Paul Bird, IBM Toronto Lab

Efficient XPath Query Processing 10:30am P. Mark Pettovello and Farshad Fotouhi, Department of Computer Science, Wayne State University

Autonomic Tuning Expert - A Framework for Best-Practice Oriented Autonomic Database Tuning 11:00am David Wiese and Gennadi Rabinovitch, Friedrich-Schiller-University Jena; Michael Reichert and Stephan Arenswald, IBM Deutschland Research & Development GmbH

Session 2: Web Applications Richmond Ballroom C

Synchronized Tag Clouds for Exploring Semi-Structured Clinical Trial Data 10:00am Maria-Elena Hernandez, Sean M. Falconer and Margaret-Anne Storey, University of Victoria; Simona Carini and Ida Sim, University of California San Francisco

Personalized Recommendation of Related Content Based on Automatic Metadata Extraction 10:30am Andreas Nauerz, IBM Research and Development; Fedor Bakalov and Birgitta König-Ries, University of Jena; Martin Welsch, IBM Germany Research and Development

Online Stroke Modeling for Handwriting Recognition 11:00am Oleg Golubitsky and Stephen M. Watt, University of Western

Session 3: Software Engineering I Richmond Ballroom D

Flexible Verification of User-Defined Semantic Constraints in Modelling Tools 10:00am Daniel Amyot and Jun Biao Yan, University of Ottawa

Towards a UML Virtual Machine: Implementing an Interpreter for UML 2 Actions and Activities 10:30am Michelle L. Crane and Juergen Dingel, Queen's University

An Empirical Study of the Design and Implementation of Object Equality in Java 11:00am Chandan R. Rupakheti and Daqing Hou, Clarkson University


Wednesday, October 29 paper presentations

Session 4: Systems I Richmond Ballroom A/B

Automating SLA Modeling 10:00am Tony Chau, Vinod Muthusamy, and Hans-Arno Jacobsen, University of Toronto; Elena Litani, Allen Chan, Phil Coulthard, IBM Canada Ltd.

Capacity Planning for Service-Oriented Architectures 10:30am Michael Smit, Andrew Nisbet, and Eleni Stroulia, University of Alberta; Andrew Edgar and Gabriel Iszlai, IBM Toronto Lab; Marin Litoiu, York University

A Reliability Estimation for Large Distributed Software Systems 11:00am Alberto Avritzer and Flávio P. Duarte, Siemens Corporate Research; Rosa Maria Meri Leão and Edmundo de Souza e Silva, Universidade Federal do Rio de Janeiro, Brazil; Michal Cohen and David Costello, Siemens Transportation Systems

Session 5: Software Engineering II Richmond Ballroom C

A Methodological Leg to Stand On: Discoveries Using Grounded Theory to Study Software Development 10:00am Steve Adolph, Wendy Hall, and Philippe Kruchten, University of British Columbia

A Taxonomy of Software Types to Facilitate Search and Evidence-Based Software Engineering 10:30am Andrew Forward and Timothy C. Lethbridge, University of Ottawa

Building Highly-Interactive, Data-Intensive, REST Applications: The Invenio Experience 11:00am Michelle Annett and Eleni Stroulia, University of Alberta

Session 6: Compilers Richmond Ballroom D

OpenMP Tasks in IBM XL compilers 10:00am Xavier Teruel, Barcelona Supercomputing Center - Universitat Politècnica de Catalunya; Priya Unnikrishnan, IBM Toronto Laboratory; Xavier Martorell and Eduard Ayguadé, Barcelona Supercomputing Center - Universitat Politècnica de Catalunya; Raul Silvera, Guansong Zhang and Ettore Tiotto, IBM Toronto Laboratory

High Performance XML Parsing Using Parallel Bit Stream Technology 10:30am Robert D. Cameron, Kenneth S. Herdy, and Dan Lin, Simon Fraser University


Thursday, October 30 paper presentations

Session 7: Systems II Richmond Ballroom A/B

Information-Theoretic Modeling for Tracking the Health of Complex Software Systems 10:00am Miao Jiang, Mohammad A. Munawar, Thomas Reidemeister and Paul A. S. Ward, University of Waterloo

Using Economic Models to Allocate Resources in Database Management Systems 10:30am Mingyi Zhang, Patrick Martin, and Wendy Powley, Queen's University; Paul Bird, IBM Toronto Lab

NetPal: A Dynamic Network Administration Knowledge Base 11:00am Ashley George, Adetokunbo Makanju, Evangelos Milios and Nur Zincir-Heywood, Dalhousie University; Markus Latzel and Sotirios Stergiopoulos, Palomino System Innovations Inc.

Session 8: Software Engineering III Richmond Ballroom C

SIFT: A Scalable Iterative-Unfolding Technique for Filtering Execution Traces 10:00am Best Student Paper A. V. Miranskyy and N. H. Madhavji, University of Western Ontario; M. S. Gittens, University of the West Indies; M. Davison, University of Western Ontario; M. Wilding, D. Godwin and C. A. Taylor, IBM Canada Ltd.

An Architecture for Providing Context in WS-BPEL Processes 10:30am Allen Ajit George and Paul A.S. Ward, University of Waterloo

Is it a Bug or Enhancement? A Text-Based Approach to Classify Change Requests 11:00am Giuliano Antoniol and Kamel Ayari, École Polytechnique de Montréal; Massimiliano Di Penta, University of Sannio; Foutse Khomh and Yann-Gaël Guéhéneuc, Université de Montréal


Monday, October 27 Workshops Morning Session - 9 a.m.

Challenges of Model-Driven Engineering Location: Richmond E Co-Chairs: Jo Atlee, University of Waterloo; Tom Maibaum, McMaster University; Marin Litoiu, York University; Daniel Leroux, IBM Canada Ltd.

Model-driven engineering (MDE) promises dramatic improvements in software quality and developer productivity by using high-level models of system designs and deriving other artifacts (such as analyses, code, and test suites) from the models. However, the current generation of MDE approaches and tools are brittle, poorly integrated, and will not scale to the level of complexity expected in the software-intensive systems of the near future. This workshop will focus on identifying some of the key open problems that, if solved, will help MDE to fulfill its promise. Attendees will also consider how to scope and prioritize these problems. The latter discussions will be influenced by Canada's unique strengths in software engineering research, software development tools, and human factors, and will help to identify those problems where Canadian researchers will be most likely to make significant advances.

Afternoon Session - 1 p.m.

Hands-On: Project Zero Location: Markham A Co-Chairs: Bart Stanczyk and Kalvin Misquith, IBM Canada Ltd.

Project Zero is a community-driven commercially developed incubation project from IBM that allows simple, quick, and agile development of Web applications. A commercial and stable version of Project Zero is WebSphere sMash. This workshop will focus on a hands-on approach to developing a simple Web application using the Web-based application builder, AppBuilder. The tutorial will cover a number of sMash features including authentication, data store, flow, and the drag-and-drop Dojo front end. By using a step-up approach to introduce each feature, the workshop will give participants practical and thorough knowledge about developing Web applications quickly and effectively.

Slides and exercise booklet (1.15mb) ProjectZeroWorkshopResources.zip (0.05mb)

Architecture for Web Applications Location: Richmond D Co-Chairs: Yelena Yesha, University of Maryland; Joanna Ng, IBM Canada Ltd.; Weidong Kou, IBM China; Chris Mitchell, IBM Corp.

In recent years there has been tremendous growth in the demand for, and the use and development of, Web applications. This workshop will provide an international forum for developers, scientists, and engineers for addressing research results and discussing their experiences and sharing ideas pertaining to the practice and theory of architecting Web applications. The workshop will focus on the following topics: architecture patterns for Web applications; architecture tools for Web application design; bridging conceptual design and implementation; development methodologies and domain-specific languages; integration of context sources and context models; integration of legacy applications; semantic Web technologies; usability; security and privacy for data interchange; and evaluation metrics and criteria.

Cell BE and heterogeneous multicore systems: architectures and applications Location: Richmond A/B Co-Chairs: Robert Enenkel, IBM Canada Ltd.; Christopher Anand, McMaster University; James Green, Carleton University Speakers: Michael McCool, RapidMind, Inc. and University of Waterloo; Kevin Browne, McMaster University

The Cell BE architecture is an innovative heterogeneous multicore single instruction, multiple data (SIMD) processor that addresses physical power and frequency limitations through the inclusion of multiple processor cores in a single package. Alternative multicore technologies include field-programmable gate arrays (FPGAs), where the processing architecture is often customized to the application, and graphics processing units (GPUs), which are currently promising up to 1 TFLOP of processing power per board. What is the difference between these approaches to high-performance computing, and which types of applications are best suited to each technology? Looking at implementation experiences from different application areas, and related theoretical topics, this workshop will provide attendees insight into these issues.

Full Day Session - 9 a.m.

Hands-On: Service Oriented Architecture (SOA) in a Day Location: Aurora Chair: Gary Bist, IBM Canada Ltd.

This workshop will start with a definition of SOA, and then provide a one-day experience working with IBM's key SOA development environment, WebSphere Integration Developer. CASCON participants will develop and test an SOA application, which is one that is a composite of other applications. Standard SOA elements such as modules, components, business objects, interfaces, business rules, human tasks, imports, exports, data maps, and business processes will be covered. The tools will generate code compliant with SOA standards such as Web Services Description Language (WSDL) and Business Process Execution Language (BPEL), which are XML-based and platform- and language-independent. These SOA standards will also be discussed in the workshop.

Hands-On: Introduction to Ajax Technologies Location: Vaughn East Co-Chairs: Jen Hawkins, Jeffrey Liu, Lawrence Mandel, and Aron Wallaker, IBM Canada Ltd.

Asynchronous JavaScript and XML (Ajax) programming is currently at the forefront of rich Internet application (RIA) technology and is commonly cited as a major driver of Web 2.0. The rapid growth in the popularity of Ajax has highlighted the need for toolkits to support common functionality and speed development. This workshop will introduce the technologies behind Ajax, from the basics of handling asynchronous requests through to an examination of some of the popular Ajax toolkits now available. Following a general introduction to the topics, this workshop will focus on the IBM-backed toolkit, Dojo, an open source DHTML toolkit written in JavaScript that assists in the construction of dynamic Web pages. Through a series of labs, participants will receive hands-on experience authoring an RIA with Dojo.

Issues and Topics on SOA Programming Models Location: Richmond C Co-Chairs: Kostas Kontogiannis, University of Waterloo; Chris Brealey, IBM Canada Ltd.

This workshop will identify critical SOA research challenges that need to be addressed by the research community for SOA to fulfill its promise. The workshop will present a taxonomy of SOA research issues that will be used to frame the rest of the discussion. The workshop will focus on research needs that are currently causing the greatest pain for SOA practitioners. Topics will include "hard problems", tooling issues, governance challenges, monitoring through the life cycle, and the longer-term evolution of SOA. The workshop will include presentations by practitioners and the research community in addressing critical unmet issues.

Download PowerPoint© slides (3.22mb)

Workshop on Cybersecurity: Research and Use Cases in Internet Applications Security and Privacy Location: Vaughn West Co-Chairs: Miguel Vargas Martin, Patrick Hung, and Shahram Heydari, University of Ontario Institute of Technology; Marsha Chechik and David Lie, University of Toronto; Walid Rjaibi, IBM Canada Ltd.; Jacob Slonim, Dalhousie University Speakers: Amin Ibrahim, Farzad Kanani, Khalil El-Khatib, and Xiaodong Lin, University of Ontario Institute of Technology; Jacob Slonim, Dalhousie University; Tao Wan, Cactus Commerce Inc.; Peter Mason, Defence Research & Development Canada

As computers, especially networked computer systems, permeate our lives, security and privacy will be critical to the development and effective usage of those systems, thus becoming a crucial aspect of information technology. There is an emerging interest among the Canadian research community in various aspects of cybersecurity. However, it is a broad topic that combines facets of operating systems, data networks, programming languages, compilers, algorithms, cryptography, databases, computer architecture, formal methods, economics, and law, to name a few. This workshop will focus on the issues related to security and privacy in Internet applications.


Tuesday, October 28 Workshops Afternoon Session - 1 p.m.

Hands-On: The Cell BE application programming experience! Location: Aurora Co-Chairs: Robert Enenkel, IBM Canada Ltd.; Christopher Anand, McMaster University; James Green, Carleton University

Attendees will get actual hands-on experience implementing a fun physics simulation on the Cell BE with three different programming tools! Starting with an overview of the Cell BE heterogeneous multiprocessor computer, the developers of RapidMind and Coconut will then demonstrate their Cell software development tools, as well as the Cell BE SDK. Next will be an interesting physical simulation problem (interacting Nerf balls), and implementing a solution with each of the tools. Participants will then experiment and further improve the solution themselves in a tutorial setting. This workshop is ideal for those interested in high-performance application programming on Cell BE and who want to learn about powerful productivity-enhancing programming tools that are available for the platform.

Hands-On: Web 2.0 meets SOA with Apache Tuscany SCA Location: Vaughn East Co-Chairs: Luciano Resende and Haleh Mahbod, IBM Corp.

An essential characteristic of SOA is the ability to isolate business functionality into reusable services. However, in a world where businesses need to look beyond their immediate departments, the technology choices that need to work with one another create a complex problem. This can easily lead to brittle applications that mix business logic with code to handle infrastructure differences. A higher level of abstraction is needed to enable SOA solutions to be flexible and adaptable to change at a lower development cost. This tutorial will explore how to use SCA and Tuscany to build a Web 2.0 store application that consumes multiple distributed SCA components via Web Services, JSON-RPC, and Atom. A real-world scenario will be used to help demonstrate how to build and evolve the store application using Tuscany SCA as the business grows over time.

Web 2.0 meets SOA with Apache Tuscany SCA part 1 PDF (1.45mb) Web 2.0 meets SOA with Apache Tuscany SCA part 2 PDF(0.36mb)

Hands-On: Introduction to Web Services Development with Rational Application Developer 7.5 Location: Markham A Co-Chairs: Yen Lu, Rakan Khalid, Zina Mostafia, and Gilbert Andrews, IBM Canada Ltd.

Rational Application Developer 7.5 is equipped with the newest release of IBM's Web Services suite. New standards were introduced in conjunction with improved productivity aids. This workshop will present an in-depth look at the new Web services tools and runtime environment. New standards such as Web Services for Java EE (JSR-109 1.2), Java API for XML Web Services (JAX-WS 2.1) and WS-I Reliable Secure Profile (RSP) will be emphasized from the point of view of their business value to an organization. Hands-on exercises will be performed by attendees to reinforce the presented material.

Business Process Management in a Service-Oriented World Location: Richmond A/B Co-Chairs: Vinod Muthusamy, University of Toronto; Phil Coulthard, Allen Chan, Elena Litani, and Tony Chau, IBM Canada Ltd. Speakers: Pablo Irassar, Dorian Birsan, Suzette Samoojh, and Polina Gohshtein, IBM Canada Ltd.

Business Process Management (BPM) is a discipline that combines software capabilities and business expertise to manage and, if necessary, improve business processes. This workshop will give an introduction to BPM and provide an overview of the key WebSphere BPM products. Attendees will learn why it is critically important to combine BPM with service-oriented architecture principles, enabling the composition of business processes from services and exposing them in turn as business services. Discussion will also include ongoing research in the BPM field, including ways in which Service Level Agreements (SLA) can be exploited to simplify various BPM tasks.

Data Stream Management Location: Richmond D Co-Chairs: Pat Martin, Queen's University; Calisto Zuzarte, IBM Canada Ltd. Speakers: Qiang Zhu, University of Michigan; Yingying Tao, University of Waterloo; Nick Koudas, University of Toronto

Stream data management systems support a new class of applications, including sensor data, Internet traffic monitoring and financial tickers, in which the data occurs in the form of a continuous stream of items. The workshop will focus on the new challenges faced by these systems, such as the need to support continuous queries or queries over streams and historical data. Invited speakers, including leading researchers and industry practitioners, will present advances on these and related topics.

DataStreamManagement_1.pdf (0.26mb) DataStreamManagement_1.pdf (2.86mb) DataStreamManagement_1.pdf (0.33mb) DataStreamManagement_1.pdf (0.75mb)

Research Progress in Service Science, Management, and Engineering Location: York B Co-Chairs: Kelly Lyons, University of Toronto; Eleni Stroulia and Paul Messinger, University of Alberta; Stephen Perelgut, IBM Canada Ltd. Speakers: Paul Sorenson, University of Alberta; Sacha Chua, IBM Canada Ltd.

This workshop will present research in the area of service science, management, and engineering with a goal of finding connections and interaction opportunities. Speakers will present research results in the broad area of service science, management, and engineering with a particular focus on virtual worlds and emerging service business models.

Best Practices for Developing High-Performance Java™ Applications: A Java Virtual Machine Perspective Location: Thornhill Co-Chairs: Daryl Maier and Vijay Sundaresan, IBM Canada Ltd. Speakers: Nikola Grcevski, Ryan Sciampacone, and Patrick Doyle

Developing scalable, high-performance Java code is an important challenge for all Java developers. This workshop will offer a unique perspective on Java performance tuning from the point of view of leveraging the underlying IBM J9 Java virtual machine (JVM). The recommendations presented will be based on experience gained in analyzing and tuning large Java applications by IBM J9 JVM developers with a deep understanding of the underlying runtime technology. The JVM developers will present techniques for identifying common application performance problems, describe Java programming practices that facilitate and hinder runtime optimizations, provide insight into how and when garbage collection tuning is necessary, and present techniques to allow an application to scale.

Part 1 (0.37mb) Part 2 (4.89mb) Part 3 (0.20mb) Part 4 (0.12mb)

End User Development of Enterprise Mashups Location: Richmond C Co-Chairs: Lars Grammel and Margaret-Anne Storey, University of Victoria; Leho Nigul, IBM Canada Ltd.

Applying mashups in an enterprise context is a unique chance to exploit the power of service-oriented architectures for business users. The ability to modify and create flexible, task-specific mashups enables users to evolve parts of the IT infrastructure to provide the information that will help increase business value. This workshop will address the following topics: How are enterprise mashups used currently? How are they supported by current tools? What research on mashups and end user development is available? What recommendations can be given based on this research? What are the technical aspects of enterprise mashups?

Fifth International Workshop on Engineering Autonomic Software Systems - Day 1 Location: Vaughn West Co-Chairs: Paul Ward, University of Waterloo; Marin Litoiu, York University

The goal of this workshop is to bring together researchers and practitioners who investigate concepts, methods, techniques, and tools to engineer autonomic software systems. Autonomic computing aims to reduce the ever-increasing complexity of managing software/system components. This workshop will focus on issues critical for the proliferation of autonomic applications. Selected topics will include requirements engineering for autonomic software, analysis and evolution of autonomic architectures and systems, systems' monitoring, performance modeling, and optimization.

An Innovative Solution for Automated Software Testing Location: Richmond E Chair: Sudarsha Wijenayake, IBM Canada Ltd.

The complexity and scale of current software products are rapidly increasing. The resources and manpower dedicated to testing software are, however, not increasing and in some cases are being reduced. Automated testing has become crucial in the software industry to address this gap and therefore has been adopted more widely than ever before. This workshop will discuss an innovative design for developing automated test frameworks that overcomes traditional challenges in the field. Furthermore, this workshop will present strategies for automating all types of tests, such as artifact-level tests, GUI tests, and API-level tests.


Wednesday, October 29 Workshops Afternoon Session - 1 p.m.

Hands-On: Real-time data integration through change data capture Location: Aurora Co-Chairs: Kris Kobylinski, Dan Snoddy, and Mike Jory, IBM Canada Ltd.

In this workshop, participants will learn about real-time data integration - how it works and how customers are using this technology to solve business problems. IBM InfoSphere™ Change Data Capture (from IBM's acquisition of DataMirror in 2007) is now part of the IBM replication family and provides scalable, high performance, heterogeneous data movement with minimal impact to source systems. This workshop will provide a basic explanation of how the technology works, and lay the groundwork for an appreciation of the wide scope of business problems that can be solved. There will also be an opportunity for hands-on experience with the product including configuring it for various scenarios.

Hands-On: Hacking Web Applications 101 Location: Stouffville Chair: Daniel Cappon, IBM Canada Ltd.

Many security professionals believe that firewalls and intrusion detection systems are sufficient to protect their most sensitive data from attack. This workshop will focus on the actual attacks against software that have been seen across the Internet, and the current techniques being exploited. Attendees will receive hands-on experience with several aspects of Web application security, from reconnaissance and profiling, to using a vulnerability to exploit. This presentation will end with a focus on how to address these software issues and prevent such problems from continuing to occur.

Software Engineering for Science Location: Richmond E Co-Chairs: Janice Singer and Mark Vigder, National Research Council Canada; Greg Wilson and Steve Easterbrook, University of Toronto

Increasingly, scientific discovery is linked to utilization of software resources, including high-performance computing, modeling, collaboration, and beyond. However, little research in software engineering or its related disciplines has focused on the specific needs of scientists, which can be quite different from a more general end-user audience. In this workshop, participants will explore the software engineering needs of scientists and propose both processes as well as additional research to meet those needs. By working together, software engineers, software engineering researchers and scientists are in a unique position to collaborate in defining how science is conducted and how discoveries will be made in the future.

Business Event Processing Location: Richmond D Co-Chairs: Hans-Arno Jacobsen, University of Toronto; Opher Etzion, IBM Research, Haifa Speakers: Chris Ferris and J.J. Jeng, IBM Corp.

Business event processing (BEP) is an emerging discipline that aims to combine the event-based nature of many applications and software systems with the imperative nature of these systems to offer synergetic benefits. These benefits include real-time access to critical performance indicators, improved root cause understanding, and more timely delivery of business information. BEP develops paradigms, concepts, and techniques to complement existing approaches and develop new products. This workshop will give an introduction to BEP, provide an overview of event processing foundations, survey current WebSphere BEP products and strategy, and raise challenges for research and market. Attendees will learn why it is critical to combine BEP with service-oriented architecture principles, when to resort to BEP, and the open challenges that exist.

Exploring the full capabilities of Project Zero (IBM WebSphere sMash) Location: Thornhill Co-Chairs: Todd Kaplinger and Brandon Smith, IBM Corp.

Developing Web applications for the enterprise is one of the key strengths of the WebSphere portfolio, but the learning curve for JEE applications is too steep for a certain class of developers. This workshop will focus on how Project Zero can be used to build Java and PHP-based applications using a lightweight application-centric development platform without the steep learning curve.

Fourth Workshop on Challenges for Parallel Computing Location: Richmond A/B Co-Chairs: Priya Unnikrishnan, Kit Barton, and Guansong Zhang, IBM Canada Ltd. Speakers: Xavier Martorell, Technical University of Catalunya (UPC); Jim Xia and Michael Wong, IBM Canada Ltd.; Yunlian Jiang, The College of William and Mary; Cheng Ding, University of Rochester

Parallel computing has evolved significantly over the past few years and is now being used in many non-traditional environments. This workshop will explore some of the current challenges facing parallel computing. Members of the parallel computing user community will present problems they have found in new parallel computing systems. Representatives from research and industry will also discuss the challenges they have identified and the work they are pursuing to address these challenges.

Requirements-Driven Business Process Modelling and Performance Management Location: Richmond C Co-Chairs: Daniel Amyot and Liam Peyton, University of Ottawa; Alireza Pourshahid, Cognos, an IBM company; Eric Yu, University of Toronto Speakers: Alexei Lapouchnian, University of Toronto

Business processes and their management have always introduced difficult challenges for organizations. This workshop will present how recent notations and tools developed in the requirements engineering community can address these challenges. Languages such as i* and ITU-T's User Requirements Notation, supported by Eclipse-based editors such as jUCMNav and OpenOME, can help model, analyze, configure, and adapt business goals and processes/workflows. In addition, it will be shown how integration between these tools and IBM Cognos® 8 (business intelligence tool) and DOORS (requirements management) can help monitor and align processes and goals, as well as manage compliance with legislation, even as processes and laws evolve.

Intro (0.10mb) Daniel Amyot (0.72mb) Alireza Pourshahid (2.83mb) Eric Yu, Jennifer Harkoff, Reza Samavi (1.90mb) Liam Peyton (0.85mb) Alexei Lapouchnian (1.03mb)

Fifth International Workshop on Engineering Autonomic Software Systems - Day 2 Location: Vaughn West Co-Chairs: Paul Ward, University of Waterloo; Marin Litoiu, York University

The goal of this workshop is to bring together researchers and practitioners who investigate concepts, methods, techniques, and tools to engineer autonomic software systems. Autonomic computing aims to reduce the ever-increasing complexity of managing software/system components. This workshop will focus on issues critical for the proliferation of autonomic applications. Selected topics include requirements engineering for autonomic software, analysis and evolution of autonomic architectures and systems, systems' monitoring, performance modeling and optimization.

Women in Technology: Attract, Retain, Excel Location: York B Co-Chairs: Joanna Ng, Karen Hunt and Kelly Ryan, IBM Canada Ltd.

This workshop covers three major themes of importance to women working in or studying computer science and IT: Attract: Why does the IT industry need more women, and how can we attract them to the field? Retain: What fosters fulfillment in females who pursue a career in IT? Excel: How can women in IT reach their full potential?

Through breakout discussion groups about each theme and keynote speeches, attendees will discuss and learn about the challenges posed to women in IT and avenues for success. The breakout groups' discussions will culminate in a Best Practices report. By bringing together women from various career stages in industry and academia, the workshop will provide a great networking opportunity for attendees. On the day of the workshop, you will have an opportunity to decide which breakout group - Attract, Retain, or Excel - you would like to join. Posters including each breakout group's discussion questions will be posted near the York B room on the day of the workshop to help you decide.

Full Day Session - 10 a.m. Hands-On: CASCON High School Programming Competition 2008 & Teachers' Workshops Location: Vaughn East Co-Chairs: Brenda Chow, IBM Canada Ltd; Lauren Gordon, Queen's University Speakers: Jennifer Schachter, Deirdre Athaide, and Tim DeBoer, IBM Canada Ltd.; Lisa Rubini, Toronto District School Board

The CASCON High School Programming Competition is a Java-based challenge that encourages high school students to discover the fun in computer science and information technology. Now in its fourth year, the CASCON High School Programming Competition attracts many schools from across the Greater Toronto region. In addition to the competition, two workshops will be held concurrently for high school computer science educators. Attendees will develop and implement interesting exercises that they can take with them for classroom use.

Thursday, October 30 Workshops

Afternoon Session - 1 p.m.

Hands-On: Business and IT Collaboration using WebSphere Business Modeler & WebSphere Integration Developer Location: Markham A Co-Chairs: Rick Goldberg and Diana Lau, IBM Canada Ltd.

An overarching desire in the industry today is for the business to take back control of their projects from the IT teams. One powerful methodology for achieving this can be the use of business process management (BPM) techniques to foster better collaboration between these two groups. This workshop will introduce WebSphere Integration Developer and WebSphere Business Modeler to demonstrate how they can be used together for a business analyst and developer to jointly create a business-level application.

Hands-On: Service Component Architecture - A Simpler Architecture for SOAs Location: Markham A Chair: Doug Tidwell, IBM Corp.

SCA brings a modern approach to building composite applications by adding dependency injection to the development process. By moving more details from the application to the middleware, changes to the infrastructure do not involve the code of the application at all. This workshop will start with a discussion of SCA's key principles, and then illustrate them with a series of hands-on exercises.

Hands-On: Developing FLEX applications Location: Vaughn East Co-Chairs: Mihnea Galeteanu, Polina Gohshtein, and Irum Godil, IBM Canada Ltd.

Flex is becoming more and more mainstream when it comes to developing applications for the Web and the desktop. This workshop will introduce the audience to Flex as a means to achieve improved , agility, and reuse of desktop development skills on the Web.

Hands-On: Rapid web development with Grails and Groovy Location: Aurora Chair: Aron Wallaker, IBM Canada Ltd.

Grails is a Web development framework that builds on existing Java Web and database technologies and the Groovy scripting language to create a rapid development platform for Web applications. Following a general introduction to Groovy and Grails, the workshop will focus on using Grails to rapidly develop Web applications with database functionality. Through a series of labs, participants will gain hands-on experience creating a Web application with Groovy and Grails.

Technology Curriculum for the Information Society Location: Richmond E Co-Chairs: Kelly Lyons and Eric Yu, University of Toronto; Ross McKegney, Fadow; Ernst Grundke, Dalhousie University; Barry Lunt, Brigham Young University Speakers: Richard McDonald, IBM Canada Ltd.; Jennifer Laidlaw, Ontario Public Service; Radu Campeanu, York University

Many university programs are emerging that aim to bring balance and perspective from a broader social context to technology education for knowledge workers in the information society. Through a panel discussion of educators, employers, and students, this workshop will present and critique curricula at various stages of implementation from different institutions and academic faculties. Participants will receive descriptions of a variety of curricula prior to the workshop. At the workshop, brief presentations will be given outlining each of the curricula, then a panel of educators, students, and employers will provide feedback and critique. Participants will also engage in the discussion.

An Open Roundtable on "Technopreneurship" - The new corporate skills imperative or Computer science is dead. Long live computer science. Location: York C Co-Chairs: Ray Cao, Impact Canada & GEW Canada; Chris Paterson, IBM Canada

The Technology enterprise of the day - hardware or software, large or small - and going forward will produce ideas in the form of value-added services. Everywhere, multidisciplinary technopreneurs are needed but nowhere are they developed. A transformation is required in terms of how we approach research, education and corporate training. In effect, for technopreneurs to flourish, our traditional approach to research and education that focuses on disciplinary expertise must decline - including computer science. We want to know what you think. Join senior representatives from SSHRC, the Ontario government, UW, IBM and others in an open roundtable. Help define the technopreneurship skills challenge for further debate by leaders in government, research, education and business.

SOA Research Challenges: Current Progress and Future Challenges Location: Richmond D Co-Chairs: Dennis Smith and Grace Lewis, Software Engineering Institute; Kostas Kontogiannis, University of Waterloo; Marin Litoiu, York University Speakers: David Ing and Chris Brealey, IBM Canada Ltd.; Hausi Muller, University of Victoria; Scott Tilley, Florida Institute of Technology

This workshop will identify critical SOA research challenges that need to be addressed by the research community for SOA to fulfill its promise. The workshop will present a taxonomy of SOA research issues that will be used to frame the rest of the discussion. The workshop will focus on research needs that are currently causing the greatest pain for SOA practitioners. Topics will include "hard problems", tooling issues, governance challenges, monitoring through the life cycle, and the longer-term evolution of SOA. The workshop will include presentations by practitioners and the research community in addressing critical unmet issues.

Download PowerPoint© slides (3.05mb)

User interfaces for visualizing complex data Location: Richmond A/B Co-Chairs: Igor Jurisica, Ontario Cancer Institute; Michael McGuffin, ETS Montreal Speakers: Gord Davison, IBM Canada Ltd.; Douglas J. Moseley, Princess Margaret Hospital; Thomas Kapler, Oculus Info Inc; Kevin Brown, D. Otasek, and A. Muhammad, Ontario Cancer Institute

This workshop will provide glimpses into current practices and state-of-the-art developments in user interfaces for visualizing, analyzing, and interpreting complex data. It will review the challenges in specific application domains, as well as available graphical user interface approaches, with the goal of identifying directions for future research and applications. Novel and special strategies for visualization, such as alternate representations, multiple views, use of color, layers, animation, zooming, and interaction techniques will be discussed. Application areas will include 2D and 3D visualization, business intelligence, databases, bioinformatics, biomedical visualization, the Web, collaborative visualization, and others.

Cloud Computing Location: Vaughn West Co-Chairs: Johnny Wong, University of Waterloo; Marin Litoiu, York University; Gabriel Iszlai, IBM Canada - Toronto Lab

This workshop will focus on several complementary areas of research that enable cloud computing: (a) Software as a Service (SaaS); (b) platform virtualization technologies; and (c) the business model for the cloud. Virtualization and SaaS hide the highly inflexible and technology-specific nature of the underlying computing technology, thereby greatly simplifying the tasks of software development and evolution as well as its provisioning and operation. However, to be successful, cloud computing should prove that it is a sustainable and profitable business model.

Web based tools: Challenges and opportunities Location: Richmond C Co-Chairs: Leho Nigul, Joanna Ng, and Elena Litani, IBM Canada Ltd.

This workshop will give a brief introduction to rich Internet application (RIA) technology focusing on Web-based tools. Discussions will include technologies and reference architectures for designing and implementing Web-based tools, and, in particular, challenges and opportunities in modeling and application development tools will be highlighted.

Seventh Workshop on Compiler-Driven Performance Location: Thornhill Chair: Greg Steffan, University of Toronto Speakers: Clark Verbrugge, McGill University; Kirk Kelsey and Ian Christopher, University of Rochester; Pramod Ramarao, Nikola Grcevski, Amy Wang, Kenneth Ma, Marius Pirvu, Ian Mcintosh, and Ivan Sham, IBM Canada Ltd.; Xipeng Shen, The College of William and Mary; Borys J. Bradel, University of Toronto; Christopher Kumar Anand, McMaster University

The compiler-driven performance workshop will consist of the presentation of reports on research progress at various academic and industrial sites across Canada and in the United States. Topics discussed in the workshop will include, but will not be limited to: innovative analysis, transformation, and optimization techniques; languages, compilers, and optimization techniques for multicore processors and other parallel architectures; compiling for streaming or heterogeneous hardware; dynamic compilation for high-performance and real-time environments; compilation techniques for reducing power; and tools and infrastructure for compiler research.

Registration opens at 8:00 am.

Time Scheduled Events / Activities

9:00 a.m. - 12:00 p.m. Workshops. See 'Workshops' section for details.

1:00 p.m. - 4:45 p.m. Workshops. See 'Workshops' section for details.

5:00 p.m. - 7:00 p.m. CASCON 2008 Technology Showcase Reception. Open to all CASCON participants. Location: Grand York Ballroom A

Location: Grand Richmond Ballroom

Joint Topic: Medical Research Breakthroughs - Enabled by Technology

Current research into disease prevention and treatment involves innovative methods of investigation using tools, which are enabling faster and more accurate findings. In these two talks, followed by a question and answer period, two leading Canadian scientists discuss their work.

8:30 a.m. - Keynote presentation:

Igor Jurisica Senior Scientist, Ontario Cancer Institute Topic: Toward an intelligent molecular medicine: Fusion of obtrusion, illusion, confusion and integrative computational biology

Abstract Despite the introduction of diverse and powerful chemotherapeutic agents over the past two decades, many cancers still carry devastating mortality rates. The accumulation of data from systematic high-throughput experiments has brought the potential to construct models of how biological systems work at the cell or whole organism level. How to integrate multiple information levels to achieve this task is not trivial, and Dr. Jurisica will discuss some of the possible approaches, as well as focus on the high resolution, interactive visualization of large networks of interacting proteins.

John MacDonald Scientific Director, Robarts Research Institute Topic: Investigation into the cellular basis of neurological conditions such as stroke and Alzheimer's disease

Abstract Strokes are characterized by a delayed loss of brain function that occurs days or even weeks after the initial event, likely because of the delayed death of brain cells. Until recently, the entry of toxic levels of calcium (Ca2+) was accepted as the major trigger for delayed cell death, but all potential attempts to design a new therapeutic drug based on this information have failed. Dr. MacDonald and his colleagues discovered an alternative mechanism whereby cell death is caused in stroke. In this talk, Dr. MacDonald will briefly discuss how Robarts laboratories use "off the shelf" software to acquire and analyze data, and how better tools may be able to help researchers find ways to prevent or predict the indicators of stroke.

12:00 p.m. - Frontiers of Software Practice Speaker Joanna Ng Program Director, CAS Toronto, IBM Topic: CAS: From Research Program to Research Commercialization -- The Complete Cycle Location: York Ballroom C

Abstract Joanna Ng will discuss how CAS Toronto's collaborative research program will focus on five technology themes. An overview of the five technology themes and their integration will also be covered. As of 2008, CAS has grown beyond its research program and has extended to include a model for research commercialization. Ng will discuss the various benefits of this model.

5:00 p.m. - Frontiers of Software Practice Speaker John Ponzo Distinguished Engineer, Research Client Technologies, IBM T.J. Watson Research Center Topic: The Extreme Makeover of the Web Experience and the Transformation of the Web Application Architecture to Enable it Location: York Ballroom B

Abstract John Ponzo will share his insights into the major transformations that have occurred in the evolution of the web and its user experiences: how end users have more control in the interface, how web interactions are seamless across various contexts, how resulting content is more relevant and personalized, and more. Ponzo will also cover the programming models and architectures of the web applications that enable such "extreme makeovers" of the web experience.

Registration and Login

The registration process has two steps: Step 1. Email address validation. Step 2. Creating your account.

Already registered? Sign in to access your workshop schedule and account information.

There is no charge to attend CASCON! Sign in to CASCON 2008 to access the workshop and exhibit submission site.

CASCON 2008 Submissions

Workshops CASCON 2008 Workshops provide a forum to present, discuss, and debate issues, problems, ideas, emerging technologies, work-in-progress, or directions on topics of interest listed in the Call for Papers. Interdisciplinary workshops are particularly encouraged. The workshop format may include position papers, expert panels, hands-on exercises, and discussions. The Workshop Committee will review each workshop proposal. Acceptance will be based on an evaluation of the workshop's potential for generating useful results, the timeliness and expected interest in the topic, and the organizers' ability to lead a successful workshop. Workshop proceedings will be published on the CASCON website.

Exhibits We invite researchers and developers from IBM, universities, government agencies, and our industry partners to present their latest technological undertakings in CASCON's Technology Showcase, an interactive forum which allows researchers and developers to meet and interact in a friendly atmosphere. CASCON will provide to exhibitors, at no cost, demo booth, signage, and power.

Registration help If you encounter problems during the registration process, please email [email protected] with a description that includes any error message you received.

Hotel information

CASCON 2008 venue Sheraton Parkway Toronto North Hotel and Convention Centre 600 Highway 7 East (at Highways 404 & 7) Richmond Hill, Ontario L4B 1B2 Canada Phone: (905) 881 2121 Or 1-800-668-0101

Room rates are $139 at the Sheraton Parkway Toronto North Hotel and Convention Centre. Room rates are $109 at the Best Western Parkway Hotel which is attached to the Sheraton.

Other Accommodations

Hilton Suites Toronto/Markham Conference Center & Spa 8500 Warden Avenue, Markham, ON CA L6G 1A5 Phone: (905) 470-8500

Comfort Inn 8330 Woodbine Ave Phone: (905) 477-6077

Howard Johnson Toronto-Markham 555 Cochrane Dr (Woodbine south of Hwy 7) Phone: (905) 479-5000

Monte Carlo Inn 8900 Woodbine Ave Phone: (905) 513-8100

Hilton Garden Inn 30 Commerce Valley Drive East, Markham Phone: (905) 709-8008

For information about bus transit in Markham, visit http://www.yorkregiontransit.com/.

Sponsors

IBM Centers for Advanced Studies http://ibm.com/ibm/cas

IBM Toronto Software Lab http://www-03.ibm.com/software/ca/en/torontolab/

Ontario Centres of Excellence http://www.oce-ontario.org/default.aspx

In partnership with

National Research Council Canada http://www.nrc-cnrc.gc.ca/

Contacts

If you have any questions or require more information about CASCON 2008, please forward your inquiries to [email protected].

High School Programming Contest Central

Related Information: CASCON Programming Competition 2007. A rundown of last year's competition, including winners and participating schools.

This year your Java challenge is "I Have a Code"! "I Have a Code" is a gaming simulator challenge, created by university students here at IBM. "I Have a Code" is loosely based on the trials and tribulations of cell colonies in the human body. Code super-cold cells to work in unison and form a colony that will grow, expand, split, specialize, and attack other cell colonies. You will be strategizing and programming in Java to compete for the highest score.

You will be competing in teams of two, with teams coming from local Ontario high schools. The "I Have a Code" Challenge allows direct, real-time competition between teams. After three rounds of a round-robin tournament, the top winners will be determined during a tournament consisting of several rounds and eliminations.

Details are as follows:

Date: Wednesday, October 29, 2008

Place: Sheraton Parkway Toronto North Hotel and Convention Centre, 600 Highway 7 East (at Highways 404 & 7), Richmond Hill, Ontario

Time: 8:30 am to 4:00 pm.

What To Bring: One piece of printed resource material like a textbook. NO PRINTED OR WRITTEN PAPERS ARE ALLOWED.

Breakfast and lunch will be provided at no cost.

As for "I Have a Code", there are a couple of things you will need to do to get it running on your computer.

Instructions I. Download a Java Runtime Environment (JRE). Version 6.0 or higher is recommended to run "I Have a Code" (if you see 1.6.0 on the "I Have a Code" site, don't worry, it's the same thing).

Get JRE 6.0

II. Download Eclipse. Version 3.4 or higher is recommended. Extract into an accessible folder (like on your desktop).

Get Eclipse IDE

III. Download "I Have a Code". The latest version. Installation instructions are on the site.

Get "I Have a Code"

IV. Read the Explanation. It's useful, so take advantage of it! It's quite long, but it really helps to clear up the game. (Right-click and choose 'Save Target As...')

Download manual

Prepare as much as possible beforehand. You will be permitted one book or other published resource but no other programs or written material of any kind will be permitted. There is a multi-part scoring system in place (found in the "I Have a Code" manual). When developing your strategy, make sure you take this into account!

If installed properly, a running game should look like this:

The Start of an "I Have a Code" Game

And as the game progresses...

The Middle of an "I Have a Code" Game

Number of cells, types of cells, and elimination of other cells contribute to your final score. Do what you will, but the colony with the highest score after a full day (or approximately a minute in real time), shall be declared the winner.

Your next steps A guide to information about registration. Competition Itinerary A schedule of what will happen on the competition day. Subscribe to e-mail notification Make sure you register to receive registration information and competition updates! CASCON Programming Competition 2008 The student's competition page. Pass this link on to any students interested in participating in the competition.

Highlights from CASCON 2008

The IBM Centers for Advanced Studies’ (CAS) CAS conference (CASCON) may have started out relatively small, but its growth has been remarkable. The first conference, held in 1991, was attended by 650 people. It featured 32 demonstrations and 26 paper presentations. Just two years later, the number of demonstrations at CASCON had grown to 70. By 2006, approximately 110 demos were presented to more than 1500 CASCON attendees.

Since its inception, the conference has focused on computer science and software engineering with the intent of providing academics and industry members the opportunity to explore new research and technology.

CASCON 2008, held from October 27-30 at the Sheraton Parkway Toronto North Hotel Suites & Conference Centre, was host to over 1200 attendees. It boasted an impressive portfolio of events, including 39 workshops, 23 technical paper presentations, keynote speakers from a variety of backgrounds, and 70 exhibits and posters in the Technology Showcase.

While the conference didn’t officially kick off until the morning of Tuesday, October 28, a number of popular workshops and the CASCON 2008 Technology Showcase Reception were held on Monday, October 27.

Tuesday’s opening remarks by the Head of CAS Toronto, Joanna Ng, set the tone for the rest of the conference. Following the Best Paper Awards presentation, the keynote presenters, Dr. Igor Jurisica and Dr. John MacDonald, spoke about how innovative technology has enabled breakthroughs in cancer and stroke research, respectively.

This was just the beginning of the exciting and informative events at CASCON. Highlights from the four-day conference included:

• Thirty-nine workshops gave attendees an opportunity to learn about and discuss up-and-coming technologies. Of these workshops, thirteen were hands-on, allowing attendees to use hot technologies like Ajax. CASCON workshops tend to be a barometer for emerging technologies and topics. Twelve workshops were categorized as Enterprise Web 2.0, and eight were Service Oriented Architecture 2.0.

• The Best Student Paper Award was given to former CAS student and now IBM employee Andriy Miranskyy. Co-authors were Nazim Madhavji and Matt Davison from University of Western Ontario, Mechelle Gittens from the University of the West Indies, and IBM developers Mark Wilding, Dave Godwin, and Colin Taylor for their work “SIFT: A Scalable Iterative-Unfolding Technique for Filtering Execution Traces.”

• The Best Paper Award was granted to Wendy Powley and Pat Martin from Queen’s University, and Paul Bird of the IBM Canada Lab, Toronto for their paper “DBMS Workload Control Using Throttling: Experimental Insights.”

• A Women in Technology luncheon and workshop focused on the themes “Attract,” “Retain,” and “Excel.” Keynote speakers were Dr. Pat Selinger, retired IBM Fellow and VP of Database technology and current advisory board member of Women in Technology International, and Dr. Margaret-Anne “Peggy” Storey, an associate professor at University of Victoria and a CAS Visiting Scientist.

• Keynote and Frontiers of Software Practice speakers’ topics were diverse. In addition to Tuesday’s speeches from Dr. Jurisica and Dr. MacDonald, Joanna Ng spoke about recent changes at CAS Toronto, including the five technology themes that are the focus of its collaborative research program. Tuesday evening, John Ponzo of the IBM T.J. Watson Research Center spoke about changes that have occurred in the web experience and the transformation of the web application architecture to enable them.

• Wednesday morning, Google’s Alex Nicolaou was on hand to speak about the company’s web browser, Chrome. Wednesday afternoon, Brent Hailpern, also of IBM T.J. Watson Research Center, spoke about technology and social trends in software development.

• Thursday, Paul Kedrosky of the Kauffman Foundation spoke about commercialization on a dollar a day. Later, Nagui Halim, also of IBM T.J. Watson Research Center, gave a speech about Data Streaming.

• The Fourth Annual CASCON High School Programming Contest took place on Wednesday, October 29. Working with Tim DeBoer, RAD Release Architect in the WebSphere Tools team, CAS Summer High School Internship students designed this year's Java/Eclipse-based challenge. Sixty-eight (Grade 10 - 12) students from twenty-four high schools across the GTA competed in the “I Have a Code” competition. While the student competitors worked on the programming challenge, their teachers participated in a full-day workshop. As in past years, this education session for the teachers provided new material and teaching materials for them to use in the classroom, and provided an opportunity for them to share ideas and insights on how to make computer science more interesting to students.

CASCON 2008 continued the conference’s tradition of showcasing new research and technology. It would not have been possible without the academics, IBM employees, and other industry members who submitted content for consideration. CAS would like to thank everyone who participated in and volunteered for CASCON.

Videos of the keynote speeches are currently available for viewing online. Workshop, speaker, and paper presentation abstracts are currently available at the CASCON 2008 website, and will later be archived at https://www.ibm.com/ibm/cas/archives/index.shtml.

CASCON 2008 Videos

Tuesday, October 28th

Opening Remarks Joanna Ng & Christian Couturier

Best Paper Awards Dr. Mark Vigder

Keynotes John MacDonald & Igor Jurisica

Frontiers of Software Practice Joanna Ng, IBM Centers for Advanced Studies

Wednesday, October 29th

'Pat Selinger PhD Fellowship' award presentation

Keynote Alex Nicolaou, Google Inc.

Frontiers of Software Practice Brent Hailpern, IBM T.J. Watson Research Center

Thursday, October 30th

Special Presentation Levon Stepanian, IBM Canada Inc.

Keynote Paul Kedrosky, Kauffman Foundation

Tuesday, October 28 paper presentations

Session 1: Databases

DBMS Workload Control Using Throttling: Experimental Insights
Best Paper
Wendy Powley and Pat Martin, Queen's University; Paul Bird, IBM Toronto Lab

Today's Database Management Systems (DBMSs) are required to handle diverse, mixed workloads and to provide differentiated levels of service to ensure that critical work takes priority. In order to meet these needs it is necessary for a DBMS to have control over the workload executing in the system. Lower priority workloads should be limited to allow higher priority workloads to complete in a timely fashion. In this paper we examine query throttling techniques as a method of workload control. In our approach, a workload class may be slowed down during execution, thus releasing system resources that can be used by higher priority workloads. We examine two methods of throttling: constant throttling throughout query execution, and query interruption, in which a query is paused for a period of time. A set of experiments using PostgreSQL 8.1 provides insights regarding the performance of these different throttling techniques under different workload conditions and how they compare to using operating system process priority control as a throttling mechanism.

Efficient XPath Query Processing P. Mark Pettovello and Farshad Fotouhi, Wayne State University

We propose improved XPath query processing algorithms for the MTree navigational XML database index on schema-less XML documents. Our algorithms efficiently resolve element name specific XPath navigational queries without a need for sorting or for qualified name filtering on intermediate sequences. The optimization methods are applicable for all axes but are presented for the four major XPath axes: descendant, ancestor, following and preceding. Experimental results are included that show substantial performance improvements over other methods.

Autonomic Tuning Expert - A Framework for Best-Practice Oriented Autonomic Database Tuning David Wiese and Gennadi Rabinovitch, Friedrich-Schiller-University Jena; Michael Reichert and Stephan Arenswald, IBM Deutschland Research & Development GmbH

A prototype called Autonomic Tuning Expert (ATE) was developed, following the ideas of autonomic computing. It is a feedback control loop based infrastructure for automating typical tuning tasks requiring minimal human intervention. ATE enables DBAs to store, maintain, exchange, and adapt their best-practice tuning methods and correlate them with problem-indicating events.

Session 2: Web Applications

Synchronized Tag Clouds for Exploring Semi-Structured Clinical Trial Data Maria-Elena Hernandez, Sean M. Falconer and Margaret-Anne Storey, University of Victoria; Simona Carini and Ida Sim, University of California San Francisco

Searching and comparing information from semi-structured repositories is an important, but cognitively complex activity for internet users. The typical web interface displays a list of results as a textual list which is limited in helping the user compare or gain an overview of the results from a series of iterative queries. In this paper, we propose a new interactive, lightweight technique that uses multiple synchronized tag clouds to support iterative visual analysis and filtering of query results. Although tag clouds are frequently available in web interfaces, they are typically used for providing an overview of key terms in a set of results, but thus far have not been used for presenting semi-structured information to support iterative queries.

We evaluated our proposed design in a user study that presents typical search and comparison scenarios to users trying to understand heterogeneous clinical trials from a leading repository of scientific information. The study gave us valuable insights regarding the challenges that semi-structured data collections pose, and indicated that our design may ease cognitively demanding browsing activities of semi-structured information.

Personalized Recommendation of Related Content Based on Automatic Metadata Extraction Andreas Nauerz, IBM Research and Development; Fedor Bakalov and Birgitta König-Ries, University of Jena; Martin Welsch, IBM Germany Research and Development

In order to efficiently use information, users often need access to additional background information. This additional information might be stored at various places, such as news websites, company directories, geographic information systems, etc. Oftentimes, in order to access these different pieces of information, the user has to launch new browser windows and direct them to appropriate resources. In today's Web 2.0, the problem of accessing background information becomes even more prominent: due to the large number of different users contributing, Web 2.0 sites grow quickly and, most often, in a more uncoordinated way regarding, e.g., structure and vocabulary used, than centrally controlled sites. In such an environment, finding relevant information can become a tedious task.

This paper proposes a framework allowing for automated, user-specific annotation of content which enables provisioning of related information. The approach is being implemented within IBM's WebSphere Portal.

Online Stroke Modeling for Handwriting Recognition Oleg Golubitsky and Stephen M. Watt, University of Western Ontario

The process of recognizing individual hand-written characters is one of classifying curves. Typically, handwriting recognition systems, even "online" systems, require entire characters to be completed before recognition is attempted. This paper presents a better approach for real-time recognition: certain characteristics of a curve can be computed as the curve is being written, and these characteristics are used to classify the character in constant time when the pen is lifted. We adapt an earlier approach of representing curves in a functional basis and reduce real-time stroke modelling to the Hausdorff moment problem.


Session 3: Software Engineering I

Flexible Verification of User-Defined Semantic Constraints in Modelling Tools Daniel Amyot and Jun Biao Yan, University of Ottawa

Many modelling tools embed verification rules that are checked against user-defined models to ensure they satisfy the static semantic constraints of the modelling language. However, there are many other contexts where required constraints vary with the intended purpose of the model, and not just the modelling language used. In this paper, we propose a flexible and practical approach for users to define, select, store, group, exchange, enable, and verify custom semantic constraints expressed in the Object Constraint Language. We illustrate the benefits of this approach with extensions to an Eclipse-based modelling tool, called jUCMNav, and applications to various contexts such as style compliance, analysis, and transformations that involve chains of tools. We believe this approach to be easily adaptable to other Eclipse-based modelling tools, which could then enjoy similar benefits.

Towards a UML Virtual Machine: Implementing an Interpreter for UML 2 Actions and Activities Michelle L. Crane and Juergen Dingel, Queen's University

An interpreter for UML 2 actions and activities is presented. It is based on two novel features in UML 2: the three-layer semantics architecture and the new token offer semantics for activities, which is intended to generalize the token flow semantics of Petri nets. The interpreter offers an array of analysis capabilities, ranging from random execution to reachability properties and assertion and deadlock checking. The design of the interpreter makes it suitable as the basis for a more comprehensive UML virtual machine.

An Empirical Study of the Design and Implementation of Object Equality in Java Chandan R. Rupakheti and Daqing Hou, Clarkson University

Applications which are built on top of an existing design and implementation are in general expected to collaborate well with the existing design and respect all of its intent. Failure in achieving this may result in buggy, fragile, and less maintainable code in the applications. When the dependence on an existing design becomes more wide-spread, this requirement on proper extension becomes even more critical. As an instance of this general problem, the object equivalence design in Java as well as its extensions is examined in detail and empirically. By examining how object equivalence is extended in a large amount of Java code, a set of typical problems associated with extending the object equivalence design are detected and their root causes analyzed. A set of design guidelines for object equivalence are proposed, which, if followed, will help programmers systematically design and evolve rather than hack on a solution. Examples are drawn from a case study of multiple industrial and open source projects to illustrate the identified problems and demonstrate how the proposed guidelines help solve these problems.


Wednesday, October 29 paper presentations

Session 4: Systems I

Automating SLA Modeling Tony Chau, Vinod Muthusamy, and Hans-Arno Jacobsen, University of Toronto; Elena Litani, Allen Chan, and Phil Coulthard, IBM Canada Ltd.

Service Level Agreements (SLAs) define the level of service that a service provider must deliver. An SLA forms a contract between service provider and consumer, and includes appropriate actions to be taken upon violation of the contractual obligations. However, the process of implementing an SLA using the existing IT infrastructure is difficult, as it requires a lot of manual effort to translate an SLA into code, model it with the given programming language, and ensure that the required support is in place for efficient monitoring and tracking of the SLAs.

In this paper, we present a solution for modeling an SLA contract. It is designed to be configurable, reusable, extensible and inheritable thus providing great flexibility to construct complex SLAs. We also introduce an algorithmic generation pattern to create the necessary artifacts to implement an SLA presented in this paper. The resulting artifacts automatically monitor a business process and evaluate whether the SLA is violated during runtime execution. The proposed approach is designed to require minimal human intervention.

Capacity Planning for Service-Oriented Architectures Michael Smit, Andrew Nisbet, and Eleni Stroulia, University of Alberta; Andrew Edgar and Gabriel Iszlai, IBM Toronto Lab; Marin Litoiu, York University

Service-oriented architectures (SOAs) are being increasingly adopted for the development of distributed applications that involve multiple partner organizations. The main challenge in configuring such applications - whether autonomously or manually - is meeting the service quality expected by the consumers.

In this paper, we describe a methodology and corresponding tool implementation for estimating the capacity of alternative configurations of complex service-oriented applications. We use a sophisticated enterprise application with many possible configurations as our test application. The current tool prototype simulates the behavior of the application for a given configuration on an existing network topology. This simulation is relatively coarse-grained, but is capable of tracking several performance indicators. We evaluate this simulation output against actual performance data.

A Reliability Estimation for Large Distributed Software Systems Alberto Avritzer and Flávio P. Duarte, Siemens Corporate Research; Rosa Maria Meri Leão and Edmundo de Souza e Silva, Universidade Federal do Rio de Janeiro, Brazil; Michal Cohen and David Costello, Siemens Transportation Systems

In this paper we present our experience estimating the reliability of a large distributed system composed of several hundred points of presence, a system whose reliability we were required by contract to estimate. We present a simple approach that accurately approximates the reliability of this very large system.


Session 5: Software Engineering II

A Methodological Leg to Stand On: Discoveries Using Grounded Theory to Study Software Development Steve Adolph, Wendy Hall, and Philippe Kruchten, University of British Columbia

We are engaged in qualitative research projects to understand how people manage the process of software development. This study uses grounded theory as its method of inquiry and we have learned much about what is and what is not a grounded theory. We, like many researchers, have claimed to follow grounded theory methods and even to produce a substantive theory. In reality, we often only borrow a few grounded theory practices to categorize our data. At best, making claims about methods which cannot be substantiated creates theories that can be challenged, and at worst we may be claiming theory status for what constitutes journalistic description illustrated by anecdotes. Using such findings as evidence can have serious consequences for our industry if the purpose of the research is to inform software development policy. Yet, if we believe agile is about people, then the grounded theory method offers us an opportunity to observe and understand the behaviour of people which affects our understanding of software engineering phenomena. This paper presents lessons learned about using grounded theory so that both researchers and reviewers can critically evaluate investigators' claims to be producing grounded theory as opposed to what is sometimes referred to as journalistic description.

A Taxonomy of Software Types to Facilitate Search and Evidence-Based Software Engineering Andrew Forward and Timothy C. Lethbridge, University of Ottawa

Empirical software research could be improved if there was a systematic way to identify the types of software for which empirical evidence applies. This is because results are unlikely to be globally applicable, but are more likely to apply only in certain contexts such as the type of software on which the evidence has been tested. We present a software taxonomy that should help researchers to apply their research systematically to particular types of software. The taxonomy was generated using existing partial taxonomies and input from survey participants. If a taxonomy such as ours gains acceptance, it will facilitate comparison and appropriate application of research. In the paper, we present the benefits of such a taxonomy, the process we used to develop it, and the taxonomy itself.

Building Highly-Interactive, Data-Intensive, REST Applications: The Invenio Experience Michelle Annett and Eleni Stroulia, University of Alberta

With the explosion of Web 2.0 ideas and tools, the importance of on-line collaboration and information sharing has never been more prevalent. Many companies have acknowledged this fact by adopting analytics tools to mine the on-line sentiment regarding their products and services and to effectively market these products and services. The music industry, in spite of the multitude of means for media sharing and buying online, does not appear to have gone far enough in exploiting the web for market analysis and communication. In this paper, we discuss Invenio, a geovisualization and trend-analysis REST-based Rich Internet Application for the music industry. Invenio utilizes a variety of different technologies (Yahoo! Maps, Amazon Associates Web Service, REST, and the Flex framework) to deliver a dynamic, innovative service and, in this paper, it is examined as a case study of the relevant technologies and a set of "good practices" for developing such applications.


Session 6: Compilers

OpenMP Tasks in IBM XL compilers Xavier Teruel, Barcelona Supercomputing Center - Universitat Politècnica de Catalunya; Priya Unnikrishnan, IBM Toronto Laboratory; Xavier Martorell and Eduard Ayguadé, Barcelona Supercomputing Center - Universitat Politècnica de Catalunya; Raul Silvera, Guansong Zhang and Ettore Tiotto, IBM Toronto Laboratory

Tasking is the most significant feature of the OpenMP 3.0 standard. It was introduced to handle unstructured parallelism and broaden the range of applications that can be parallelized by OpenMP. This paper presents the design and implementation of the task model in the IBM XL parallelizing compilers. The task construct is significantly different from other OpenMP constructs and we discuss here some of the unique challenges in implementing the task construct and its associated synchronization constructs. We also present a performance evaluation of our implementation on a set of benchmarks and applications. We identify limitations in the current implementation and propose solutions for further improvement.

High Performance XML Parsing Using Parallel Bit Stream Technology Robert D. Cameron, Kenneth S. Herdy, and Dan Lin, Simon Fraser University

Parabix (parallel bit streams for XML) is an open-source XML parser that employs the SIMD (single-instruction multiple-data) capabilities of modern-day commodity processors to deliver dramatic performance improvements over traditional byte-at-a-time parsing technology. Byte-oriented character data is first transformed to a set of 8 parallel bit streams, each stream comprising one bit per character code unit. Validation, transcoding and lexical item stream formation are all then carried out in parallel using bitwise logic and shifting operations. Byte-at-a-time scanning loops in the parser are replaced by bit scan loops that can advance by as many as 64 positions with a single instruction.
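The transposition step described in this abstract can be illustrated with a minimal Python sketch. It is not Parabix's SIMD implementation: Python integers stand in for bit vectors, and the function names `transpose_to_bit_streams`, `match_byte`, and `positions` are hypothetical. The sketch shows how 8 bit streams are formed from byte data and how every occurrence of one character (here `<`) can be located with bitwise logic over all positions at once.

```python
def transpose_to_bit_streams(data: bytes) -> list[int]:
    """Return 8 integers; bit i of streams[k] is bit k (0 = least significant) of data[i]."""
    streams = [0] * 8
    for pos, byte in enumerate(data):
        for k in range(8):
            if (byte >> k) & 1:
                streams[k] |= 1 << pos
    return streams


def match_byte(streams: list[int], length: int, target: int) -> int:
    """Mask with bit i set where data[i] == target, using only bitwise logic."""
    all_ones = (1 << length) - 1
    mask = all_ones
    for k in range(8):
        if (target >> k) & 1:
            mask &= streams[k]
        else:
            mask &= ~streams[k] & all_ones
    return mask


def positions(mask: int) -> list[int]:
    """Indices of set bits, lowest first (a software stand-in for a bit-scan instruction)."""
    found = []
    while mask:
        low = mask & -mask              # isolate the lowest set bit
        found.append(low.bit_length() - 1)
        mask ^= low
    return found


text = b"<a><b/></a>"
streams = transpose_to_bit_streams(text)
lt_mask = match_byte(streams, len(text), ord("<"))
print(positions(lt_mask))
```

A real parser would perform the transposition and the bit scans with SIMD and bit-scan instructions on fixed-width registers; the sketch only mirrors the data layout.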

A performance study comparing parabix with other available C or C++ based parsers is carried out using the PAPI toolkit. Total CPU cycle counts as well as other important performance measures including level 2 cache misses and branch mispredictions are measured for various key components of parabix as well as for the other available parsers. Prospects for further performance improvements are also outlined, with a particular emphasis on leveraging the intraregister parallelism of SIMD processing to enable intrachip parallelism on multicore architectures.


Thursday, October 30 paper presentations

Session 7: Systems II

Information-Theoretic Modeling for Tracking the Health of Complex Software Systems Miao Jiang, Mohammad A. Munawar, Thomas Reidemeister and Paul A. S. Ward, University of Waterloo

Stable correlation models are effective in detecting errors in complex software systems. However, most studies assume a specific mathematical form, typically linear, for the underlying correlations. In practice, more complex non-linear relationships exist between metrics. Moreover, most inter-metric correlations form clusters rather than simple pairwise correlations. These clusters provide additional information for error detection and offer the possibility for optimization. We address these issues by adopting the Normalized Mutual Information as a similarity measure. We also employ the entropy of metrics in clusters to monitor system state. Our approach does not require learning specific correlation models, thus reducing computation overhead.

We have implemented the proposed approach and show, through experiments with a multi-tier enterprise software system, that it is effective. Our evaluation shows that (i) stable non-linear correlations exist in practice; (ii) the entropy of system metrics in clusters can efficiently detect anomalies caused by faults and provide information for diagnosis; and (iii) we can detect errors which were not captured by previous linear-correlation approaches.
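For readers unfamiliar with the similarity measure named in this abstract, the following Python sketch computes a normalized mutual information between two discretized metric series. Normalizing by sqrt(H(X) H(Y)) is one common choice and is an assumption here, since the abstract does not say which variant the authors use. The sample series are toy values chosen only to show a non-linear dependency, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(values) -> float:
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def normalized_mutual_information(xs, ys) -> float:
    """I(X;Y) / sqrt(H(X) * H(Y)); returns 0.0 if either series is constant."""
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0 or hy == 0:
        return 0.0
    hxy = entropy(list(zip(xs, ys)))    # joint entropy H(X, Y)
    return (hx + hy - hxy) / math.sqrt(hx * hy)


# Toy series: y depends on x non-linearly (y = x^2).
x = [-2, -1, 0, 1, 2, -2, -1, 0, 1, 2]
y = [v * v for v in x]
nmi = normalized_mutual_information(x, y)
print(round(nmi, 3))
```

Because the measure depends only on the joint distribution of values, it does not require fitting a specific (for example, linear) model of the relationship.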

Using Economic Models to Allocate Resources in Database Management Systems Mingyi Zhang, Patrick Martin, and Wendy Powley, Queen's University; Paul Bird, IBM Toronto Lab

Resource allocation in database management systems is a workload management process in which an autonomic DBMS makes resource allocation decisions based on properties like workload business importance. We propose the use of economic models to guide the resource allocation decisions. An economic model is described in terms of business concepts and has been successfully applied in computer system resource allocation problems. In this paper, we present an approach that uses economic models to allocate multiple resources, such as main memory buffer space and CPU shares, to workloads running concurrently on a DBMS. The economic model enables workloads to meet their service level objectives by allocating resources through partitioning the individual DBMS resources and making system-level resource allocation plans for the workloads. The resource allocation plans can be dynamically changed to respond to changes in workload performance requirements. Experiments are conducted on IBM® DB2® to verify the effectiveness of our approach.

NetPal: A Dynamic Network Administration Knowledge Base Ashley George, Adetokunbo Makanju, Evangelos Milios and Nur Zincir-Heywood, Dalhousie University; Markus Latzel and Sotirios Stergiopoulos, Palomino System Innovations Inc.

NetPal is a web-based dynamic knowledge base solution designed to assist network administrators in their troubleshooting tasks, in recalling and storing experience, and in identifying new failure cases and their symptoms. In the context of web hosting environments, NetPal exposes and summarizes network data and organizational experience for system administrators. The system design draws on a variety of domains including knowledge management, information retrieval, machine learning and network management. The system architecture, user interface design, user software testing and future directions for development are described.

Session 8: Software Engineering III

SIFT: A Scalable Iterative-Unfolding Technique for Filtering Execution Traces
Best Student Paper
A. V. Miranskyy and N. H. Madhavji, University of Western Ontario; M. S. Gittens, University of the West Indies; M. Davison, University of Western Ontario; M. Wilding, D. Godwin and C. A. Taylor, IBM Canada Ltd.

Comparing program execution traces can be useful for numerous purposes, such as software testing, system security analysis, program comprehension, software evolution and other areas of software development. Unfortunately, trace comparison techniques that operate on execution traces containing full execution details are too slow for use in large-scale production system environments. In order to speed up the comparisons, we propose a technique (called SIFT) for "filtering-out" irrelevant traces from a given set so that only the relevant few, residual, traces are then used for comparison. Our solution involves multiple levels of trace compression, each with a different degree of abstraction. These traces are compared iteratively while filtering out dissimilar traces. This paper describes the compression and comparison algorithms. Prototype results from a significant case study show that the SIFT approach is efficient and scalable for use in an industrial software development environment.

An Architecture for Providing Context in WS-BPEL Processes Allen Ajit George and Paul A.S. Ward, University of Waterloo

WS-BPEL business processes are increasingly used by organizations to automate their business activities. As the pace of change in an organization increases, these processes will be required to be more flexible; to do so they will have to account for an increasing amount of changing environment state, or context. This poses significant challenges for WS-BPEL programmers, who have to source, track and update context from multiple entities in addition to implementing and maintaining core business logic. In this paper we present a solution to this problem based on the definition and use of context variables. We describe how context variables can be constructed using the WS-BPEL language extension mechanism, and outline standards-compliant ways to represent, source, and propagate context in a web-services environment. We also present additional enhancements to WS-BPEL that will increase the utility of context variables and offer WS-BPEL programmers new ways of interacting with environment state. We have implemented a prototype realizing our approach and present a purchase-and-shipping scenario as an example of its use.

Is it a Bug or Enhancement? A Text-Based Approach to Classify Change Requests Giuliano Antoniol and Kamel Ayari, École Polytechnique de Montréal; Massimiliano Di Penta, University of Sannio; Foutse Khomh and Yann-Gaël Guéhéneuc, Université de Montréal

Bug tracking systems constitute valuable assets for managing maintenance activities. They are widely used in open source projects and also adopted in the industry. Despite their name, however, bug tracking systems collect any kind of issue project contributors and users might need to discuss: requests for enhancements, refactoring/restructuring activities, and even organizational issues. Unfortunately, most of the existing bug tracking systems do not have a specific field to explicitly classify the kind of issue reported. They do have fields to indicate issue severity that people should also use to label enhancements; however, a manual inspection revealed this is not always the case. This paper investigates whether the text of issues posted on bug tracking systems suffices for a classification of issues between corrective maintenance (bug) and other activities. In particular, the paper shows how data mining techniques, specifically classification trees, naive Bayes classifiers and logistic regression, can be used and even combined to perform issue classification. Results from empirical studies performed on issues of Mozilla, Eclipse, and JBoss indicate that both bugs and other issues can be classified with a precision between 70 and 92% and a recall above 60% and up to 98% in almost all cases.


Hands‐On: Project Zero Workshop

Workshop Chairs: Bart Stanczyk Kalvin Misquith

TABLE of CONTENTS

Agenda
Presentation Slides
Exercise 1 – Introduction to sMash’s Application Builder
Exercise 2 – Creating the form for Gas Efficiency Calculator
Exercise 3 – Creating the ZRM, Data store and Data grid
Exercise 4 – Assemble flow
Exercise 5 – Adding security to the Gas efficiency calculator application

Page 1 of 54 © Copyright IBM Corporation 1994, 2008. All rights reserved. 11/14/2008 Hands-On: Project Zero Workshop

Agenda

Time (p.m.) Activity

1:00 ‐ 1:20 Introduction

1:20 ‐ 1:45 Exercise 1 Introduction to App builder

1:45 ‐ 2:30 Exercise 2 Creating a form, DOJO

2:30 ‐ 2:45 Break

2:45 ‐ 3:15 Exercise 3 Zero Resource Model, Data store and Data grid

3:15 ‐ 3:30 Break

3:30 ‐ 4:15 Exercise 4 Assemble Flow

4:15 ‐ 4:35 Exercise 5 [if time permits] Security / Authentication

4:35 ‐ 4:45 Conclusion


Presentation Slides

Project Zero / WebSphere sMash
Hands-on workshop


Goals

• Objective: Walk you through some of the fundamental features provided by WebSphere sMash for agile and simple creation of web applications
• Part 1: Explore the web-based app builder interface and learn the base concepts and file structure of a sMash application
• Part 2: In-depth look into some of the sMash features with accompanying exercises.


What is WebSphere sMash?

• sMash is Community-Driven Commercial development of IBM’s WebSphere.
• Project Zero - development community for WebSphere sMash - www.projectzero.org
• sMash vs. Tomcat vs. WebSphere App Server vs. others
  – What you can do
    • Develop and execute entire web applications using a web based environment
    • Quickly (and easily) expose RESTful services and integration mash-ups to the web
    • Use dynamic scripting languages (PHP, Groovy) to handle events
    • Make use of dojo-aware assembly tools

What does sMash have to offer?

• Speed
  – Dynamic Scripting Languages (Groovy and PHP)
  – An integrated runtime environment (Server + development software)
  – Agile applications that perform and scale (Can be managed by IBM WebSphere Extended Deployment)
• Simplicity
  – REST services expose and leverage pre-existing content
  – Assembly style development produces fast composite applications (by assembling existing services and feeds)
• Agility
  – End to end development and run time environment (fewer development roles)
  – Component style development and delivery (reusable building blocks)
  – Integrated environment to manage agile applications (cost effective)


Benefits

• Agile, simple and fast development of RESTful applications.
• Minimum configuration required (dependency resolution)
• Quick and easy installation
• Support for popular dynamic scripting languages like PHP and Groovy.
• Allows for rapid aggregation of disparate services and feeds

Concepts

• Global context: central construct to store and retrieve environment information
  – Shared by ALL threads of an HTTP session for a particular application
• Security for web applications: examples include basic, form-based, open-id authentication.
• Response rendering: renders files to the output stream
  – View, error, JSON, XML
  – Atom and RSS support
• Configuration (zero.config): properties and behaviour of sMash
• Dependencies: external modules that an application depends on
• Event handling using implicit or explicit script handlers

Developing project zero applications

• Command Line Interface (CLI)
• WebUI
• Eclipse


Technologies

• Scripting languages
  – Groovy, PHP
• REST
• Transportation/representation of data
  – JSON, XML
• Frontend
  – DOJO (AJAX)
• Database access
  – IBM pureQuery, JDBC
  – Derby, DB2, MySQL, Oracle & others


World Wide Web Resources

• Representational State Transfer (REST)
  – An architectural style of exposing resources over the internet
• Principles:
  – Client-server requests
  – Stateless requests (applies only to request, does not necessarily mean there is no state on the middle tier)
  – All resources accessed using HTTP CRUD operations
  – Resources that are named using a URL (i.e. HTTP URL)

HTTP      CRUD
POST      Create, Update, Delete
GET       Read
PUT       Create, Overwrite/Replace
DELETE    Delete

REST & sMash

• Conventional access of resources:
  – /resources/[/[/]]
• Handling a REST request: GET /resources/people triggers onList()

  Handler (resources/people.groovy):

    def onList() {
        def data = zero.data.groovy.Manager.create('peopleDB')
        def result = data.queryArray('SELECT * FROM people')
        request.view = 'JSON'
        request.json.output = result
        render()
    }

• Other examples:
    GET /resources/people/100
    POST /resources/people
    PUT /resources/people/100
    DELETE /resources/people/100
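To make the routing convention concrete, here is a minimal JavaScript sketch of how a method and path could be mapped to a handler name. It is not sMash code: resolveHandler and the two lookup tables are hypothetical, and only the GET-on-a-collection to onList() mapping comes from the slide. The other handler names (onCreate, onRetrieve, onUpdate, onDelete) are assumptions made for illustration.

```javascript
// Hypothetical sketch, not the sMash implementation.
// Confirmed by the slide: GET /resources/people -> onList().
// Assumed for illustration: every other handler name below.
const COLLECTION_HANDLERS = { GET: "onList", POST: "onCreate" };
const MEMBER_HANDLERS = { GET: "onRetrieve", PUT: "onUpdate", DELETE: "onDelete" };

function resolveHandler(method, path) {
  // "/resources/people/100" -> ["resources", "people", "100"]
  const parts = path.split("/").filter(Boolean);
  if (parts[0] !== "resources" || parts.length < 2) {
    return null; // not a resource request
  }
  const collection = parts[1];
  const memberId = parts.length > 2 ? parts[2] : null;
  // Requests without a member ID address the collection as a whole.
  const table = memberId === null ? COLLECTION_HANDLERS : MEMBER_HANDLERS;
  if (!Object.prototype.hasOwnProperty.call(table, method)) {
    return null; // method not supported at this level
  }
  return { collection, memberId, handler: table[method] };
}

console.log(resolveHandler("GET", "/resources/people"));
console.log(resolveHandler("DELETE", "/resources/people/100"));
```

Keeping the collection and member tables separate mirrors the slide's examples, where the same verb (GET) means "list" on /resources/people and "read one" on /resources/people/100.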


Exercise 1, 2

• 25 min.
• 45 min.

Zero Resource Model (ZRM) 1

• Resource model definition -> JSON example ‘persons.json’

    {
      "fields" : {
        "first_name": {
          "type":"string",
          "max_length":40
        },
        "age":{
          "type":"integer"
        }
      }
    }

• CWPZC9212I: Created table PERSONS for Type -> persons using

    CREATE TABLE persons (
      first_name VARCHAR(40),
      age INTEGER,
      id INTEGER PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY (START WITH 100, INCREMENT BY 1) NOT NULL,
      updated TIMESTAMP NOT NULL
    )
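The relationship between the model fields and the generated columns can be illustrated with a small JavaScript sketch. This is not how sMash generates its schema; columnsFor and TYPE_MAP are hypothetical names, and the sketch only covers the two field types shown on the slide. The id and updated columns that appear in the log message are added by the framework and are deliberately left out here.

```javascript
// Illustrative only: a hypothetical mapping from model field types to SQL types.
// It covers just the "string" and "integer" types that appear on the slide.
const TYPE_MAP = {
  string: (field) => `VARCHAR(${field.max_length})`,
  integer: () => "INTEGER",
};

// Returns one column definition per declared field, in declaration order.
function columnsFor(model) {
  return Object.entries(model.fields).map(([name, field]) => {
    const toSql = TYPE_MAP[field.type];
    if (!toSql) {
      throw new Error("Unsupported field type in this sketch: " + field.type);
    }
    return `${name} ${toSql(field)}`;
  });
}

const persons = {
  fields: {
    first_name: { type: "string", max_length: 40 },
    age: { type: "integer" },
  },
};

console.log(columnsFor(persons).join(", "));
// first_name VARCHAR(40), age INTEGER
```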

Event Driven Processing

• Firing event = REST API invocation (GET, POST, etc.)
• Causes sMash to invoke the associated handler
  – Implicit registration: naming convention
  – Explicit registration: specified in configuration file
  – Dynamic registration: special event fired to resolve handlers
• Handlers can be default or user defined (Groovy, Java, PHP)


Event lifecycle example for HTTP request

[Figure: event lifecycle diagram with the stages Request Begin, Secure, GET / PUT / POST / DELETE, Log and Request End]


Exercise 3

• 30 min.

Assemble flow

• Constructing a feed-style application that processes and aggregates a set of feeds from different sources
• Naming convention .flow
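Conceptually, aggregating feeds means merging the entries of several sources into one ordered list. The JavaScript sketch below shows that idea only; it says nothing about how a sMash flow is defined or executed. The function name aggregateFeeds and the feed object shape (title, entries, updated) are assumptions made for illustration.

```javascript
// Conceptual sketch of feed aggregation; the data shape is hypothetical.
// Each feed is { title, entries: [{ title, updated }] } with ISO 8601 timestamps.
function aggregateFeeds(feeds) {
  const merged = [];
  for (const feed of feeds) {
    for (const entry of feed.entries) {
      // Remember which source each entry came from.
      merged.push({ source: feed.title, title: entry.title, updated: entry.updated });
    }
  }
  // Newest entries first.
  merged.sort((a, b) => Date.parse(b.updated) - Date.parse(a.updated));
  return merged;
}

const aggregated = aggregateFeeds([
  { title: "Feed A", entries: [{ title: "A1", updated: "2008-11-01T10:00:00Z" }] },
  {
    title: "Feed B",
    entries: [
      { title: "B1", updated: "2008-11-03T09:00:00Z" },
      { title: "B2", updated: "2008-10-20T09:00:00Z" },
    ],
  },
]);

console.log(aggregated.map((entry) => entry.source + ": " + entry.title));
```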


Exercise 4, 5

• 45 min.
• 15 min.

Installation

• Download the zero.zip file and unzip it to any directory
• JDK 5.0 (or greater) must be installed
• Add the zero directory to the user PATH environment variable
• Add the bin directory of the JDK to the user PATH environment variable
• For app builder, run the “appbuilder” script


Exercise 1 – Introduction to sMash’s Application Builder [25min]

This exercise will give you an introduction to the web‐based development environment of sMash called App builder. App builder allows us to develop complete sMash applications using only a web‐browser. It provides various development features like dependency resolution, color‐coded scripting, drag‐and‐drop functionality and application management all in one package.

In this exercise we will also start building our Gas Efficiency Calculator application. This application will calculate for us the cost of gas for a set of user defined criteria such as the number of liters pumped for a given distance travelled. The application will grow as we move through the exercises and in exercise 1 we will mainly concentrate on getting familiar with the environment and application structure of sMash.

Step 1: Starting the app builder

1. Open the Command Prompt: on Windows click Start, Run…, type cmd and press Enter.
2. Navigate to the c:\zero folder by typing cd c:\zero and pressing Enter.
3. Start the app builder by typing appbuilder start
4. Launch the app builder in the browser by typing appbuilder open

This brings up the app builder web interface and you initially see the Application view.


The application view allows us to manage all the applications within our workspace. From here we can launch, stop, edit, create, delete, or move our applications. We may also import existing applications from the Actions menu (see image above).

Step 2 – Creating a new application

1. In the My Applications view, click Create new application from the Actions menu.

2. Enter Gas Efficiency Calculator in the name field and leave the Root directory default.
3. Click the Create button.

The Gas Efficiency Calculator should now appear on the list of applications on the My Applications page.

Step 3 – Exploring the application structure

1. In the My Applications page, click on the Edit icon from the application options menu. This will take us into the application development perspective for this particular application.
2. The default tab that you will initially see highlighted is the File Editor tab. On this page you can view a list of files or create a new file by clicking the New File button (all of this is on the left-hand pane).
3. You will also initially see 3 generated files in the All Files list:

   ivy.xml – dependency management/configuration file
   php.ini – PHP configuration file in case you want to use PHP scripts in your application
   zero.config – the main Project Zero configuration file where you can define settings like security, application port, runtime settings, and others

4. Click on each file to bring up its content in the editor and briefly inspect its structure.
5. Click on the Dependencies tab (to the right of the File Editor tab) to bring up the dependency management page. Typically, you won’t need to do much configuration here because app builder will prompt you to resolve dependencies when needed. Initially, you should see the core sMash/Project Zero engine on the list of current dependencies.

Step 4 – Creating the front‐end html page

1. Click on the Explorer tab to bring up the Explorer view. The entire application directory hierarchy can be found on the left‐hand side (see image).

The app folder contains resources used by your application such as scripts, event handlers, views, model definitions and others.
The classes folder contains compiled Java classes.
The config folder contains configuration files (such as the aforementioned ivy.xml, php.ini and zero.config).
The java folder contains Java source files (e.g. MyClass.java).
The public folder contains your publicly accessible files like html pages, javascript files, css files and others.

2. Select the public directory and on the top menu bar, click on New File. A New File dialog pops up. Enter /public/index.html.

3. Select the new index.html file and click the Edit button from the toolbar menu. This will open up the visual html editor. sMash will prompt you to resolve a dependency (dojo.1.1+) . Click Add. Whenever you see similar dependency resolution prompts in the future, click Add to add them.


The visual html editor allows us to drag and drop various components from the toolbar on the right onto our work area. The components under the Create tab are various HTML and Dojo components, whereas the Data tab contains data components which can be defined but will not be visible on the page (such as a Data Store).

4. Use the filter textbox to find the Label html component: type label into the textbox. The Label component is instantly filtered out from the group. Drag and drop this label component into the top center of the page. Note: If you are prompted with a message informing you that the front-end file and the server file are out of sync, click Overwrite.
5. Right-click on the Label component on the page and select Properties from the menu. This brings up the Properties dialog.
6. Set the ID to titleLabel and change the Text to GAS EFFICIENCY CALCULATOR. Click OK.
7. Right-click on the component again and this time select Styles. This brings up the Styles dialog where you can specify styling properties for your component. We will increase the font size and change the font weight to bold.

8. Select the text button and change the font-weight property to bold and the font-size property to 14px. Click OK. Note: You do not need to save your progress because it is saved automatically each time you make a change.
9. Click on the Source tab on the bottom left-hand side of the work area to toggle the source view, which allows us to inspect the html code that was generated by the last few actions.

Step 5 – Running the application

To see our current application in action, let’s go ahead and run the application and test it in our browser.

1. Click on the Start icon on the top right-hand corner to quick-start our application. Alternatively, you can go to the My Applications view and start the application from there by clicking the Start icon in the application options for our application.
2. Open up a new browser window (or tab) and point it to http://localhost:8080/index.html

A blank page with the label GAS EFFICIENCY CALCULATOR should display.

Step 6 – Uploading files

In the next exercise we will create a form for our GAS EFFICIENCY CALCULATOR and validate it using some javascript code. In this step we will upload the javascript source file – formChecker.js – into our application. This works like a file import, but because app builder is a web‐based application we are actually uploading the file.

1. To upload a file, open up the Explorer view by clicking the Explorer tab. Click the Upload File button to bring up the File Upload dialog. Select the formChecker.js javascript file from the WorkshopResources folder on your workstation file-system and specify the application directory to be /public (you can select this from the drop-down menu). Click the Upload button.
2. Once the file is uploaded, its contents are immediately displayed in the explorer view.
3. Open index.html for edit by clicking on it from the All Files list in the File Editor view, or select it from the file hierarchy in the Explorer view and click the Edit button on the top toolbar.
4. Once the file is opened for edit in the design view, right-click on any white empty space on the page and click Properties (see image on right).
5. Click the Script button and under External Scripts: click the Plus icon to add a script to our page. An empty textbox will appear. Enter formChecker.js. Click OK.

6. Stop the application by clicking the Stop icon on the top right-hand corner (or in the My Applications view).

Exercise 2 – Creating the form for Gas Efficiency Calculator [45 minutes]

This exercise focuses on creating the form component of our Gas Efficiency Calculator application. This form will allow the end‐user to insert a new gas efficiency record to the data store which will be created in exercise 3. The form allows the user to specify:

a) The driver of the vehicle
b) The make of the vehicle
c) The type of gas pumped into the vehicle (Gasoline or Diesel)
d) The amount of gas pumped into the vehicle (Liters)
e) The total distance traveled by the vehicle (Kilometers)
f) The date at which gas was pumped into the vehicle

In exercise 4, with the help of the flow editor, we will use the data specified in points c to f to calculate the total cost of the gas pumped into the vehicle and the cost efficiency (cost per 100 km) of the vehicle.

By the end of this exercise, your Gas Efficiency Calculator Application should look like:


By completing this exercise you will:

1) Gain an understanding of how to create the front-end of a web application by using sMash's drag and drop dojo features.
2) Learn how to customize dojo widgets (properties, style and events) through sMash's application builder.

Step 1: Creating the form widget

1. Browse to sMash's file editor by clicking the File Editor tab.
2. From the Recent Files drop-down box, select the index.html (public/index.html) page you created in exercise 1.
3. Ensure that you are in the design view of the file editor.
4. Within the ‘Form’ folder of the Create tab, select the Form widget and drag it onto the canvas.
5. You can resize the form by dragging its resizing points, located at the centre and end points of its borders. By clicking on the border portion between the form's resizing points, you can reposition the form on the canvas.
6. Let’s manually set the form's position and size, to ensure that you have enough space for widgets inside and outside the form. Right-click anywhere within the form and click Styles.
7. Set the following attributes on the Position tab, leaving the others blank:
   position: absolute
   top: 300px
   left: 70px
8. Set the following attributes on the Size tab, leaving the others blank:
   width: 550px
   height: 300px
9. Let’s make the form's border more visible. Set the following attributes on the Border tab:
   border-width: 3px
   border-style: double
10. Click the 'OK' button of the styles dialog and view the form on the canvas with the newly specified style properties. The space between the form and the “GAS EFFICIENCY CALCULATOR” label is left for the data grid which will be created in exercise 3.
11. Let’s give the form a Name and ID. Right-click anywhere on the form and click Properties. Specify gasForm in the Name and ID fields and then click OK.
12. Take a look at the source code of the form by browsing to the source section of the file editor (by clicking the source tab). Note the 'dojo.require("dijit.form.Form");' statement within the script tag in the head. Within the body, you will find the dojo form widget with the specified properties and style attributes.

Step 2: Creating the validation text box widgets

The form will have two fields through which the end‐user can specify the first and last name of a vehicle's driver. Error checking will be enforced by using dojo's validation text box widget.

Part A – Creating the validation text box field for the driver’s first name

1. If you have not already done so, browse to the design section of the file editor for index.html (public/index.html).
2. Drag and drop the Label widget to the top-left portion of the form and set the following Properties:

3. Drag and drop the Validation Text Box widget to the immediate right of the label “Driver's First Name:”. If necessary, move the Validation Text Box widget so that it aligns horizontally with the label.
4. Right-click the Validation Text Box and click Properties. On the Validation Text Box tab specify the Name and ID as firstNameText.
5. Select Text for the Type attribute.
6. In order to specify a specific order of selection for when the end-user presses the tab key, set the Tab Index to 1.
7. To remove leading and trailing white spaces, ensure that the Trim check box is checked.
8. To make sure that the first name is always in proper case, ensure that the Proper Case check box is checked.
9. Click on the Validation tab of the properties dialog.
10. In order to make the first name field mandatory, ensure that the Required checkbox is checked. If the end-user does not specify any characters in this field, the widget will treat it as invalid input.
11. We would like the driver's first name to consist only of English characters from 'a' to 'z' irrespective of case. To set this condition, specify [a-zA-Z]* in the Regular Expression field.
12. We will leave the Constraints field with its default value. (The locale will be set to the validation textbox’s default locale.)
13. A ‘Prompt Message’ is the message the end-user will receive as he/she begins to type into the validation textbox. Set the Prompt Message field to Please enter the first name of the driver.
14. An Error Message is the message the end-user will receive as he/she types invalid data into the validation text box. (For example, a number does not fit our regular expression and will make the widget display the error message.) Set the Error Message field to Please enter a valid name.
15. The widget's prompt and error messages will be displayed in a tooltip. To ensure that this tooltip does not collide with any of the other widgets within the form, type top in the Tooltip Positions field. Click OK.
16. Right-click the validation textbox and click Styles. Limit the size of the textbox by clicking on the Size tab and specifying the Width as 100px. Leave the other style attributes with their default values.
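The settings chosen in Part A (Trim, Proper Case, Required and the regular expression) can be sketched in plain JavaScript as follows. This is not the Dojo implementation: trimValue, properCase and checkFirstName are hypothetical helper names, the sketch assumes the whole value must match the regular expression, and it assumes that "proper case" means capitalizing the first letter of each word.

```javascript
// Plain-JavaScript sketch of the Part A settings; helper names are hypothetical.

// Trim: remove leading and trailing white space.
function trimValue(value) {
  return value.replace(/^\s+|\s+$/g, "");
}

// Proper Case (assumed meaning): capitalize the first letter of each word.
function properCase(value) {
  return value.replace(/\S+/g, (word) => word.charAt(0).toUpperCase() + word.slice(1));
}

function checkFirstName(rawInput) {
  const value = properCase(trimValue(rawInput));
  if (value.length === 0) {
    return { value, valid: false }; // Required: empty input is invalid
  }
  // Assumption: the whole value must match [a-zA-Z]*, so the pattern is anchored.
  return { value, valid: /^[a-zA-Z]*$/.test(value) };
}

console.log(checkFirstName("  alice "));
console.log(checkFirstName("R2D2"));
console.log(checkFirstName("   "));
```

Note that [a-zA-Z]* on its own also matches the empty string, which is why the Required check is a separate step rather than part of the pattern.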

Part B – Use copy‐paste to create the validation text box field for the driver’s Last name

1. Drag and drop another label widget to the immediate right of the first name Validation Text Box and set the following properties:

2. Since the validation required for the Last Name Textbox field is similar to the First Name Textbox field, let’s re-use the First Name Validation Text Box widget. Right-click on the First Name Validation Text Box widget and click Copy.
3. Right-click anywhere on the canvas and click Paste. The mouse pointer will change to a plus sign to indicate that you need to paste the validation textbox widget at a desired position. Paste the widget by clicking on a point to the immediate right of the Last Name Label widget.
4. If necessary, re-position the widgets within the form so that they are on the same horizontal plane. If there is not enough space, you can resize the width of the form.
5. The pasted Validation Text Box widget is the same as the First Name Validation Text Box widget (it has the same properties and style but no ID), so the ID and some properties of the pasted validation textbox widget must be changed. Right-click the pasted validation textbox and click Properties. On the Validation Text Box tab specify the Name and ID as lastNameText.
6. Set the Tab Index to 2.
7. On the Validation tab, specify the prompt message as Please enter the last name of the driver. Leave the other attributes as-is. Click OK.

You can view the source code of your changes by clicking the source tab of the File Editor. Run the server (if it’s not already started) and open up a new tab in Firefox. Browse to http://localhost:8080/ to view and test your changes. On clicking a Validation Text Box widget, you should see a prompt message. You should receive no errors on entering letters but should receive an error message on entering any other character. If the text fields are left empty, you will see a warning triangle on the side.

Step 3: Creating the Number Text Box widgets


The form should allow the end‐user to specify the amount of gas pumped into the vehicle and the distance traveled. Error checking will be enforced by using dojo's Number Text Box widgets.

Part A – Creating the Labels for the Number Text Box fields

1. If you have not already done so, browse to the design section of the file editor for index.html (public/index.html).
2. Drag and drop a Label widget below the “Driver’s first name” label widget and set the following properties:

3. Drag and drop another Label widget below the gas amount label widget and set the following properties:

Part B – Creating the Number Text Box field for the Amount of Gas Filled

1. Drag and drop a Number Text Box widget to the right of the “Amount of Gas Filled:” label and set the following Number Text Box properties:

2. Click on the Validation tab of the properties dialog. In order to make this field mandatory, ensure that the Required checkbox is checked.
3. The end-user input should be constrained so that the value entered is not too high or negative. You can add constraints to the Text Box by specifying a JSON object of constraints in the Constraints field. Specify the maximum and minimum values of the Number Text Box by typing {"max":999,"min":0,"locale":""} in the Constraints field.
4. Specify the Prompt Message to be Please enter the amount of gas you filled into the vehicle (L).
5. Specify the Invalid Message to be The value entered is not valid.
6. If the end-user enters a value not within the max-min constraints, a ‘Range Message’ is displayed. Specify the Range Message to be This value is out of range.
7. To make sure the tooltip of the Number Text Box does not collide with any other widget, specify the tooltip position to be after. Click OK.
8. The width of the Number Text Box should be set to an appropriate value. Right-click on the Number Text Box and click Styles. Click on the Size tab and specify the width to be 70px.
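To see how the constraints object and the two messages fit together, here is a minimal plain-JavaScript sketch. It is not the Dojo Number Text Box logic: checkAmount is a hypothetical function, and the sketch ignores locale-specific number formats.

```javascript
// Hypothetical sketch of the range check; not the Dojo implementation.
// The constraints and message texts are the ones configured in Part B.
const gasConstraints = { max: 999, min: 0 };

function checkAmount(input, constraints) {
  const text = String(input).trim();
  const value = Number(text);
  // Number("") is 0, so empty input has to be rejected explicitly.
  if (text === "" || Number.isNaN(value)) {
    return { state: "invalid", message: "The value entered is not valid." };
  }
  if (value < constraints.min || value > constraints.max) {
    return { state: "out-of-range", message: "This value is out of range." };
  }
  return { state: "ok", message: "" };
}

console.log(checkAmount("45.5", gasConstraints).state);
console.log(checkAmount("1000", gasConstraints).state);
console.log(checkAmount("abc", gasConstraints).state);
```

Separating "not a number" from "out of range" is what lets the widget show a different message for each case.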

Part C – Creating the Number Text Box field for the Distance Traveled

1. Drag and drop another Number Text Box widget to the right of the “Distance traveled” label and set the following Number Text Box Properties:

2. Set the following Validation properties:

3. The width of the Number Text Box should be set to an appropriate value. Right-click on the Number Text Box widget and click Styles. Click on the Size tab and specify the width to be 90px.

Part D – Adding Post fix labels

Units for the ‘Amount of Gas Filled’ (L) and the ‘Distance Traveled’ (KM) can be specified by adding labels just after the corresponding number Text box.

1. Drag and drop a label to the right of the Gas Filled Number Text Box and set the following Properties to it:

2. Drag and drop a label to the right of the Distance Traveled Number Text Box and set the following Properties to it:

You can view the source code of your changes by clicking the source tab of the file editor. Run the server and open up a new tab in Firefox. Browse to http://localhost:8080/ to view and test your changes. On clicking the number textbox widgets, you should see a prompt message. You should receive no errors on entering positive numbers (below 1000 for the gas filled widget) but should receive an error message on entering negative numbers and any non‐numeric character. If the number text fields are left empty, you will see a warning triangle ( ) on the side.

Step 4: Creating the Date Text Box widgets


The form should allow the end-user to specify the date when the vehicle was filled with gas. For this purpose, the Date Text Box is ideal, as it allows the end-user to select a date from a calendar and checks that the date is allowed and is in the proper format.

1. Drag and drop a Label widget beneath the “Distance traveled” label and set the following Properties:

2. Drag and drop the Date Text Box widget ( ) to the right of the “Date of gas fill up” Label and set the following Date Text Box properties:

3. Set the following validation properties:


4. In exercise 4, we’ll be using an RSS feed which only contains dates for a certain period. Thus you will have to constrain the user to select dates within the last four months (excluding today). Since this constraint period is dynamic (i.e. 4 months from the current date), you’ll have to add a page onLoad script which is executed when the page is loaded. This script will set the max and min constraints of the Date Text Box. Right-click on a portion of the canvas not occupied by the form or any other widget and select Events.

5. On the events dialog box, make sure the Load tab is selected. Click on the add drop-down button and click script.
6. In the script text box that appears, add the following code or copy-paste it from the solution file (WorkshopResources/Exercise 2 - Step 4 - onLoad Script.txt):

    var today = new Date();
    var minDate = dojo.date.add(today, "day", -120);
    var yesterday = dojo.date.add(today, "day", -1);
    dijit.byId("dateText").constraints.min = minDate;
    dijit.byId("dateText").constraints.max = yesterday;

The above script sets the max constraint to one day before the current date and min constraint to 120 days before the current date.

7. The RSS feed used in exercise 4 specifies dates that fall on Saturday, Sunday and Monday as a period (for example, the date for Sunday, the 26th of October is specified in the RSS feed as “25-27th October”). Conforming to this format would require some javascript string manipulation which is beyond the scope of this workshop. Let’s add a script that prevents the end-user from selecting Saturday, Sunday or Monday. This script will be invoked after the user has selected a date. Right-click on the Date Text Box widget and click Events.
8. On the events dialog box, make sure that the Blur tab is selected. Click on the add drop-down button and click script.
9. In the script text box that appears, add the following code or copy it from the solution file (WorkshopResources/ Exercise2 - Step 4 - onBlur Date Script.txt):

    var date = dijit.byId("dateText").getValue();
    if (date != null) {
        var day = date.getDay();
        if (day == 0 || day == 1 || day == 6) {
            alert("Sorry, days: sat, sun and mon are not supported, please choose another day");
            dijit.byId("dateText").setDisplayedValue("");
        }
    }

10. Set the width of the Date Text Box to an appropriate value. Right-click on the Date Text Box and click Styles. Click on the Size tab and specify the width to be 90px.

You can view the source code of your changes by clicking the source tab of the file editor. Run the server and open up a new tab in Firefox. Browse to http://localhost:8080/ to view and test your changes. On clicking the Date Text Box widget, you should see a calendar. You should receive no errors on choosing a date (which falls on days Tuesday through Friday) from the calendar but should receive an error message on typing in an invalid date. If the date field is left empty, you will see a warning triangle ( ) on the side.
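The two date rules from this step (a window from 120 days ago up to yesterday, and no Saturdays, Sundays or Mondays) can be tried out without Dojo. The following is a minimal sketch in plain JavaScript; addDays, dateWindow, isSupportedDay and isSelectable are hypothetical helper names and are not part of the workshop solution files.

```javascript
// Plain-JavaScript sketch of the two date rules; helper names are hypothetical.

function addDays(date, days) {
  const copy = new Date(date.getTime());
  copy.setDate(copy.getDate() + days); // setDate rolls over month boundaries
  return copy;
}

// Allowed range: from 120 days before today up to yesterday.
function dateWindow(today) {
  return { min: addDays(today, -120), max: addDays(today, -1) };
}

// getDay(): 0 = Sunday, 1 = Monday, 6 = Saturday (all unsupported).
function isSupportedDay(date) {
  const day = date.getDay();
  return !(day === 0 || day === 1 || day === 6);
}

function isSelectable(date, today) {
  const range = dateWindow(today);
  return date >= range.min && date <= range.max && isSupportedDay(date);
}

const workshopDay = new Date(2008, 10, 14); // 14 November 2008 (months are 0-based)
console.log(isSelectable(new Date(2008, 9, 26), workshopDay)); // Sunday 26 October: false
console.log(isSelectable(new Date(2008, 9, 28), workshopDay)); // Tuesday 28 October: true
```

Passing today in as a parameter, instead of calling new Date() inside the helpers, keeps the rules easy to test against fixed dates.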

Step 5: Creating the basic form widgets

Part A – Basic Text Box Widget


The form should allow the end‐user to specify the make of the vehicle. Since no validation is required, a normal text box widget will be used. 1. Drag and drop a Label widget ( ) beneath the “Date of gas fill up” label and set the following Properties:

2. Drag and drop the Text Box widget ( ) to the right of the “Make of vehicle” label and set the following Properties:

Part B – Radio Button Widget

The form should allow the end-user to specify the type of gas, i.e. Gasoline or Diesel. Radio button widgets serve this purpose.

1. Drag and drop a Label widget ( ) to the center‐right portion of the form. Right‐click on the Label widget and set the following Properties:

2. Drag and drop another Label widget ( ) beneath the “Type of gas” label and set the following Properties:


3. Drag and drop another Label widget beneath the “Gasoline” label and set the following properties:

4. Drag and drop a Radio Button widget ( ) to the left of the “Gasoline” Label and set the following Properties:

5. Drag and drop another Radio Button widget ( ) to the left of the “Diesel” label and set the following Properties:

Step 6: Creating the form buttons


The form should have two buttons: One for resetting the form and another for sending the form data to a Data Grid which will be created in exercise 3.

Part A – Creating the reset button

1. Drag and drop a Button widget to the centre-bottom part of the form. Right-click on the button widget and select Properties.
2. In the properties dialog of the button widget, set the ID and Name attributes to resetButton.
3. To make this button automatically reset all fields in the form, change the Type attribute to reset.
4. Set the Tab Index to 10.
5. Set the Label attribute to Reset Form. Make sure the Show Label checkbox is checked. Click OK.

Part B – Creating the “Add to store” button

1. Drag and drop another Button widget to the left of the Reset button widget. Right-click on the dropped button widget and select Properties.
2. In the properties dialog of the dropped button widget, set the ID and Name attributes to submitFormButton.
3. A button of type ‘submit’ submits the form via a POST multi-part request. A button of type ‘button’ acts as a normal button where you can specify your action as an onClick event. Since this application does not require a POST multi-part request, change the Type attribute to button.
4. Set the Tab Index to 9.
5. Set the Label attribute to Add to store. Make sure the Show Label checkbox is checked. Click OK.
6. An onClick event should be attached to the button. Right-click on the button and select Events. In the Events dialog, make sure the Click tab is selected.
7. Click on the drop-down button and select Script. A script text box should be displayed.
8. In the script text box, type in the following script or copy it from the solution file (WorkshopResources/ Exercise 2 - Step 6 - Part B.txt):

    if (!checkIds()) {
        alert("There is an error in one or more of the id widgets!");
        return;
    }
    var isFirstNameValid = dijit.byId("firstNameText").isValid(false);
    var isLastNameValid = dijit.byId("lastNameText").isValid(false);
    var isGasFilledValid = dijit.byId("gasAmount").isValid(false);
    var isDateValid = dijit.byId("dateText").isValid(false);
    var isDistanceValid = dijit.byId("distanceText").isValid(false);
    var isNameValid = isFirstNameValid && isLastNameValid;
    var isFeedDataValid = isGasFilledValid && isDateValid && isDistanceValid;
    var isFormValid = isNameValid && isFeedDataValid;
    if (!isFormValid) {
        alert("One or more fields are invalid! Please fill out valid values");
        return;
    }
    alert("Congrats, All fields are valid"); // This statement will be removed in the next exercise

Let’s take a brief look at the script:

The checkIds method is defined in formChecker.js and returns true if the IDs of all the input widgets on the index.html page match those defined in this document. Its use is primarily for debugging.

The call dijit.byId('widgetId').isValid(false) returns true if the validation widget with ID widgetId is in a valid state. The validation widgets on the index.html page are in a valid state when the end-user has entered some non-whitespace input and that input satisfies the constraints defined in the widget's properties. Since we do not care whether the widget is in focus when the "Add to store" button is clicked, false is passed as the argument to the isValid method.

The isFormValid variable is true only if dijit.byId('widgetId').isValid(false) returns true for every validation widget.
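The check described above amounts to "every widget must report a valid state". The following is a minimal sketch of that pattern outside Dojo. The makeWidget stub and the widgets registry are hypothetical stand-ins for dijit.byId(...) and are not part of the workshop files; only the widget IDs are taken from the script.

```javascript
// Hypothetical stand-in for a Dojo validation widget: valid when the value
// has non-whitespace input and passes an optional constraint check.
function makeWidget(value, constraint) {
  return {
    isValid: function (isFocused) {
      // isFocused is ignored here, mirroring the isValid(false) call in the script.
      var hasInput = value.trim().length > 0;
      return hasInput && (constraint ? constraint(value) : true);
    }
  };
}

// Hypothetical registry keyed by the widget IDs used in the workshop form.
var widgets = {
  firstNameText: makeWidget("Kalvin"),
  lastNameText: makeWidget("Misquith"),
  gasAmount: makeWidget("40", function (v) { return Number(v) > 0; }),
  dateText: makeWidget("2008-10-01"),
  distanceText: makeWidget("   ") // whitespace only, so invalid
};

// Returns the IDs of all widgets that fail validation.
function invalidIds(registry, ids) {
  return ids.filter(function (id) { return !registry[id].isValid(false); });
}

var failed = invalidIds(widgets, Object.keys(widgets));
var isFormValid = failed.length === 0;
```

Collecting the failing IDs, rather than only a single boolean, would let the alert name the specific fields that need attention.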

9. Click OK.

You can view the source code of your changes by clicking the source tab of the file editor. Run the server and open a new tab in Firefox. Browse to http://localhost:8080/ to view and test your changes. Clicking the 'Reset Form' button resets all form widgets. If any of the validation widgets contains no non-whitespace character or displays a warning triangle on its side, clicking the 'Add to store' button produces the error message "One or more fields are invalid! Please fill out valid values". Otherwise, clicking the 'Add to store' button produces the success message "Congrats, All fields are valid".

Note: If you receive the error message "There is an error in one or more of the id widgets!" on clicking the 'Add to store' button, then one or more of the widgets' IDs do not match those described in the document. Please recheck all the widgets' IDs (you do not have to check the IDs of the labels).


Exercise 3 – Creating the ZRM, Data store and Data grid

This exercise focuses on creating the ZRM (Zero Resource Model), the ZRM data store and the data grid for the Gas Efficiency Application. The ZRM acts as a back-end database for the web application. The data store is a connector between the Dojo front end and the back-end database (ZRM). In our application, the data store will consist of a collection of gas efficiency records. A gas efficiency record consists of:
a) User-entered information submitted by the form which was created in Exercise 2.
b) The total cost of gas pumped into the vehicle.
c) The cost efficiency of the vehicle. Cost efficiency is defined as the cost of driving the vehicle for 1 km.

Since the calculation of total cost and cost efficiency is done in Exercise 4, they will be set to a default value of 0 for the purpose of this exercise. The data grid is a graphical representation of the data store.

After completing this exercise, your application will look like:

By completing this exercise you will:

1) Gain an understanding of how to create and instantiate a ZRM.
2) Learn how to define a ZRM data store based on a model and connect that store to a data grid.
3) Gain a perspective on the REST behavior of sMash by making REST calls to the data model.
4) Gain an understanding of how to utilize the data grid in a web application.

Step 1 : Creating the JSON for the ZRM

Part A – An introduction to the JSON for the ZRM

1. A ZRM data store is defined by a JSON file in the /app/models folder. To create this JSON file, click on the New File drop-down button, which is located on the left panel of the File Editor.
2. Select the option "Zero Resource Model in /app/models". A file create dialog box should appear.
3. In the file create dialog box, specify /app/models/gasRecords.json in the path and file name field. Click the Create button. If sMash prompts you for a dependency, click the Add button.
4. Click on the link 'gasRecords.json', which should be listed in the recent files section on the left panel of the File Editor. The design section of the File Editor page should display two sections: Fields and Filtered Collections. Filtered Collections can be used to limit the set of members of the ZRM returned via a REST call; they will not be covered in this workshop. The Fields section contains an empty table with a column of data type buttons to its left. Each row of this table represents a column in the ZRM. For the purpose of this exercise, a row must be defined for each field of a gas efficiency record, as defined at the start of this exercise. (For example, one row should be defined for the driver's first name and another for cost efficiency.)
5. Browse to the source section of the File Editor. The default template for the ZRM's JSON is an object with two fields corresponding to the two sections you saw in the design section, i.e. fields and collections (filtered collections).
6. Switch back to the design view of the File Editor. A row can be inserted into the table by clicking the desired data type in the column of data type buttons. Let's start by adding a row which represents the gas type of a vehicle (i.e. gasoline or diesel). Click on the String data type button. A row with the name 'string 1' should appear in the table.
7. Click on the modify row button, which is located in the second column from the right of the row. A dialog box for editing the field's properties should appear.
8. The Name acts as a unique identifier for a column of the data store. In the basic tab of the dialog box, specify Type_of_gas in the Name field.
9. The Label is the public identifier for a column of the data store. Specify Type of gas in the Label field.
10. To enforce that "Type of gas" is required in the data store, make sure that the Required check box is checked.
11. Set the Default value to Gasoline. Click OK. The field table should look like:

12. Browse to the source section of the file editor. Notice that a “Type_of_gas” object with the above defined properties has been added to the fields object. Navigate back to the design section of the file editor.

Part B – Complete the JSON for the data model

Complete the Fields table by inserting a row for each field of the gas efficiency record. The row for Type of gas is repeated here as an example. Alternatively, you can copy the JSON from the solution file (WorkshopResources\gasRecords.json) and paste it into the source section of the file editor (overwriting the old JSON).

Data Type   Name                   Label                    Required   Default Value
String      Type_of_gas            Type of gas              Yes        Gasoline
String      First_name             First Name               Yes
String      Last_name              Last Name                Yes
String      Vehicle_make           Vehicle Make             No
Date        Date_of_fill_Up        Date of Fill Up          Yes
Float       Distance_travelled     Distance travelled       Yes
Float       Amount_of_gas_Filled   Amount of Gas Filled     Yes
Float       Total_cost             Total Cost               No         0
Float       Cost_efficiency        Cost Efficiency ($/km)   No         0
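To make the Required and Default Value columns concrete, here is a small sketch of how such a table could be applied to an incoming record. The fieldSpecs object shape and the prepareRecord helper are assumptions made for this example; they are not the ZRM file format, and the source does not show how sMash itself enforces these settings.

```javascript
// Illustrative field table mirroring the Required and Default Value columns above.
// This object shape is an assumption made for the example, not the ZRM file format.
var fieldSpecs = {
  Type_of_gas: { required: true, defaultValue: "Gasoline" },
  First_name: { required: true },
  Last_name: { required: true },
  Vehicle_make: { required: false },
  Date_of_fill_Up: { required: true },
  Distance_travelled: { required: true },
  Amount_of_gas_Filled: { required: true },
  Total_cost: { required: false, defaultValue: 0 },
  Cost_efficiency: { required: false, defaultValue: 0 }
};

// Fills in defaults and reports required fields that are still missing.
function prepareRecord(record, specs) {
  var result = {};
  var missing = [];
  Object.keys(specs).forEach(function (name) {
    var spec = specs[name];
    if (record[name] !== undefined) {
      result[name] = record[name];
    } else if (spec.defaultValue !== undefined) {
      result[name] = spec.defaultValue;
    } else if (spec.required) {
      missing.push(name);
    }
  });
  return { record: result, missing: missing };
}

var prepared = prepareRecord(
  { First_name: "Bart", Last_name: "Stanczyk", Distance_travelled: 24 },
  fieldSpecs
);
```

In this sketch a required field with a default (Type_of_gas) never ends up in the missing list, which is one plausible reading of why the workshop gives it the default Gasoline.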

Step 2 – Instantiating the ZRM

Part A – Defining initial_data.json

1. The ZRM can be instantiated with values defined in the JSON file app/models/fixtures/initial_data.json. To create this JSON file, click on the New File drop-down button, which is on the left panel of the file editor.
2. Select the option "Other File". A file create dialog box should appear.
3. In the file create dialog box, specify /app/models/fixtures/initial_data.json in the path and file name field. Click the Create button.
4. Click the link initial_data.json, which should be listed in the Recent Files section on the left panel of the file editor.
5. Type the following JSON into the file editor, or copy-paste the contents from the solution file (WorkshopResources\initial_data.json):

    [
      {
        "type": "gasRecords",
        "fields": {
          "First_Name": "Kalvin",
          "Last_Name": "Misquith",
          "Vehicle_Make": "Datsun",
          "Type_of_gas": "Gasoline",
          "Date_of_Fill_Up": "2008-10-01",
          "Distance_travelled": 12
        }
      },
      {
        "type": "gasRecords",
        "fields": {
          "First_Name": "Bart",
          "Last_Name": "Stanczyk",
          "Vehicle_Make": "Acura TSX '09",
          "Type_of_gas": "Gasoline",
          "Date_of_Fill_Up": "2008-11-01",
          "Distance_travelled": 24
        }
      }
    ]

Let’s take a brief look at the above JSON. The JSON contains an array of two objects both of which belong to the ZRM ‘gasRecords’. Each object specifies a row of the ZRM. The above JSON specifies two rows for the ZRM: one row has a driver with first name “Kalvin” and last Name “Misquith” while the second row has a driver with first name “Bart” and last Name “Stanczyk”.
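The structure described above (an array of entries, each naming its model in "type" and carrying its row in "fields") can be checked with a few lines of script before syncing. This is only a sketch: rowsByModel is a hypothetical helper, not a sMash API, and the fixture is inlined here rather than read from the file.

```javascript
// The two fixture entries from initial_data.json, inlined for the example.
var fixture = [
  {
    type: "gasRecords",
    fields: {
      First_Name: "Kalvin", Last_Name: "Misquith", Vehicle_Make: "Datsun",
      Type_of_gas: "Gasoline", Date_of_Fill_Up: "2008-10-01", Distance_travelled: 12
    }
  },
  {
    type: "gasRecords",
    fields: {
      First_Name: "Bart", Last_Name: "Stanczyk", Vehicle_Make: "Acura TSX '09",
      Type_of_gas: "Gasoline", Date_of_Fill_Up: "2008-11-01", Distance_travelled: 24
    }
  }
];

// Groups fixture rows by model name and rejects malformed entries.
function rowsByModel(entries) {
  var grouped = {};
  entries.forEach(function (entry, index) {
    var hasFields = typeof entry.fields === "object" && entry.fields !== null;
    if (typeof entry.type !== "string" || !hasFields) {
      throw new Error("Entry " + index + " needs a 'type' string and a 'fields' object");
    }
    (grouped[entry.type] = grouped[entry.type] || []).push(entry.fields);
  });
  return grouped;
}

var grouped = rowsByModel(fixture);
var drivers = grouped.gasRecords.map(function (row) {
  return row.First_Name + " " + row.Last_Name;
});
```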

Part B – Enable REST calls to ZRM

1. In order to enable RESTful calls to the ZRM, REST handlers must be defined in the Groovy file /app/resources/gasRecords.groovy. To create this Groovy file, click on the New File drop-down button, which is on the left panel of the file editor.
2. Select the option "Resource Handler in /app/resources". A file create dialog box should appear.
3. In the file create dialog box, specify /app/resources/gasRecords.groovy in the path and file name field. Click the Create button.
4. Click the link gasRecords.groovy, which should be listed in the Recent Files section on the left panel of the file editor.
5. In the File Editor page, you may see a template of automatically added functions. These functions are the handlers for the REST API (for example, the onList function is a handler for a GET request).
6. By invoking ZRM.delegate(), the REST handling is delegated to the Zero Resource Model REST API's default REST handlers. Remove all the template handlers that sMash may have automatically added (i.e. delete all the contents of the file). Then type in one line: 'ZRM.delegate();'.

Step 3 – Create and Instantiate the ZRM through sMash’s Console.

1. Click on the Console tab located at the top of sMash's web application. On the left panel, make sure 'Command Prompt' is selected.
2. To create and instantiate the data store, type in 'zero model sync' in the command prompt. You should see the SQL used in creating and instantiating the data store. At the end of the Console display, you should see the message "Command model sync was successful".
Note: To reset the model, you can type in 'zero model reset' in the command prompt.
3. Let's test the REST API for the ZRM 'gasRecords'. Run the server if it is not already started. Open a new tab and browse to the URI http://localhost:8080/resources/gasRecords. You should receive a JSON file which contains the two gas efficiency records you specified in initial_data.json.

Step 4 – Creating the ZRM Store

1. Browse to the public/index.html page you created in Exercise 2. Switch to the design view in the file editor.
2. Click on the Data tab located on the right side panel of the file editor.
3. Click on the new drop-down button and select ZRM Store. A dialog box should appear.
4. If sMash prompts you to add a dependency, click OK.
5. A variable name is the unique identifier of the store. In the dialog box that appears, specify gasStore in the Variable Name field and gasRecords in the resource collection field ("gasRecords" is the model created in part 2).
6. Click OK. The data store icon should then appear under the Data tab.

Step 5 – Creating the data grid

1. Browse to the public/index.html page you created in Exercise 2. Switch to the design section of the file editor.
2. From the create tab of the right-side panel, drag and drop a Data Grid widget to a point above the form you created in Exercise 2.
3. Right-click the data grid and select Properties. Type in thegrid in the id and name fields.
4. The end-user should not be able to modify data once it is on the grid. Make sure that the Read only check box is checked.
5. Specify the grid properties to be gasStore.
6. Specify the data store to be gasStore. By default, all the columns of the data store will be displayed on the data grid. Since we do not have a unique key, the data grid will automatically add a new column specifying a unique ID. A "last updated" column, which specifies the time stamp at which the row was updated, is also added automatically. The 'Visible fields' property specifies which columns of the data grid should be visible. Due to time constraints, we will make the data grid display all the columns by leaving the Visible fields property blank. Click OK.
7. Manually set the position of the data grid so that it appears above the form. Right-click anywhere on the grid and select Styles.
8. In the Styles dialog box, click on the Position tab. Set the top and left properties to 80px and 5px respectively.
9. Manually set the width of the data grid to ensure that there is enough space for all of the data grid fields. In the Styles dialog box, click on the Size tab. Specify the width and height to be 700px and 190px respectively. Click OK.

Step 6 – Add functionality to submit form data to the grid

1. Right-click the "Add to store" button in the form and click Events.
2. In the Script text box of the Click section, remove the statement: alert("Congrats, All fields are valid");
3. Add the following code to the end of the JavaScript, or copy-paste it from the solution file (WorkshopResources\Exercise 3 - Step 6.txt):

    // Build a new row from the current form values
    var newRow = new Object();
    newRow.First_name = dijit.byId("firstNameText").getDisplayedValue();
    newRow.Last_name = dijit.byId("lastNameText").getDisplayedValue();
    newRow.Vehicle_make = dijit.byId("vehicleMakeTextBox").getDisplayedValue();
    newRow.Amount_of_gas_filled = dijit.byId("gasAmount").getValue();
    newRow.Distance_travelled = dijit.byId("distanceText").getValue();
    // Exactly one of the two radio buttons is expected to be checked
    var diesel = dijit.byId("diesel");
    var gasoline = dijit.byId("gasoline");
    if (gasoline.checked) {
        newRow.Type_of_gas = gasoline.getValue();
    } else {
        newRow.Type_of_gas = diesel.getValue();
    }
    // Convert the Date object into a year-month-day string
    var dateField = dijit.byId("dateText").getValue();
    newRow.Date_of_fill_up = dateField.getFullYear() + "-" + (dateField.getMonth() + 1) + "-" + dateField.getDate();
    // Add the row locally, then sync the store with the server
    gasStore.newItem(newRow);
    gasStore.save({});

Let's take a brief look at the added script. We define a new object, newRow, and set its properties to the values of the form fields. Note that the names of the properties of the newRow object are exactly the same as the ids of the columns specified in the ZRM. For example, newRow.Distance_travelled corresponds to the column in the ZRM whose id is Distance_travelled. The newRow.Date_of_fill_up property contains the date in the year-month-day format specified by the ZRM's Date type (e.g. 2008-06-30). The "gasStore.newItem(newRow);" statement adds a new row to the gasStore, while the "gasStore.save({})" statement ensures that the modified gasStore is synced with the gasStore on the server.
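The script above joins the year, month and day without zero-padding, so 1 June 2008 would be written as 2008-6-1. The source does not say whether the ZRM Date type accepts that form. If a strictly padded YYYY-MM-DD string turns out to be needed, a helper along these lines could be used instead; pad2 and toIsoDate are hypothetical names, not part of the workshop files.

```javascript
// Pads a number to two digits, e.g. 6 -> "06".
function pad2(n) {
  return (n < 10 ? "0" : "") + n;
}

// Formats a Date as YYYY-MM-DD using its local calendar fields.
function toIsoDate(date) {
  return date.getFullYear() + "-" + pad2(date.getMonth() + 1) + "-" + pad2(date.getDate());
}

var example = toIsoDate(new Date(2008, 5, 1)); // month index 5 is June
```

With this helper, the assignment in the script would become newRow.Date_of_fill_up = toIsoDate(dateField);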

4. Click OK. The application can now be tested. You should be able to add rows to the data grid by filling out information in the form you created in Exercise 2. The two fields Total Cost and Cost Efficiency will have a default value of 0. The values of these two fields will be calculated in Exercise 4.

Exercise 4 – Assemble Flow [45‐60 minutes]

In this exercise we will use the assemble flow to construct a flow module that will combine data from various sources and use that data to produce some meaningful output.

To demonstrate this, we will use it to calculate the total cost of gas and vehicle efficiency based on the gas prices coming from an RSS feed and return the result in JSON format to the calling page.

What you will do in this exercise:
• Start a new assemble flow module using the flow editor
• Add handler components for HTTP POST requests
• Integrate data coming from an RSS feed
• Apply a filter to the data from the RSS feed
• Write a JAVA back-end program to handle the data coming from HTTP POST and the RSS filter

Note: You can replace any work you have done at any time with completed exercises.

Step 1 ‐ Creating a new assemble flow (.flow) file:

1. Navigate to the explorer view, if you haven't already, by clicking the Explorer tab.
2. Create a new folder inside the public folder and name it flows. Create a new file in this folder named index.flow (note the extension; it is important because app builder uses it to associate the file with the assemble flow editor). Your directory structure should match the one pictured (/public/flows/index.flow).
3. Select the index.flow file in the explorer pane and click Edit on the explorer toolbar at the top.
4. If you don't already have it, app builder will prompt you to import the Assemble Flow dependency. Click the Add button.

The assemble flow editor:

The assemble flow editor allows us to access different services and assemble them into a single Project Zero/sMash application. We can use it to construct feed-style applications that process and aggregate feeds from various sources, or to create conversational applications that coordinate interactions with various services.

For more information see Getting started with Assemble - https://www.projectzero.org/download/doc/zero.doc.M5/zero.assemble.flow/Overview.html

There are a few categories in the right-hand drag-and-drop components toolbar. In this exercise we are primarily concerned with the following two:
Built-in Activities: generic activities for use in our flow applications, such as receiveGET, receivePOST, assign, action, replyPOST
Feed Activities: components that allow us to gather and interact with feeds. They include feed, filterFeed, sortFeed, retrieveEntry, and others

Step 2 ‐ Creating a receivePOST activity:

The receivePOST activity allows us to consume an HTTP POST request sent to the process. The output is the value of the received message.

1. Drag and drop the receivePOST component (under Built-in Activities) from the right-hand side component pane onto the work area.
2. Select the default name text (receivePOST_0), change it to recPost1 and press Enter.

Step 3 ‐ Adding a feed:

In this step we will add a feed component into our flow module. The feed component will read the RSS xml from the URL and output the results.

1. Drag and drop the feed component (under Feed Activities) from the right-hand side component pane onto the work area.
2. Edit the feed component properties by clicking the Properties icon on the selected component.
3. Enter gas_price for the name and http://www.mcteague.ca/RSS/GPT_Toronto.xml for the URL. This RSS feed contains entries of gas prices in the greater Toronto region by date.
4. Click OK.

Step 4 ‐ Testing the feed output:

To test the feed output, we can create a replyGET component so that when a browser GET request is issued, the feed output is displayed.
1. Drag and drop the replyGET component (under Feed Activities) from the right-hand side component pane onto the work area.
2. Click on the 'Properties' icon of the selected replyGET component to bring up the Properties dialog (image below).
3. Add another input parameter by clicking add more inputs below the name label. This allows us to specify our own property parameter.
4. Change the name of the new parameter from [noname] to body by selecting the [noname] text.
5. Enter ${gas_price} inside the corresponding textbox. This takes the output of the gas_price feed component and sends it as the body of the reply.
6. Set the content type to text/xml and click OK.
7. Test the output by starting the application (if it isn't already running) and pointing any browser to http://localhost:8080/flows/ . The output should be the raw RSS feed itself.
8. Add a link from the gas_price feed component to replyGET_0, as pictured in the image below.

Step 5 ‐ Adding a feed filter to our feed:

1. Drag and drop the filterFeed component (under Feed Activities) from the right-hand side component pane onto the work area.
2. Edit the filterFeed component properties by clicking the Properties icon on the selected component.
3. Enter gasDateFilter for the name.
4. Change the name of the last parameter from [noname] to feed and enter ${gas_price} inside the textbox. This tells the feed filter to get its input from the gas_price feed component defined above.
5. Check the Advanced checkbox at the top right corner of the Properties dialog. This allows us to add a condition to our filter, because we want to see the results for a desired date only.
6. Add a new condition by clicking the Add a condition link.
7. You will be presented with 3 inputs:
   a. select – specifies which field/tag inside the RSS feed XML we want to compare our conditional statement against. We will leave it as atom:title.
   b. operator – the conditional operator to use. Select Contains Ignore Case from the list.
   c. value – the value to compare against the value in atom:title. Enter September 18th, 2008.
   The pseudo-code equivalent would be something like: if atom:title containsIgnoreCase("September 18th, 2008") then print result
8. Click OK inside the Edit Condition dialog to apply the changes.
9. Click OK in the Properties dialog to apply changes and close it.
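The pseudo-code condition in step 7 can be written out as ordinary script to show what "Contains Ignore Case" does. The entries array below is invented sample data, and containsIgnoreCase and filterByTitle are hypothetical helpers; the filterFeed component's real implementation is not shown in the source.

```javascript
// Hypothetical feed entries; only the title field matters for this filter.
var entries = [
  { title: "Gas prices for September 17th, 2008" },
  { title: "Gas prices for SEPTEMBER 18TH, 2008" },
  { title: "Gas prices for September 19th, 2008" }
];

// True when needle occurs in text, comparing without regard to case.
function containsIgnoreCase(text, needle) {
  return text.toLowerCase().indexOf(needle.toLowerCase()) !== -1;
}

// Keeps only the entries whose title contains the given text.
function filterByTitle(feedEntries, needle) {
  return feedEntries.filter(function (entry) {
    return containsIgnoreCase(entry.title, needle);
  });
}

var matches = filterByTitle(entries, "September 18th, 2008");
```

Because the comparison ignores case, the upper-case title still matches, while the entries for the 17th and 19th are dropped.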


Step 6 ‐ Testing the feed filter output:

To test the feed filter output, we can modify the flow of the components in our work area so that instead of pointing the output of the raw feed data into replyGET we pass it through the filter first. The result should look similar to the image below.

1. Select the link running from the gas_price feed to replyGET_0 (it should turn orange) and click the x-shaped delete icon.
2. Select the replyGET_0 component and bring up the Properties dialog by clicking the Properties icon.
3. Change the body parameter value from ${gas_price} to ${gasDateFilter} and ensure the content type is set to text/xml. If the body parameter is not present, add one by clicking add more inputs.
4. Click OK to apply changes and close the Properties dialog.
5. Test the output by starting the application (if it isn't already running) and pointing any browser to http://localhost:8080/flows/ . The output should be the RSS feed displaying only data from September 18th, 2008.

Step 7 ‐ Using POST data for filter parameters:

In this step we will change the hard-coded date value inside the feed filter ('September 18th, 2008') to a value coming from a POST request parameter (remember recPost1, the receivePOST component we defined earlier?). To test this scenario, we will need the Firefox plug-in Poster (https://addons.mozilla.org/en-US/firefox/addon/2691).

You can verify that Poster is installed by looking for a small 'P' icon at the bottom right of your Firefox browser. Finally, because we are testing a POST request, we will need to redirect output from our filter to a replyPOST component instead of a replyGET component.

We will start by modifying the feed filter gasDateFilter to accept a POST parameter date.

1. Select the feed filter component gasDateFilter and click the Properties icon to bring up the Properties dialog.
2. Click on the Edit icon under Actions in the Condition table (see image below).
3. Change the value from September 18th, 2008 to ${recPost1.date[0]} and click OK to apply changes.
4. Click OK again to close the Properties dialog.
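The [0] index in ${recPost1.date[0]} suggests that each POST parameter name maps to a list of values, of which the first is taken. That reading is an assumption; the source does not describe how the flow engine stores parameters. Under that assumption, the lookup behaves like the following sketch, where parseFormBody is a hypothetical helper and not a sMash API.

```javascript
// Parses an application/x-www-form-urlencoded body into { name: [values...] }.
function parseFormBody(body) {
  var params = {};
  body.split("&").forEach(function (pair) {
    if (pair === "") { return; }
    var eq = pair.indexOf("=");
    var rawName = eq === -1 ? pair : pair.slice(0, eq);
    var rawValue = eq === -1 ? "" : pair.slice(eq + 1);
    // "+" stands for a space in form encoding.
    var name = decodeURIComponent(rawName.replace(/\+/g, " "));
    var value = decodeURIComponent(rawValue.replace(/\+/g, " "));
    (params[name] = params[name] || []).push(value);
  });
  return params;
}

var posted = parseFormBody("date=September%2018th%2C%202008&gasType=Gasoline");
var firstDate = posted.date[0];
```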


Before we can test, we need to create a replyPOST component and give it the feed filter output.

1. Drag and drop the replyPOST component (under Built-in Activities) from the right-hand side component pane onto the work area.
2. Modify the properties as shown in the image below. Note that the body value is the same as it is in the replyGET component.

Step 8 ‐ Testing the use of POST data for filter parameters:

1. Arrange the flow of the diagram to match the image below. We simply redirected the output from gasDateFilter to replyPOST_0 instead of replyGET_0 and pointed recPost1 to gasDateFilter since we are providing it with a POST parameter date.

2. Delete the replyGET component named replyGET_0; we don't need it anymore. Select it and click the x-shaped delete icon.
3. To test our current flow, we need to make a POST request to the actual page. The easiest way to test this is to use the Poster Firefox plug-in. Bring up the Poster plug-in by clicking the 'P' icon at the bottom right-hand corner of your browser window. The Poster plug-in is a tool for interacting with web resources by allowing us to make HTTP requests and set various options (body, content-type, etc.).
4. Once Poster is displayed, enter the address of our flow page into the URL textbox: http://localhost:8080/flows/index.flow
5. Under Actions, in the first select box, choose POST.
6. In the second select box, choose Parameters and click on the GO button to the right of that select box. This brings up the Parameters dialog.
7. Add a date parameter by entering date for name and September 18th, 2008 for value. Click Add/Change. Click Close.
8. In the same select box where you selected Parameters, choose Parameter Body and click on the same GO button. This ensures that we send the parameters in the body along with our POST request. Note that the content-type automatically changes to application/x-www-form-urlencoded and the string date=September%2018th%2C%202008 appears in the Content to Send area (see image on right).
9. Finally, to send the POST request, click on the GO button on the right side of the POST select box (the first select box under Actions).
10. This brings up the Response dialog with the response data and other information that was returned from the URL.
11. Verify that the XML returned is correct by ensuring that the tags contain the data for the date specified (look at the tag).
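The encoded string that Poster shows in step 8 is ordinary percent-encoding of the parameter value: spaces become %20 and the comma becomes %2C. The sketch below reproduces it with the standard encodeURIComponent function; toFormBody is a hypothetical helper name used only for this example.

```javascript
// Builds an application/x-www-form-urlencoded body from a plain object.
function toFormBody(params) {
  return Object.keys(params).map(function (name) {
    return encodeURIComponent(name) + "=" + encodeURIComponent(params[name]);
  }).join("&");
}

var body = toFormBody({ date: "September 18th, 2008" });
// body is "date=September%2018th%2C%202008", the same string Poster displays
```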

Step 9 ‐ Adding JAVA back‐end:

In this step we will add an Action component that will use a JAVA class to perform some operation on the output that we get from the filter and the receivePOST component. Action is a generic activity for calling a static operation inside a JAVA class.

1. Drag and drop the Action component (under Built-in Activities) onto the work area, name it handleAction and arrange the flow of the diagram.
2. Bring up the Properties dialog for the handleAction component (image below) and enter the following properties:
   a. name – should already be set to whatever you renamed your Action component to (handleAction)
   b. operation – the name of the method which we call inside the JAVA handler class (more on this below) – calculateGas
   c. target – the target JAVA class that we are using to call our operation (more on this below) – com.ibm.cas.GasParser
   The next set of properties are input arguments to the calculateGas method in GasParser.java. To add more inputs, click on the blue 'Add more inputs' link for each additional input.
   d. feedData – the data filtered by date that comes out of gasDateFilter – ${gasDateFilter}
   e. gasType – this value (and the following two) comes from the receivePOST component recPost1. These parameters will be coming from the Gas Estimator form – ${recPost1.gasType[0]}
   f. gasAmount – ${recPost1.gasAmount[0]}
   g. distance – ${recPost1.distance[0]}
3. Click OK to close the Properties dialog.
4. Make sure you modify replyPOST_0 to get its input from our new handleAction component. Select replyPOST_0 and open the Properties dialog. Change the body property to ${handleAction}. If the body property is not visible, click on Add more inputs to add it.


The Action component uses the GasParser JAVA class and looks for it in the /classes folder. For this, we need to create a GasParser.java in the /java folder.

5. Create a new JAVA file named GasParser.java in the /java folder by switching to Explorer view, selecting the java folder in the left-hand folder hierarchy and clicking the New File button on the explorer toolbar at the top.
6. Select the new file in the file list and click the Edit button on the explorer toolbar at the top. This opens a text editor where the class source will go.
7. At this point you can navigate to the WorkshopResources folder on your workstation file system and copy and paste the source code for this class (the file should be named GasParser.java). Once pasted, app builder will color-code and sometimes partially format the java code. Also, in most cases the code will be compiled and the java class automatically stored in the /classes folder. To ensure that the JAVA file has compiled successfully, go to the console view by clicking the Console button and select build log from the menu on the left. You should see log output related to compiling your JAVA class. It should be something like this:

CWPZT0600I: Command compile was successful

If you don't see any output for a JAVA compile, select command prompt from the left menu and type in zero compile.

JAVA Code explanation


The JAVA code in this step performs the back‐end parsing and calculation of the total gas price given the parameters specified.

public static String calculateGas(org.apache.xerces.dom.ElementNSImpl feedData, String gasType, String gasAmount, String distance) {

This is the method definition of calculateGas that was entered in the operation field for the Action component we defined previously. The 4 parameters here correspond to the 4 input parameters defined in the Properties dialog for the same component. Note that the feedData argument needs to be of type org.apache.xerces.dom.ElementNSImpl. This is the output from our gasDateFilter component and when handled by a JAVA class it needs to be of this specific type. This gives us the DOM element object so that we can easily parse it using the org.w3c.dom library.

feedNodes = feedData.getElementsByTagName("entry");

Here we retrieve the entry elements from the feedData coming in. With this, we can perform some String parsing to retrieve the Gasoline and Diesel prices.

total = (floatPrice*Float.parseFloat(gasAmount))/100;

result = "{total:" + total + ",efficiency:" + (total / Float.parseFloat(distance)) + "}";

return result;

Finally, we perform some calculations based on the parsed gas price data and the data from parameters passed to the method. Note that we return the string in JSON format so that it is easier for us to read/manipulate in JavaScript from the original calling function on the front‐end.
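The arithmetic in the excerpt above can be mirrored in a few lines of script. This sketch assumes the feed price is in cents per litre (suggested by the division by 100; the source does not state the units), and the function below is an illustrative rewrite, not the workshop's GasParser class. Note also that the Java excerpt builds a string with unquoted keys, which a strict JSON parser would reject; the sketch uses JSON.stringify so that the keys are quoted.

```javascript
// Mirrors the arithmetic in calculateGas. Assumes the feed price is in cents
// per litre (suggested by the division by 100); the source does not state units.
function calculateGas(pricePerLitreCents, gasAmountLitres, distanceKm) {
  var total = (pricePerLitreCents * gasAmountLitres) / 100; // total cost in dollars
  var efficiency = total / distanceKm;                      // dollars per km
  // JSON.stringify quotes the keys, so the result parses with JSON.parse.
  return JSON.stringify({ total: total, efficiency: efficiency });
}

// 50 litres at 100 cents per litre, driven over 200 km.
var result = JSON.parse(calculateGas(100, 50, 200));
```

Here result.total is 50 and result.efficiency is 0.25, i.e. 25 cents per kilometre, which matches the definition of cost efficiency given in Exercise 3.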

Step 10 – Modify the form file to call the flow:

1. Open the index.html file with the grid and form components [/public/index.html] and make sure you are in Design view.
2. Right-click on the 'Add to store' button and click on Events.
3. For the Click event, remove the old script and replace it with the contents of WorkshopResources/Exercise 4 - Step 10.txt. This is basically the same script as the previous one, but if you look at the bottom, you'll see that an XHR POST request is made to the index.flow page.
4. Click OK to close the Events dialog.
5. Test the page by opening this URL in a new browser tab/window: http://localhost:8080/index.html

Exercise 5 – Adding security to the Gas efficiency calculator application

sMash offers the following levels of security:

1. Basic authentication: Basic authentication is defined in RFC 2617 (http://tools.ietf.org/html/rfc2617).
2. Form based authentication: The full‐page form login that leverages external redirect.
3. Single sign‐on authentication: The URL based login that leverages HTTP request attributes that include support for security tokens such as LTPA.
4. OpenID authentication: OpenID consumer based authentication for third party authentication with an OpenID provider.
5. Programmatic login authentication: sMash provides an API‐based authentication model.

Adding security to your web application is as easy as adding a few lines to the configuration file (/config/zero.config) and developing the front end of the security pages (if required). This workshop will focus on Basic authentication. In short, Basic authentication allows the web browser (or other client‐side program) to provide the user name and password, base‐64 encoded, within the request. The user registry can be maintained using sMash’s file‐based user service, which is a text‐file‐based user registry. Note: Although file‐based user services are suitable as development‐time registries, LDAP or custom user services are recommended for production. For more information, read the sMash documentation on security (www.projectzero.org).
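To make the base‐64 step concrete, here is a minimal sketch of how a client could build the Authorization header value described in RFC 2617. The class and method names are hypothetical; only java.util.Base64 from the Java standard library is used.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: builds the value of the HTTP "Authorization" header
// that a client sends for Basic authentication.
class BasicAuthSketch {
    static String headerValue(String user, String password) {
        // Credentials are joined with a colon, then base-64 encoded.
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

Base‐64 is an encoding, not encryption: anyone who sees the header can decode it, which is one reason Basic authentication is normally combined with HTTPS outside of development.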

Upon completing this exercise, the gas efficiency calculator application will ask end users to authenticate themselves. When they try to access the ZRM, either through REST calls or through the front‐end data grid, the following authentication prompt is displayed:


By completing this exercise, you will:

1. Gain an understanding of how to set up file‐based user services through the CLI.
2. Become familiar with how to set up authentication by modifying the /config/zero.config file.

Step 1: Configuring the web application

1. Browse to the File editor page of sMash’s web application.

2. Type ‘zero.config’ into the search filter, which is located on the left side panel.
3. Click on the zero.config link from the list beneath the filter.
4. The configuration file should contain two statements:
   /config/http/port = 8080
      The global context for specifying the port.
   /config/runtime/mode="development"
      The global context for specifying your application’s runtime mode.
5. At the end of the configuration file, type in the following lines:
   @include "security/rule.config" {
       "authType":"Basic",
       "conditions":"/request/path =~ /resources/.+",
       "groups":["ALL_AUTHENTICATED_USERS"]
   }

Let’s take a brief look at the above statement:


The @include "security/rule.config" directive specifies that the configuration statement is a security rule.
The "authType" entry specifies that the authentication type is Basic. Other possible authentication types are “form” and “SSO” (single sign‐on authentication).
The "conditions" entry specifies the request paths on which this authentication rule is enforced. In this case, we are specifying all files within the resources folder, which includes the ZRM’s JSON.
The "groups" entry specifies the user groups that are allowed to access the secured resources. “ALL_AUTHENTICATED_USERS” is a sMash keyword that includes all registered users regardless of their group.
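To see which request paths the expression /resources/.+ covers, the following sketch evaluates it with java.util.regex. This is only an approximation: it assumes the rule applies when the whole request path matches, and sMash's own evaluation of the =~ operator may differ. The class name is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical check of the path expression used in the security rule.
class RulePathSketch {
    // The expression from the "conditions" entry of the rule.
    private static final Pattern SECURED = Pattern.compile("/resources/.+");

    // Assumption: the rule applies when the entire request path matches.
    static boolean requiresAuth(String requestPath) {
        return SECURED.matcher(requestPath).matches();
    }
}
```

Under that assumption, /resources/gasRecords is secured, while /index.html is not. The bare path /resources/ does not match either, because .+ requires at least one character after the slash.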

You have now successfully enabled authentication. However, as you have not yet created a user service, your application has no user registry against which to check credentials. As a result, until the user service is created, authentication will fail for every user name.

Step 2: Creating the file‐based user service

1. Browse to the Console of sMash’s web application. Make sure the Command Prompt view is selected in the left‐side section.
2. In the command prompt, type “zero user”. This command invokes sMash’s interactive mode for user services. You should see the following:

3. To use the default user file (/config/zero.users), press Enter at the command prompt (without specifying a location for the user file). You should see the prompt for specifying the next action:

4. Enter “create” at the command prompt to create a new user. You should then see the prompt for entering the user name:

5. Type “user” at the command prompt to specify the name of the user. You should then see the prompt for entering the user password:

6. Type “mypassword” at the command prompt to specify the user’s password. You should then see the prompt for entering the user’s group:


7. Since we do not want to specify any group, press Enter. The command prompt should then display the following success message:

8. You have successfully created a user registry record with the name “user” and password “mypassword”. Type exit in the command prompt to end the interactive mode. You should see the termination message:

Step 3: View the user service (Optional)

1. Browse to the File Editor page of sMash’s web application.

2. Type ‘zero.users’ into the search filter, which is located on the left side panel.
3. Click on the zero.users link from the list beneath the filter.
4. The file‐based user service displays content in the following format:
   (User Name):(Base‐64 encoded password):(User’s Group)
   The zero.users file should display the following content:
   user:34819d7beeabb9260a5c854bc85b3e44
   Note that the user group is missing since we did not specify one.
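The colon‐separated record format above can be read with a few lines of Java. This is a sketch for illustration only: the class name is hypothetical, the password field is treated as an opaque string, and it is not how sMash itself reads the file.

```java
// Hypothetical reader for one line of the user file shown above.
class UserRecordSketch {
    final String name;
    final String password; // stored form, treated as opaque here
    final String group;    // empty when no group was specified

    UserRecordSketch(String line) {
        // A limit of 3 keeps any further colons inside the group field.
        String[] parts = line.split(":", 3);
        name = parts[0];
        password = parts.length > 1 ? parts[1] : "";
        group = parts.length > 2 ? parts[2] : "";
    }
}
```

For the record created in Step 2, the name is "user" and the group is empty, which matches the note that no group was specified.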

You have now successfully created Basic authentication for the ZRM. We’ll test the authentication by issuing a REST call for the ZRM ‘gasRecords’. Run the server if it is not already started. Open up a new tab and browse to the URI: http://localhost:8080/resources/gasRecords. The browser should ask you to authenticate yourself before you can view the ZRM’s JSON. Browse to the URI: http://localhost:8080/. As the data grid is mapped to the data store and the data store is a connector to the backend ZRM, you will not be able to view the data grid without authentication. However, as we have not secured the form, it will be displayed.

Web 2.0 meets SOA with Apache Tuscany SCA

Luciano Resende [email protected]
Doug Tidwell [email protected]

Agenda

• Service Oriented Architecture (SOA)
• Introduction to Service Component Architecture (SCA)
• Web 2.0
• Web 2.0 and SCA
• How to get Involved

Service Oriented Architecture (SOA)

• SOA is an architectural approach that promotes
  – Reusability: Existing business functions are reusable
  – Composition: Existing business functions are composed together to form new solutions
  – Flexibility: Business functions are flexible to change

SOA Programming Model Evolution

[Figure: evolution of the SOA programming model over time, in three stages]

Home Grown, Proprietary (brittle application)
  X Business logic mixed with Service call API and protocol handling
  X Single protocol enforced
  X Cannot manage as part of the enterprise solution

Standards Based, JAX-WS
  ✓ Business logic separate from Service call API, handled by infrastructure
  X Business logic mixed with protocol handling
  X Single protocol enforced, for example HTTP
  X Single technology stack enforced for a solution
  X Cannot manage as part of the enterprise solution

Service Component Architecture (SCA) (more flexibility, greater agility, higher ROI)
  ✓ Business logic separate from Service call API, handled by infrastructure
  ✓ Business logic separate from protocol handling, handled by infrastructure
  ✓ Multiple protocols supported. Protocol is a choice
  ✓ Multiple technologies can be used in one solution
  ✓ Manage as part of the enterprise solution

The Store Scenario

• An online retail store with three main business functions
  – Product Catalog
  – Customer Shopping Cart
  – Front End web store application
• Challenges
  – Business logic mixed with protocol handling
  – Single protocol enforced

[Figure: Online Retail Store, showing Store, ShoppingCart, and Catalog]

The Store Scenario using SCA

• An online retail store with three main business functions
  – Product Catalog
  – Customer Shopping Cart
  – Front End web store application

[Figure: Store composite. Components: Store, ShoppingCart, Catalog, Currency Converter, Fruit Catalog. Other labels: Collection, Total, currencyCode=USD. Binding labels: atom, jsonrpc, http, ws]

SCA Overview

[Figure: SCA overview. Composite A contains Component A and Component B, connected by a wire. A service and a reference are promoted to the composite, and a property setting configures a component.]

Service Interface: Java, WSDL
Reference Interface: Java, WSDL
Property

Service Binding     Implementation     Reference Binding
Web Service         Java               Web Service
JMS                 BPEL               JMS
SLSB                PHP                SLSB
Rest                SCA composite      Rest
JSONRPC             spring             JSONRPC
JCA                 EJB module         JCA
…                   …                  …

SCA composition is simple and flexible

• No more plumbing API calls in my application code! With SCA, it’s a matter of declaring bindings and switching at will.
• The business logic is not bound to a pre-defined deployment topology any more.

[Figure: Store Composition. Components: Store, Catalog, Fruit Catalog, Vegetable Catalog, each holding business logic. Binding labels: RSS, WS, ATOM, JSON-RPC]

Web 2.0

• Rich Internet Applications
• Reusable components in the form of widgets that can be mashed up together to compose applications
• Efficient communication with back-end services using AJAX technologies

The Store Scenario as a Web 2.0 Application

• An online retail store with three main business requirements
  – Product Catalog
  – Customer Shopping Cart
  – Front End web store application
• Still the same challenges
  – Business logic mixed with protocol handling
  – Single protocol enforced

[Figure: Store, ShoppingCart, and Catalog]

Tuscany SCA and Web 2.0

• Web 2.0 bindings: ATOM, RSS, JSONRPC, DWR
• Tuscany Widget implementation to represent web components and provide dependency injection for JavaScript
• Other scripting implementation types

[Figure: Store composite. Components: Store, ShoppingCart, Catalog, Currency Converter, Fruit Catalog, Vegetable Catalog. Other labels: Collection, Total, currencyCode=USD. Binding labels: atom, jsonrpc, http, ws]

Tuscany ATOM Binding

• The Atom Publishing Protocol provides REST-style access to data collections.
• Services become available as data feeds through the ATOM binding
• Simplifies consuming and aggregating external feeds
• Web 2.0 applications can consume these feeds using
  – Atom JavaScript Proxy
  – Regular feed readers supporting the ATOM protocol
  – SCA Component References

Tuscany ATOM Binding

• Exposing Collections as Atom Feeds
• Using References to consume external Atom Feeds

For more information about Tuscany ATOM Binding: http://tuscany.apache.org/sca-java-bindingatom.html

Tuscany RSS Binding

• RSS provides REST-style access to data collections.
• Services become available as data feeds through the RSS binding
• Simplifies consuming and aggregating external feeds
• The following RSS versions are supported
  – RSS 0.90, RSS 0.91 Netscape, RSS 0.91 Userland, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0
• Web 2.0 applications can consume these feeds using
  – Regular feed readers supporting RSS
  – SCA Component References

Tuscany JSON-RPC Binding

• Enables remote web browser clients to easily make RPC-style calls to server-side SCA components using JSON-RPC.
• Supported Client Experience
  – JavaScript DHTML applications using the JSON-RPC JavaScript Proxy
  – Dojo

[Figure: Stock Details and Stock Quote]

For more information about JSON-RPC: http://json-rpc.org/

Tuscany JSON-RPC Binding

• Exposing an SCA POJO component using JSON-RPC

var catalogService = new dojo.rpc.JsonService("/Catalog?smd");
catalogService.get().addCallback(contentCallBack);

For more information about Tuscany JSON-RPC Binding: http://tuscany.apache.org/sca-java-bindingjsonrpc.html

Tuscany Widget

• Lets you model your Web 2.0 application as an SCA component and define SCA References that are wired to server-side services.
  – Supports references using the atom, json-rpc, and http bindings
  – Supports properties

[Figure: Web 2.0 Application with Stock Details and Stock Quote]

Tuscany Widget

• Defining your Tuscany Widget and adding references to other SCA services

For more information about Tuscany Widget: http://tuscany.apache.org/sca-java-implementationwidget.html

Tuscany Widget

• Including generated client proxy
• Defining References to SCA Services

For more information about Tuscany Widget: http://tuscany.apache.org/sca-java-implementationwidget.html

How do I get Involved?

• Take a look at the latest Apache Tuscany release
  – http://tuscany.apache.org/
• Most importantly, join the active developer and user communities
  – http://incubator.apache.org/tuscany/getting-involved.html
• You are very welcome to get involved in the project in any way you want to. Here are some examples:
  – Try out the software and give us your feedback
  – Record bugs (JIRA) for any enhancements you want or problems you find
  – Suggest new extensions
  – Provide those bits of documentation that you think are missing or can be improved
  – Write some code
  – Give a summary of how you have used Tuscany

Disclaimer

The IBM Center for Advanced Studies (CAS) regularly publishes technical documents that are aimed at facilitating an exchange of information. These reports are written by individuals or groups of authors and represent their opinions and do not necessarily reflect those of IBM. In the event of questions or concerns regarding individual reports or presentations, email addresses have been provided for most authors. Alternatively, please feel free to contact CAS at [email protected]. CAS is dedicated to providing a forum for IBM employees, research affiliates, and university students to publish results of their work, their views on technical issues, book reviews, and workshop reports.

Trademarks

• Company, product, or service names identified in the text may be trademarks or service marks of IBM or other companies. Information on the trademarks of International Business Machines Corporation in the United States, other countries, or both is located at http://www.ibm.com/legal/copytrade.shtml .

• Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

• Other company, product or service names may be trademarks or service marks of others.

Web 2.0 meets SOA with Apache Tuscany SCA

Introduction

This workshop will describe the necessary steps to create an online store using SCA and Apache Tuscany.

Setup – Installing the latest Tuscany Eclipse plugin

This section shows you how to install the latest Tuscany Eclipse plugin. The plugin gives you the ability to run an SCA composite file from the Package Explorer. You will see a "Run As Tuscany" menu item when you bring up the context menu on composite files.

Start Eclipse and go to Help -> Software Updates -> Find and Install. Select "Search for new features to install" and then click "Next".

Create a new Remote Site. On the next dialog, click on "New Remote Site..." to create a new site entry. Give it a name such as "Tuscany" and add the site URL as "http://www.apache.org/dist/tuscany/java/sca/1.3.2/tuscany-sca-1.3.2-updatesite/".

Select the "Remote Site" you just created, and click "Finish".

Select "Apache Tuscany SCA Tools" and click "Next", and then, on the next dialog, click "Finish". Accept the "Plugin License" and then click on "Install All".

When asked to restart Eclipse, click the "Yes" button.

The Store Scenario

The store scenario describes a Fruit Store business that is under expansion and wants to start offering its products online.

The online store composite application you will create is a composition of four services.

There is a Catalog service that you can ask for catalog items; depending on its currency code property configuration, it uses a Currency Converter service to provide the item prices in USD or EUR. Then there is the Shopping Cart service, to which items chosen from the catalog can be added; it is implemented as a REST service. The Catalog is bound using the JSONRPC binding, and the Shopping Cart service is bound using the ATOM binding. Finally, there is the user-facing Store service, which provides the browser-based user interface of the store. The Store service uses the Catalog and Shopping Cart services through the JSONRPC and ATOM bindings, respectively.

Create a Java project

In this step you create a Java Project in Eclipse to hold the composite service application.

Click on the New Java Project button in the toolbar to launch the project creation dialog.

Next, enter "store" as the Project name, and for Project Layout select Create separate folders for sources and class files.

Hit the Next button, and on the following page go to the Libraries tab. Use the Add Library... button on the right to add the Tuscany Library to the project.

Hit the Finish button to complete the New Java Project dialog and create the "store" Java project.

Construct Services

First you create two package folders into which later in this step you place service implementations.

Select the "store" project and click on the New Java Package button in the toolbar to launch the package creation dialog.

Next, enter "services" as the package Name, and press the Finish button to complete the dialog.

Repeat the previous step to create another package named "ufservices". The store project should now look as follows.

In the next steps, you will start creating the online store composite service application. You will use the "services" package for regular services and the "ufservices" package for user-facing services.

Catalog

In this step you create the Item class, the Catalog service interface and implementation.

Create a Java class in the "services" package named "Item" and copy-paste the following code snippet into it.

package services;

public class Item {
    private String name;
    private String price;

    public Item() {
    }

    public Item(String name, String price) {
        this.name = name;
        this.price = price;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getPrice() {
        return price;
    }

    public void setPrice(String price) {
        this.price = price;
    }
}

Select the "services" package. Next, click on the dropdown arrow next to the New Java Class button and select the New Java Interface option from the dropdown list. In the dialog enter "Catalog" as the Name of the interface and select the Finish button to complete the dialog. The Java editor will open on the newly created Java interface. Replace the content of the editor by copy-pasting the following Java interface code snippet.

package services;

import org.osoa.sca.annotations.Remotable;

@Remotable
public interface Catalog {
    Item[] get();
}

Select the "services" package again. Select the New Java Class button. In the dialog enter "CatalogImpl" as the Name of the class, add "Catalog" as the interface this class implements, and then select Finish to complete the dialog.

The Java editor will open on the newly created Java class. Replace the content of the editor by copy-pasting the following Java class code snippet.

package services;

import java.util.ArrayList;
import java.util.List;

import org.osoa.sca.annotations.Init;
import org.osoa.sca.annotations.Property;
import org.osoa.sca.annotations.Reference;

public class CatalogImpl implements Catalog {
    @Property
    public String currencyCode = "USD";

    @Reference
    public CurrencyConverter currencyConverter;

    private List<Item> catalog = new ArrayList<Item>();

    @Init
    public void init() {
        String currencySymbol = currencyConverter.getCurrencySymbol(currencyCode);
        catalog.add(new Item("Apple", currencySymbol + currencyConverter.getConversion("USD", currencyCode, 2.99)));
        catalog.add(new Item("Orange", currencySymbol + currencyConverter.getConversion("USD", currencyCode, 3.55)));
        catalog.add(new Item("Pear", currencySymbol + currencyConverter.getConversion("USD", currencyCode, 1.55)));
    }

    public Item[] get() {
        Item[] catalogArray = new Item[catalog.size()];
        catalog.toArray(catalogArray);
        return catalogArray;
    }
}

After completing these steps the content of the "store" project will look as follows.

Note: CatalogImpl is red x'ed because it makes use of the CurrencyConverter interface that we have not implemented yet.

Currency Converter

In this step you create the CurrencyConverter service interface and implementation. You follow the same steps that you learned previously to create the interface and implementation. First create a Java interface in the "services" package named "CurrencyConverter" and copy-paste the following Java interface code snippet into it.

package services;

import org.osoa.sca.annotations.Remotable;

@Remotable
public interface CurrencyConverter {
    public double getConversion(String fromCurrencyCode, String toCurrencyCode, double amount);

    public String getCurrencySymbol(String currencyCode);
}

Next create a Java class in the "services" package named "CurrencyConverterImpl" and copy-paste the following Java class code snippet into it.

package services;

public class CurrencyConverterImpl implements CurrencyConverter {
    public double getConversion(String fromCurrencyCode, String toCurrencyCode, double amount) {
        if (toCurrencyCode.equals("USD"))
            return amount;
        else if (toCurrencyCode.equals("EUR"))
            return ((double) Math.round(amount * 0.7256 * 100)) / 100;
        return 0;
    }

    public String getCurrencySymbol(String currencyCode) {
        if (currencyCode.equals("USD"))
            return "$";
        else if (currencyCode.equals("EUR"))
            return "E"; //"€";
        return "?";
    }
}

After completing these steps the content of the " store " project will look as follows.
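Before the composite is assembled, it can help to see what the runtime will do with the @Property and @Reference fields of CatalogImpl. The sketch below does the same wiring by hand in plain Java, without any SCA annotations or Tuscany classes. The names Converter, FixedRateConverter, and WiringSketch are hypothetical stand-ins, and the conversion logic is copied from the snippet above.

```java
// Plain-Java stand-ins for the tutorial's converter, without SCA annotations.
interface Converter {
    double getConversion(String from, String to, double amount);
    String getCurrencySymbol(String code);
}

class FixedRateConverter implements Converter {
    public double getConversion(String from, String to, double amount) {
        if (to.equals("USD")) return amount;
        // Fixed rate, rounded to two decimal places, as in the tutorial code.
        if (to.equals("EUR")) return ((double) Math.round(amount * 0.7256 * 100)) / 100;
        return 0;
    }

    public String getCurrencySymbol(String code) {
        if (code.equals("USD")) return "$";
        if (code.equals("EUR")) return "E";
        return "?";
    }
}

class WiringSketch {
    // The converter plays the role of the injected reference and
    // currencyCode the role of the configured property.
    static String priceLabel(Converter converter, String currencyCode, double usdPrice) {
        return converter.getCurrencySymbol(currencyCode)
                + converter.getConversion("USD", currencyCode, usdPrice);
    }
}
```

With currencyCode set to "EUR", a 2.99 USD apple is labelled E2.17; with "USD" it stays $2.99. Changing the property in the composite therefore changes every catalog price without touching the Java code.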

Shopping Cart

In this step you create the Item model object, the Cart and Total service interfaces and the ShoppingCart service implementation.

You follow the same steps that you learned previously to create the interface and implementation.

Create a Java interface in the "services" package named "Cart" and copy-paste the following code snippet into it.

package services;

import org.apache.tuscany.sca.data.collection.Collection;
import org.osoa.sca.annotations.Remotable;

@Remotable
public interface Cart extends Collection<String, Item> {

}

Create a Java interface in the "services" package named "Total" and copy-paste the following code snippet into it.

package services;

import org.osoa.sca.annotations.Remotable;

@Remotable
public interface Total {
    String getTotal();
}

Create a Java class in the "services" package named "ShoppingCartImpl" and copy-paste the following Java class code snippet into it.

package services;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

import org.apache.tuscany.sca.data.collection.Entry;
import org.apache.tuscany.sca.data.collection.NotFoundException;
import org.osoa.sca.annotations.Init;
import org.osoa.sca.annotations.Scope;

@Scope("COMPOSITE")
public class ShoppingCartImpl implements Cart, Total {

    private Map<String, Item> cart;

    @Init
    public void init() {
        cart = new HashMap<String, Item>();
    }

    public Entry<String, Item>[] getAll() {
        Entry<String, Item>[] entries = new Entry[cart.size()];
        int i = 0;
        for (Map.Entry<String, Item> e : cart.entrySet()) {
            entries[i++] = new Entry<String, Item>(e.getKey(), e.getValue());
        }
        return entries;
    }

    public Item get(String key) throws NotFoundException {
        Item item = cart.get(key);
        if (item == null) {
            throw new NotFoundException(key);
        } else {
            return item;
        }
    }

    public String post(String key, Item item) {
        if (key == null) {
            key = "cart-" + UUID.randomUUID().toString();
        }
        cart.put(key, item);
        return key;
    }

    public void put(String key, Item item) throws NotFoundException {
        if (!cart.containsKey(key)) {
            throw new NotFoundException(key);
        }
        cart.put(key, item);
    }

    public void delete(String key) throws NotFoundException {
        if (key == null || key.equals("")) {
            cart.clear();
        } else {
            Item item = cart.remove(key);
            if (item == null)
                throw new NotFoundException(key);
        }
    }

    public Entry<String, Item>[] query(String queryString) {
        List<Entry<String, Item>> entries = new ArrayList<Entry<String, Item>>();
        if (queryString.startsWith("name=")) {
            String name = queryString.substring(5);
            for (Map.Entry<String, Item> e : cart.entrySet()) {
                Item item = e.getValue();
                if (item.getName().equals(name)) {
                    entries.add(new Entry<String, Item>(e.getKey(), e.getValue()));
                }
            }
        }
        return entries.toArray(new Entry[entries.size()]);
    }

    public String getTotal() {
        double total = 0;
        String currencySymbol = "";
        if (!cart.isEmpty()) {
            Item item = cart.values().iterator().next();
            currencySymbol = item.getPrice().substring(0, 1);
        }
        for (Item item : cart.values()) {
            total += Double.valueOf(item.getPrice().substring(1));
        }
        return currencySymbol + String.valueOf(total);
    }
}

Note : Since the Tuscany conversational support is not ready yet the cart is realized through a hack. The cart field is defined as static.
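The getTotal logic above depends on every price being a string with a one-character currency symbol followed by the amount. A minimal sketch of the same idea over a plain list of price strings, with the hypothetical class name CartTotalSketch and no Tuscany types, looks like this:

```java
import java.util.List;

// Hypothetical stand-alone version of the total calculation.
class CartTotalSketch {
    // Prices are strings such as "$2.99": one symbol character, then the amount.
    static String total(List<String> prices) {
        double total = 0;
        String symbol = "";
        if (!prices.isEmpty()) {
            // The symbol is taken from the first price, as in the cart code.
            symbol = prices.get(0).substring(0, 1);
        }
        for (String price : prices) {
            total += Double.parseDouble(price.substring(1));
        }
        return symbol + total;
    }
}
```

Two items priced "$1.5" and "$2.25" give "$3.75", and an empty list gives "0.0" with no symbol. A currency symbol longer than one character would break the substring calls, which is worth keeping in mind if you extend the converter.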

After completing these steps the content of the " store " project will look as follows.

Store

In this step you create the user facing Store service that will run in a Web browser and provide the user interface to the other services you created.

Select the "ufservices" package. Right click to get the context menu, select New, and then File. In the New File dialog enter "store.html" for the File name, and then select Finish to complete the dialog.

The Text editor will open on the newly created html file. Replace the content of the editor by copy-pasting the following html snippet.

<title>Store</title>

<script type="text/javascript" src="store.js"></script>

<script language="JavaScript">

    //@Reference
    var catalog = new Reference("catalog");

    //@Reference
    var shoppingCart = new Reference("shoppingCart");

    //@Reference
    var shoppingTotal = new Reference("shoppingTotal");

    var catalogItems;

    function catalog_getResponse(items) {
        var catalog = "";
        for (var i=0; i<items.length; i++) {
            var item = items[i].name + ' - ' + items[i].price;
            catalog += '<input name="items" type="checkbox" value="' +
                       item + '">' + item + ' <br>';
        }
        document.getElementById('catalog').innerHTML = catalog;
        catalogItems = items;
    }

    function shoppingCart_getResponse(feed) {
        if (feed != null) {
            var entries = feed.getElementsByTagName("entry");
            var list = "";
            for (var i=0; i<entries.length; i++) {
                var content = entries[i].getElementsByTagName("content")[0];
                var name = content.getElementsByTagName("name")[0].firstChild.nodeValue;
                var price = content.getElementsByTagName("price")[0].firstChild.nodeValue;
                list += name + ' - ' + price + ' <br>';
            }
            document.getElementById("shoppingCart").innerHTML = list;

            if (entries.length != 0) {
                shoppingTotal.getTotal(shoppingTotal_getTotalResponse);
            }
        }
    }

    function shoppingTotal_getTotalResponse(total) {
        document.getElementById('total').innerHTML = total;
    }

    function shoppingCart_postResponse(entry) {
        shoppingCart.get("", shoppingCart_getResponse);
    }

    function addToCart() {
        var items = document.catalogForm.items;
        var j = 0;
        for (var i=0; i<items.length; i++)
            if (items[i].checked) {
                var entry = '<entry xmlns="http://www.w3.org/2005/Atom"><title>item' +
                    '' +
                    '' + catalogItems[i].name + '' + '' + catalogItems[i].price + '' +
                    '' + '';
                shoppingCart.post(entry, shoppingCart_postResponse);
                items[i].checked = false;
            }
    }

    function checkoutCart() {
        document.getElementById('store').innerHTML = '' +
            'Thanks for Shopping With Us!' +
            'Your Order' +
            '' +
            document.getElementById('shoppingCart').innerHTML +
            '' +
            document.getElementById('total').innerHTML +
            '' +
            '' +
            '' +
            '';
        shoppingCart.del("", null);
    }

    function deleteCart() {
        shoppingCart.del("", null);
        document.getElementById('shoppingCart').innerHTML = "";
        document.getElementById('total').innerHTML = "";
    }

    function init() {
        catalog.get(catalog_getResponse);
        shoppingCart.get("", shoppingCart_getResponse);
    }

</script>

Store

Catalog


Your Shopping Cart


(feed)

After completing these steps the content of the "store" project will look as follows.

The Store Composite - Composing Services

Now that you have all the required service implementations you compose them together to provide the store composite service. The composition is stored in a .composite file.

Select the "src" folder of the "store" project. Right click to get the context menu, select New, and then File. In the New File dialog enter "store.composite" for the File name, and then select Finish to complete the dialog.

The Text editor will open on the newly created composite file. Replace the content of the editor by copy-pasting the following composite snippet.

USD

After completing these steps the content of the "store" project will look as follows.

Congratulations, you have completed your first composite service application. Now it is time to put it into action.

Using Services – Start your Store Application

In this step you launch and use the store composite service application you created.

First select the "store.composite" file. Right click to get the context menu, select Run As, and then Tuscany. The Tuscany runtime will start up, adding the store composition to its domain.

The Eclipse console will show the following messages.

Next Launch your Web browser and enter the following address: http://localhost:8080/store/store.html

You get to the Store user facing service of the composite service application.

You can select items from the Catalog and add them to your Shopping Cart.

Since the ShoppingCart service is bound using the ATOM binding, you can also look at the shopping cart content in ATOM feed form by clicking on the feed icon. You get the browser's default rendering for ATOM feeds.

Use the browser back button to get back to the Store page.

And then you can Checkout to complete your order.


Data Stream Management

Patrick Martin, School of Computing, Queen’s University
Calisto Zuzarte, IBM Toronto Lab

Goals of the Workshop

• Highlight application areas for data stream management systems (DSMSs)
• Examine the key issues that differentiate DSMSs and data stream mining from their standard counterparts
• Propose interesting questions and topics for research in the area of DSMSs

Workshop Agenda

1:00 – 1:15 Welcome (Pat Martin)
1:15 – 2:45 Session 1
  DSMSs – Overview and Issues, Pat Martin (Queen’s U)
  Aggregating Social Media, Nick Koudas (U of Toronto)
2:45 – 3:15 Break
3:15 – 4:45 Session 2
  Mining Dynamic Data Streams, Yingying Tao (U of Waterloo)
  Window Query Approximation for Joining Data Streams with Relations Based on Importance Semantics, Qiang Zhu (U of Michigan)

Data Stream Management Systems – Overview and Issues

Patrick Martin
School of Computing
Queen’s University

Outline of Talk

• Where is the need for DSMSs?
  – New class of application with continuous streams of data
• What is a DSMS?
• What are the issues?

CASCON 2008 DSMS Workshop

Applications – Financial Analysis

• Electronic trading is now commonplace
• Trading volume continues to increase rapidly
• Algorithmic trading: detect advantageous market conditions, automatically execute trades
  – E.g., compute a 5-minute rolling average or volume-weighted average price (VWAP)
• Latency is key
• Real-time visualization
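As an illustration of the rolling VWAP computation mentioned above, here is a minimal Python sketch. The class name `RollingVWAP`, the trade tuple layout and the eviction rule (a trade leaves the window once it is at least `window_seconds` old) are assumptions made for this example, not something specified in the talk.

```python
from collections import deque


class RollingVWAP:
    """Volume-weighted average price over a time-based sliding window."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds
        self.trades = deque()          # (timestamp, price, volume), oldest first
        self.price_volume_sum = 0.0    # running sum of price * volume in the window
        self.volume_sum = 0.0          # running sum of volume in the window

    def add(self, timestamp, price, volume):
        """Ingest one trade and return the VWAP of the current window."""
        self.trades.append((timestamp, price, volume))
        self.price_volume_sum += price * volume
        self.volume_sum += volume
        # Evict trades that have fallen out of the window (assumed rule).
        cutoff = timestamp - self.window_seconds
        while self.trades and self.trades[0][0] <= cutoff:
            _, old_price, old_volume = self.trades.popleft()
            self.price_volume_sum -= old_price * old_volume
            self.volume_sum -= old_volume
        return self.price_volume_sum / self.volume_sum if self.volume_sum else None


vwap = RollingVWAP(window_seconds=300.0)
first = vwap.add(0.0, 10.0, 100.0)     # only one trade in the window
second = vwap.add(60.0, 12.0, 300.0)   # two trades in the window
third = vwap.add(400.0, 20.0, 50.0)    # both earlier trades have expired
```

Each trade is touched once on arrival and once on eviction, so the state is updated incrementally rather than recomputed from stored history, which is the one-pass style the talk asks of stream operators.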

Applications – Sensor Networks

• Filter, aggregate and join streams from multiple sensors
• Military command and control, healthcare, manufacturing, climate analysis
• E.g., join streams of temperature readings from weather stations with static tables of geographic data to produce temperature contours on a weather map

Applications – System Monitoring

• Large volumes of data produced in real time
• Network traffic analysis
  – E.g., determine bandwidth used for each source-destination pair grouped by protocol type
• Problem determination
  – E.g., determine top-k tables used by workloads over a span of several hours

Why not a DBMS?

            Data Streams                               DBMS
Data        Continuous, transient data                 Static, persistent data
Queries     Continuous, result incrementally updated   One-time execution on snapshot of data
Execution   Approximation & adaptability               Precise results, stable query plans

Data Stream Model

• Data stream is a continuous ordered sequence of data items
• Data elements arrive in real time
• System has no control over the order in which data elements arrive
• Data streams are potentially unbounded
• Once a data element is processed it is discarded or archived => processed in 1 pass!
• Can be modeled as virtual relations or data objects

DSMS Reference Architecture [1]

[Figure: DSMS reference architecture. Components shown: streaming inputs, input buffer, working storage, summary storage, static storage, query repository, query processor, output buffer, streaming outputs, user queries.]

DSMS Issues – Query Languages

• Different types proposed
  – Relation-based – STREAM, TelegraphCQ
  – Object-based – COUGAR, Tribeca
  – Procedural – Aurora
• Support for
  – Streams and static sources (relations)
  – Continuous queries
  – Windows

Example:

  SELECT RSTREAM(item_id, bid_price)
  FROM bid [ RANGE ’10 Minutes’ SLIDE ’90 Seconds’]
  WHERE bid_price =
    (SELECT MAX(bid_price)
     FROM bid [ RANGE ’10 Minutes’ SLIDE ’90 Seconds’])

DSMS Issues – Non-Blocking Operators

• Streams are infinite so can’t have blocking operators in a query plan
• 3 approaches to unblocking
  – Windowing
  – Incremental evaluation
  – Exploiting stream constraints (e.g., use of punctuations)

DSMS Issues – Approximate Algorithms

• Compact stream summaries may be stored and approximate queries posed over summaries
• Methods of generating summaries
  – Counting methods
  – Hashing methods
  – Sampling methods
  – Sketches
  – Wavelet transforms
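To make the "sampling methods" entry concrete, the sketch below shows reservoir sampling, one well-known way to keep a fixed-size uniform sample of an unbounded stream in a single pass. The talk does not name a specific sampling algorithm, so this choice and the function name `reservoir_sample` are assumptions for illustration.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    sample = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Item n replaces a reservoir slot with probability k / n.
            j = rng.randrange(n)       # uniform integer in [0, n)
            if j < k:
                sample[j] = item
    return sample


sample = reservoir_sample(range(10_000), k=5, rng=random.Random(42))
short = reservoir_sample(range(3), k=5)   # stream shorter than the reservoir
```

The memory used is bounded by `k` regardless of stream length, which is the property that makes such a summary usable for approximate query answering.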

DSMS Issues – Sliding Windows

Produce an approximate answer to a query by evaluating it over a sliding window of recent data from the stream.

Window operator periodically produces visible sets of tuples. Sets are defined by range (width in time or tuples), slide (how often to emit a set) and start (when to start emitting sets).
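The window operator described above can be sketched as a small Python generator. This is only an illustration under stated assumptions: the window is time-based, tuples arrive in timestamp order, a set emitted at time `e` contains tuples with `e - range_ < timestamp <= e`, and the final flush exists only because the demo stream is finite. The name `sliding_windows` is hypothetical.

```python
def sliding_windows(tuples, range_, slide, start):
    """Yield (emit_time, visible_tuples) for a time-based sliding window.

    tuples: iterable of (timestamp, value) in non-decreasing timestamp order.
    Sets are emitted at start, start + slide, start + 2 * slide, ...
    """
    buffer = []
    emit_time = start
    for ts, value in tuples:
        # Emit every set whose emit time has passed before this tuple arrived.
        while ts > emit_time:
            buffer = [(t, v) for t, v in buffer if t > emit_time - range_]
            yield emit_time, list(buffer)
            emit_time += slide
        buffer.append((ts, value))
    # Flush the last set; a real stream never ends, so this is demo-only.
    buffer = [(t, v) for t, v in buffer if t > emit_time - range_]
    yield emit_time, list(buffer)


readings = [(1, "a"), (2, "b"), (4, "c"), (7, "d")]
windows = list(sliding_windows(readings, range_=3, slide=2, start=2))
```

With `range_=3` and `slide=2` consecutive sets overlap; choosing `slide` equal to `range_` would give non-overlapping (tumbling) sets instead.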

DSMS Issues – Adaptability

• Stream operators need to be push-based
• Cost of query plan may change => need a flexible query plan!
• Tuple routing: push tuples one at a time through the operator graph; choose order of operators at runtime

DSMS Issues – Data Stream Mining

• Process of applying well-known static mining techniques on data streams
• Poses new challenges:
  – Algorithms to mine the data with only one pass
  – Understanding relationship between accuracy and amount of data seen
  – Concept drift and evolving models

Summary

• DSMSs required to support new class of data-intensive applications
  – Continuous, unbounded data streams
• Key differentiating features of DSMS:
  – One pass of the data
  – Approximation
  – Continuous queries
  – Adaptability

References

1. Golab, L., and Ozsu, M. T., “Issues in data stream management”, ACM SIGMOD Record, Vol. 32, No. 2, pp. 5-14, June 2003.
2. Babcock, B., Babu, S., Datar, M., Motwani, R., Widom, J., “Models and Issues in Data Stream Systems”, in Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), Madison, Wisconsin, June 2002, pp. 1-16.
3. Abadi, D., Carney, D., Cetintemel, U., Cherniack, M., Convey, C., Lee, S., Stonebraker, M., Tatbul, N., and Zdonik, S., “Aurora: A New Model and Architecture for Data Stream Management”, Journal of Very Large Data Bases (VLDB), Vol. 12, No. 2, pp. 120-139, August 2003.
4. Arasu, A., Babcock, B., Babu, S., Datar, M., Ito, K., Nishizawa, I., Rosenstein, J., and Widom, J., “STREAM: The Stanford Data Stream Management System”, IEEE Data Engineering Bulletin, Vol. 26, No. 1, 2003.
5. Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S., Madden, S., Raman, V., Reiss, F., and Shah, M., “TelegraphCQ: Continuous Dataflow Processing for an Uncertain World”, Proc. Conf. on Innovative Data Syst. Res., 2003, pp. 269-280.

Thank you!

AGGREGATING SOCIAL MEDIA

Nick Koudas
University of Toronto

BLOGOSPHERE, SOCIAL NETWORKS, MICRO-BLOGGING, WIKIs, MESSAGE BOARDS, other RSS FEEDS

100M+ KNOWN RSS FEEDS

100K+ NEW EVERY DAY

DOUBLING EVERY 200 DAYS

200M+ ACTIVE USERS IN FACEBOOK, MYSPACE

2.5M+ TWITTING DAILY

WHAT ARE THEY WRITING ABOUT?

PERSONAL LIFE, PRODUCT REVIEWS, POLITICS, TECHNOLOGY, TOURISM, SPORTS, ENTERTAINMENT AND… THROW DONKEYS, CATS, DOGS, ETC., AT EACH OTHER ON FACEBOOK…

WHY SHOULD WE CARE?

HUGE DATA REPOSITORY

WILL CONTINUE TO GROW

EXTRACT PUBLIC OPINION

VALUABLE INSIGHTS

SEPARATE VALUABLE INFORMATION

HIGHLY NOISY AND HETEROGENEOUS

DISCLAIMER

• PERSONAL VIEW
• HIGH LEVEL CHALLENGES AND OPPORTUNITIES

HUGE AMOUNTS OF UNSTRUCTURED TEXT

LARGE COLLECTIONS OF VIDEO, IMAGES, AUDIO (PODCASTS)

MACHINE CREATED WEBLOGS

MORE THAN HALF OF BLOGSPOT IS SPAM

33% OF WEBSPAM HOSTED AT BLOGSPOT

LOOKS LIKE GENERAL WEB CONTENT…

BUT…

TEMPORAL DIMENSION

GEOGRAPHICAL ASSOCIATION

CONVERSATION

LINKS, COMMENTS, TRACKBACKS, ETC.

BLOGSCOPE

CRAWLER RUNNING 24x7

TRACKING 30M+ BLOGS (and growing fast; text, audio, video included)

INDEXING 500M+ ARTICLES

AGGREGATION AND PREPROCESSING

INTERACTIVE SEARCH AND ANALYSIS

ANY STREAMING TEXT SOURCE

NEWS

MAILING LISTS

FORUMS

SOCIAL MEDIA

www.blogscope.net

[Screenshot: BlogScope interface with panels labelled Hot Keywords, Geo Search, Related Terms, Search Results and Popularity Curve.]

[Figure: popularity curves labelled Taiwan Undersea Earthquake, Sumatra Earthquake and Hawaii Earthquake; dates shown: December 15 2006, March 06 2007.]

IPHONE ON JAN 09 2007

Curves are usually correlated, except at one point

War in Somalia: the battle by Islamist militia against the Somali forces and Ethiopian troops. On Jan 9, Abdullahi Yusuf arrives in Mogadishu, and US gunships attack Al-Qaeda targets.

Seeking Stable Clusters in the Blogosphere, VLDB 2007

www.blogscope.net WebDB 2007

Spatial Bursts

March 2008

Computational History

• Given this wealth of information can we automatically identify, understand and explain events?
• Observation
  – Events (natural, political, etc.) result in sudden, simultaneous online activity by the set of people who care about them or are affected by them.

One more example

Computational History

Extensions

• BlogScope warehouses
  – User-generated content, news articles, podcasts
• Reasoning about
  – Influence, authority, trust
• Text summarization
  – Single article
  – Collections of articles (text results)
  – Demographics
• Sentiment
• Link and annotate automatically
  – Find blogger reactions to a news article and vice versa

The Team

• Albert Angel

• Nilesh Bansal

• Michael Mathioudakis

• Nikos Sarkas

To Conclude..

• Aggregating social media
• A lot of data, challenging questions
• A good venue to utilize statistical and scientific data management techniques
• Visit us:
  – www.blogscope.net

THANK YOU. QUESTIONS?

Source: xkcd.com

TECHNIQUES

CRAWLS RSS FEEDS

2M NEW POSTS DAILY

Handling Streaming RSS

• Past work on XML and streaming XML query processing
  – Indexing
  – Matching
  – Content-based routing
  – [Chen07,06][Cho06,05][Guha06a][Hristidis06][Yahia06]

KEYWORD-GENERATED TIME SERIES

• Query processing
  – Mining
  – Similarity matching and indexing
  – [Das07,06][Arai07][Koudas06][Tung06][Yu06]
• Streaming and Sketching
  – Computation in small space
  – Sketching and approximate computation
  – [Guha06b,Guha06c]

DATA EXTRACTION & PROFILING

• Data Extraction/Approximate String Matching
  – www.cs.toronto.edu/~koudas/projects/spider/
  – [Hadj08][Chandel07a][Chandel07b][On07][Koudas06a][Koudas06b][Dai06]
• String query processing

INTERACTIVE APPLICATION

SUB SECOND RESPONSE TIME

HUGE AMOUNTS OF DATA (currently 3.5TB and growing)

FIFTEEN THOUSAND UNIQUE IP ADDRESSES DAILY

INTERACTIVE ‘LONG ENGAGEMENT’

SCALABILITY

Some Applications..

• Identifying persistent chatter
• Computational History

Persistent Chatter

Apple iPhone – January 2007

– Jan first week: Anticipation of iPhone release
– Jan 9th: iPhone release at Macworld
– Jan 10th: Lawsuit by Cisco
– Jan third week: Decrease in chatter about iPhone

Keyword Clusters

• When there is a lot of discussion on an event, a set of keywords will become correlated
  – Elements in this keyword set will frequently appear together
  – These keywords form a cluster
• Keyword clusters are transient
  – Associated with a time interval
  – As topics recede, these clusters will dissolve

Clusters - Apple iPhone

• Persistent for 4 days
• Chatter drifts
  – Starts with discussion about Apple in general
  – Moves towards the Cisco lawsuit

Note: All keywords are stemmed

Gap in Clusters

• Three clusters are shown for Jan 6, 9 and 10, 2007; no clusters were discovered for Jan 7 and 8 (related to this topic)
• English FA Cup soccer game between Liverpool and Arsenal with double goal by Rosicky at Anfield on Jan 6. The same two teams played again on Jan 9, with goals by Baptista and Fowler

Note: keywords are stemmed

Why Stable Clusters

• Information Discovery
  – Monitor the buzz in the Blogosphere
  – “What were bloggers talking about in April last year?”
• Query refinement and expansion
  – If the query keyword belongs to one of the clusters
• Visualization?
  – Show keyword clusters directly to the user
  – Or show matching blogs

Overview of Approach

Efficient algorithm to identify keyword clusters

BlogScope data contains over 14M unique keywords

Applicable to other streaming text sources

Flickr tags, News articles

Formalize the notion of stable clusters

Efficient algorithms to identify stable clusters

BFS, DFS and TA

Amenable to online computation over streaming data

Experimental evaluation

Pipeline

[Figure: pipeline. Documents for day 1, day 2 and day 3 → keyword graph → keyword clusters → cluster graph → stable clusters.]

Example Output

• We present results from blog postings in the week of Jan 6th
• Around 1100-1500 clusters were produced for each day

Jan 6th: Momofuku Ando, the founder-chairman of Nissin Food Products Co, who was widely known as the inventor of instant noodles, died of heart failure.

Information Bursts

• When an event of interest to a fraction of individuals takes place, there is a surge in online activity related to the event
  – Information burst!
  – With temporal or spatial scope

Information Burst

• Burst Identification: where/when something is interesting, focusing on spatial bursts or temporal bursts
• Burst Description: using sets of keywords, identify what the burst is about
• Burst Attribution: attribute the burst to the specific sets of bloggers whom it interests the most

Burst Identification

• Temporal bursts:
  – Points in time at which the volume of results relevant to a query demonstrates unusual activity
• Spatial bursts:
  – Number of posts, documents, etc. from an area is unusually large compared to the volume in surrounding areas, given past activity
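The slides do not say how "unusual activity" is measured. As one possible reading of temporal burst identification, the sketch below flags a day when its result count lies far above the mean of a trailing history window. The z-score rule, the parameter names and the floor of 1.0 on the standard deviation are all assumptions for illustration, not BlogScope's actual method.

```python
from statistics import mean, pstdev


def temporal_bursts(daily_counts, history=7, z_threshold=3.0):
    """Return indices of days whose count is unusually high versus recent days."""
    bursts = []
    for day in range(history, len(daily_counts)):
        past = daily_counts[day - history:day]
        mu = mean(past)
        # Floor sigma so that a flat history does not cause division by zero.
        sigma = max(pstdev(past), 1.0)
        z = (daily_counts[day] - mu) / sigma
        if z > z_threshold:
            bursts.append(day)
    return bursts


# Hypothetical daily result counts for one query.
counts = [10, 12, 11, 9, 10, 11, 10, 12, 95, 40, 11]
bursts = temporal_bursts(counts)
```

In this toy series only the spike on day 8 is flagged; the following day is not, because the spike itself inflates the trailing standard deviation.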

Burst Attribution

• Describe the set of bloggers responsible for the burst
  – E.g., the set of individuals responsible for a burst on the event ‘euro final’ in Toronto are mainly men of ages 20-30
• Bloggers have profiles (consisting of a set of attributes)
  – Identify subsets of attributes mainly responsible for the burst

Burst Description

• Given a spatio-temporal burst, attributed to a set of bloggers, can we automatically describe the event that caused the burst?
• Identify set of keywords that describe the reason for the burst

Grapevine

• Goal
  – Obtain higher level understanding of social media
• Methodology
  – Extract entities and relationships
  – Correlate streaming social media data with Wikipedia articles
• Goals:
  – Advanced query processing
    • Categories, lists, hierarchies
  – Highly targeted information discovery
    • E.g., fully automated newspaper assembly
  – Text Analytics

Describe Bursts using Keyword sets

query: ‘saddam hussein’

users click on a burst to get related keywords...

Describe Bursts using Keyword sets

‘saddam hussein’ ‘saddam hussein iraq’

...and refine search by adding keywords to initial query; if new query is also bursty... get related keywords for new query

Mining Data Streams with Distribution Changes

Yingying Tao, University of Waterloo

Introduction

Data Stream Model

• A data stream $S$ is a set of elements <s, t>, where $s$ is a tuple belonging to the schema of $S$, and $t$ is the timestamp of the element
• Continuous (usually with high arrival rate)
• Unbounded (too large to fit in main memory)
• Dynamic (distribution changes)

Challenges for mining streaming data

• Using limited resources such as memory
• Ability to capture and adapt to the distribution changes
• High accuracy and efficiency requirements

Research issues

• Clustering and classification
• Frequent itemsets and association rules mining
• Time series analysis

Clustering Time-Changing Streams

Problem statement

• Identify groups of related elements with similar behavior
• Decision tree vs. K-means
  – K-means has a large space requirement and (usually) requires random access to the input data
  – Decision tree is more robust and flexible
• Three criteria for evaluating a decision tree
  – Accuracy
  – Efficiency
  – Tree size

Motivations

• One-pass scan
• Real-time constraint
• Continuously changing distributions may greatly impact the efficiency and accuracy of the existing tree
  – Re-building the tree is infeasible
  – Modifying the tree based on the new information is more promising

Motivations – A simple scenario

• An international trading company monitoring transactions with currency exchange rates of US dollar vs. Canadian dollar

Our proposal

• Distribution change detection
  – Detecting changes by calculating timestamp distance
• Decision tree re-alignment
  – Finding a functionally equivalent tree with higher efficiency
• Decision tree pruning
  – Pruning historical nodes by heuristics

Distribution change detection

• Detecting changes by calculating timestamp distance
  – $\tau_i$: the timestamp of the last element that fell into leaf $c_i$
  – $\theta_i$: total number of elements in leaf node $c_i$
  – $\varphi_i$: time density of $c_i$, the average timestamp value of all elements in $c_i$
• Timestamp distance of a new element (timestamp $t_k$) in $c_i$:

  $dist_i = t_k - \varphi_i$

• If $dist_i < \gamma$ (threshold), then a distribution change is detected
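A minimal Python sketch of the per-leaf bookkeeping described above. The class name `LeafStats`, the incremental update of the time density, and the decision to compute the distance before folding the new element into the average are assumptions of this example; the slide only states the quantities and the threshold rule.

```python
class LeafStats:
    """Per-leaf statistics, named after the symbols on the slide."""

    def __init__(self):
        self.tau = None    # timestamp of the last element that fell into the leaf
        self.theta = 0     # total number of elements in the leaf
        self.phi = 0.0     # time density: average timestamp of the leaf's elements

    def observe(self, t_k, gamma):
        """Add an element with timestamp t_k; return True if a change is flagged."""
        change = False
        if self.theta > 0:
            dist = t_k - self.phi      # timestamp distance of the new element
            change = dist < gamma      # threshold rule as stated on the slide
        # Incrementally update the running average of timestamps.
        self.theta += 1
        self.phi += (t_k - self.phi) / self.theta
        self.tau = t_k
        return change


leaf = LeafStats()
flags = [leaf.observe(t, gamma=2.0) for t in [1.0, 2.0, 3.0, 10.0]]
```

Only three numbers are stored per leaf, so the check costs constant time and memory per arriving element.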

Decision tree re-alignment

• Purpose: let the more frequently visited leaf nodes move up higher (closer to the root)
• Functionally equivalent trees: given two decision trees $D_s$ and $D'_s$ with nodes $N_s = \{d_s, c_s\}$ and $N'_s = \{d'_s, c'_s\}$, $D_s$ and $D'_s$ are functionally equivalent iff
  – $d_s = d'_s$ and $c_s = c'_s$
  – If any element falls into $c_i$ following $D_s$, it will also fall into the same leaf node $c'_i$ following $D'_s$

Decision tree re-alignment (cont’d)

• Evaluating the efficiency of a tree: weighting the decision tree
  – Let $H$ be the depth of $D_s$
  – Let $h_i$ be the depth of $c_i$
  – Let $p_i$ be the weight of $c_i$
    • Initialized as 1
    • Increased by 1 each time a re-alignment is triggered on $c_i$
    • Reduced by half each time the window slides through a full length with no new data arriving in $c_i$

Decision tree re-alignment (cont’d)

• Weight of a decision tree $D_s$:

  $W_s = \sum_i p_i \, (H - h_i + 1)$

• The higher the weight, the more efficient the tree
• Goal: to find a functionally equivalent tree of $D_s$ with the highest weight
• Dynamic programming for finding optimal weighted binary search trees
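The weight formula can be checked on a toy example. In the sketch below a tree is reduced to a list of `(p_i, h_i)` pairs, one per leaf, and `H` is taken to be the depth of the deepest leaf; both simplifications, and the two leaf layouts, are assumptions made for illustration. Whether the second layout is functionally equivalent to the first depends on the split attributes, which this sketch does not model.

```python
def tree_weight(leaves):
    """Compute W_s = sum of p_i * (H - h_i + 1) over all leaves.

    leaves: list of (p_i, h_i) pairs, where p_i is the leaf weight and
    h_i the leaf depth. H is taken as the depth of the deepest leaf.
    """
    H = max(h for _, h in leaves)
    return sum(p * (H - h + 1) for p, h in leaves)


# Hypothetical layout: the heavily visited leaf (weight 5) sits at depth 3.
before = tree_weight([(1, 1), (1, 2), (5, 3), (1, 3)])
# Same leaves after moving the heavy leaf up to depth 1.
after = tree_weight([(5, 1), (1, 2), (1, 3), (1, 3)])
```

Moving the frequently visited leaf closer to the root raises the weight, which matches the stated goal of re-alignment.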

Pruning decision tree

• Most popular pruning solution: prune the smallest nodes when the tree is too large
  – Not good!
• My proposal: prune nodes with only historical data
  – Heuristic 1 (eager pruning): prune nodes with time density $\varphi$ greater than a certain threshold
  – Heuristic 2 (lazy pruning): prune nodes with weight $p$ smaller than a certain threshold

Summary of experiments

• The proposed technique can report most of the distribution changes in real time
• Increasing the sliding window size can increase the number of changes detected, but may reduce recall
• Larger threshold $\gamma$ leads to higher recall but lower precision
• The efficiency of the re-aligned tree can be improved by at least a factor of 4
• Can be applied on real data sets with good performance

Mining frequent itemsets

Problem statement

• Transaction-based data stream
  – $I = \{i_1, i_2, \ldots, i_n\}$ is a set of items
  – $T = \{T_1, T_2, \ldots, T_t\}$ is a set of transactions
  – Each transaction $T_j$ accesses a subset of $I$
  – Data stream $S$ =

Problem statement (cont’d)

• Frequent itemset $A \subseteq I$
  – If $A \subseteq T_j$, then $T_j$ supports $A$
  – $sup(A)$: total number of transactions that support $A$
  – If $S(A) = sup(A)/N > \delta$ (threshold), then $A$ is a frequent itemset
  – $S(A)$: support of $A$

Example:

  T_1    T_2    T_3    T_4
  Game   Book   Book   Book
         CD     CD     Game

  sup({Book, CD}) = 2, S({Book, CD}) = 50%

• $A_{T_t}$: set of frequent itemsets at time $t$
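The support definitions can be reproduced in a few lines of Python. The four transactions below are read from the slide's small example; the assignment of items to transactions is an assumption, chosen because it is the reading consistent with sup({Book, CD}) = 2. The function names are hypothetical.

```python
def support_count(itemset, transactions):
    """sup(A): number of transactions that contain every item of A."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def is_frequent(itemset, transactions, delta):
    """A is frequent if S(A) = sup(A) / N exceeds the threshold delta."""
    return support_count(itemset, transactions) / len(transactions) > delta


# Assumed contents of T_1 .. T_4 from the slide's example.
transactions = [
    frozenset({"Game"}),
    frozenset({"Book", "CD"}),
    frozenset({"Book", "CD"}),
    frozenset({"Book", "Game"}),
]
sup = support_count({"Book", "CD"}, transactions)
S = sup / len(transactions)
```

Note that the comparison with the threshold is strict, so with `delta = 0.5` the itemset {Book, CD} would not count as frequent here.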

Motivation

• Finding frequent itemsets is an NP-hard problem
• Impractical to keep track of all itemsets: usually only frequent ones are monitored
• Dynamic streams: infrequent itemsets may become frequent
  – Infrequent itemsets are not monitored
  – Lost statistics on supports
• False positive vs. false negative

Our proposal

• Maintain a list of candidates that have the potential to become frequent
• Two tumbling windows model: maintenance window $W_M$ and prediction window $W_P$
• Updating the frequent itemsets list and the candidate list
• False-negative oriented

Tumbling windows model

• Maintenance window $W_M$ and prediction window $W_P$
  – $W_M$: maintains the current frequent itemsets
  – $W_P$: predicts candidates (virtual window)

[Figure: stream $S$ over time $t$, showing windows $W_M$ and $W_P$ and the following windows $W'_M$ and $W'_P$.]

• Frequency count
  – A counter for each item, frequent itemset, and candidate itemset

Predicting candidates

• Candidate itemset $A$: if threshold $\theta < sup(A) < \delta$, then $A$ is a candidate itemset
• Many of the existing techniques are Apriori-like
  – If an itemset is frequent, then all its immediate supersets are candidates
  – Problems:
    1) Large frequent itemsets take too long to detect
    2) Small frequent itemsets may be missed due to distribution changes

Example (successive states as shown on the slide):

  $A_{T_t}$ = {{a},{b},{a,b}}, {{a},{b}}, {{a},{b},{c}}, {{a},{b},{c},{a,c},{b,c}}, {{a},{b},{c},{a,c},{b,c},{a,b,c}}
  $C$ = {{a,c},{b,c}}, {{a,b,c}}

Predicting candidates – Our approach

• Smallest coverage set
  – Given itemset list $A = \{A_1, A_2, \ldots, A_m\}$, let $\{A_{i_1}, A_{i_2}, \ldots, A_{i_m}\} \subseteq A$
  – If $A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_m} = A_1 \cup A_2 \cup \cdots \cup A_m$, then $\{A_{i_1}, A_{i_2}, \ldots, A_{i_m}\}$ is a coverage set of $A$, denoted as $A^{C}$
  – Among all coverage sets of $A$, the one with the smallest size is denoted as $A^{SC}$
• Example: one smallest coverage set of $A$ = {{a},{b},{c},{d},{a,b},{a,b,c}} is {{d},{a,b,c}}
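The definition of a smallest coverage set can be checked against the slide's example with a brute-force search. The function below simply tries subsets in order of increasing size, so it is exponential in the number of itemsets; it is meant to illustrate the definition, not to suggest how the authors compute $A^{SC}$. The function name is hypothetical.

```python
from itertools import combinations


def smallest_coverage_set(itemsets):
    """Return one smallest subset of `itemsets` whose union equals the union of all."""
    itemsets = [frozenset(s) for s in itemsets]
    universe = frozenset().union(*itemsets)
    # Try subsets of increasing size; the first cover found is a smallest one.
    for size in range(1, len(itemsets) + 1):
        for subset in combinations(itemsets, size):
            if frozenset().union(*subset) == universe:
                return set(subset)
    return set()


A = [{"a"}, {"b"}, {"c"}, {"d"}, {"a", "b"}, {"a", "b", "c"}]
A_sc = smallest_coverage_set(A)
```

For the slide's list no single itemset covers {a, b, c, d}, and {d} together with {a, b, c} is the only pair that does, so the search returns exactly the set given in the example.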

Predicting candidates (cont’d)

• Hybrid approach:
  – If itemset $A'$ is a new frequent itemset, then:
    • its immediate supersets are candidates
    • for any $A''$ in $A^{SC}_{T_t}$, $A' \cup A''$ is a candidate
  – If an itemset $A$ in $C$ is no longer a candidate:
    • remove $A$ from $C$, and add all of $A$'s immediate subsets into $C$

Predicting candidates (cont’d)

• Property of our hybrid approach
  – Let a be the number of candidate itemsets
  – Let b be the time required for all frequent itemsets in C to be detected
  – Let |WM| be the length of the maintenance window
  – Let p be the total number of frequent items in ATt
• For each itemset A of size k that turns from infrequent to frequent, we can prove that:

  a+2b/ |WM| <= 2p-k

Eliminating historical effect

• When windows tumble, update $S(A)$ for all $A$ in the candidate list
• When an itemset $A$ is moved from the frequent list to the candidate list, reset $sup(A) = 0$

Summary of experiments

• Our approach can detect changes in real time with high accuracy
• The mining results on streams with faster and more noticeable changes are better than those on streams with slow changes
• The recall can be improved by decreasing $\theta$, at the cost of more memory
• Smaller windows are more sensitive to changes, but may incur high overhead

Time series analysis

Problem statement

• $A = \{a_1, a_2, \ldots, a_P\}$ and $B = \{b_1, b_2, \ldots, b_Q\}$ are two streams. $D_A$ and $D_B$ are the underlying distributions of $A$ and $B$.
• Given distance function $f$, $d_{AB} = f(D_A, D_B)$ is the distance of $A$ and $B$.
• If $d_{AB} < \delta$ (threshold), then we say A and B match each other, denoted as A B
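The talk leaves the choice of distance function open and lists it as a research issue. Purely to illustrate the matching rule, the sketch below plugs in total variation distance between empirical distributions of discrete values. That choice of `f`, the restriction to discrete values and all function names are assumptions of this example, not the author's proposal.

```python
from collections import Counter


def empirical_distribution(values):
    """Estimate a discrete distribution from the observed stream elements."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def total_variation(dist_a, dist_b):
    """Stand-in for f: half the L1 distance between two discrete distributions."""
    keys = set(dist_a) | set(dist_b)
    return 0.5 * sum(abs(dist_a.get(k, 0.0) - dist_b.get(k, 0.0)) for k in keys)


def streams_match(a, b, delta):
    """A and B match if the distance of their distributions is below delta."""
    d_ab = total_variation(empirical_distribution(a), empirical_distribution(b))
    return d_ab < delta


A = ["x", "x", "y", "z"]
B = ["x", "y", "y", "z"]
d_AB = total_variation(empirical_distribution(A), empirical_distribution(B))
```

Here `d_AB` is 0.25, so the two toy streams match for any threshold above that value, while a stream with disjoint values would be at distance 1.0.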

Motivation

• A large number of dynamic streams show a recurring pattern in their distribution changes
• Mining results for streams with highly similar distributions should be the same
• Discovering recurring patterns can help in better understanding the nature of the stream
• Reusing historical mining results for periodically changing streams can improve mining efficiency

Research issues

• Find proper distance functions
• Adjust sliding window size for different streams
• Discover the distribution of the current stream
• Represent the distribution in a memory-efficient way
  – Density function vs. representative sample set
• Detect distribution changes
• Match patterns and reuse mining results

Future Work

• Improve our current proposed techniques
  – Clustering with decision tree
    • Better decision tree re-alignment technique
    • Determine the relationships of thresholds
  – Frequent itemsets mining
    • Evaluate the relationships among thresholds
    • Complexity analysis for finding k-most frequent itemsets
  – Time series analysis
    • Design a distance function for streams with slow distribution changes

• Develop distribution change-detection techniques independent of mining approaches
• Propose new techniques for other important stream mining applications
• Load-shedding techniques for rapid streams

Qiang Zhu, The University of Michigan, USA

Collaborators:

• Kristine Towne, Adegoke Ojewole: Univ. of Michigan
• Calisto Zuzarte: IBM Toronto Lab
• Wen-Chi Hou: Southern Illinois Univ.

Research was partially supported by IBM CAS and Univ. of Michigan

Outline:

• Motivation
• Star-Streaming Join Queries
• Offline and Online Algorithms
• Experiments
• Conclusion

Motivation

• Modern DB applications => data streams
• Data stream: a real-time, continuous sequence of elements ordered explicitly by time stamp or implicitly by arrival order
• Applications:
  – Sensor network
  – Network traffic
  – Financial data
  – Transaction logs
• Typical operations: filtering, aggregation, counting, sampling, order statistics, mining, joins, etc.
• Characteristics of data streams: continuous, unbounded, real-time, etc. => difficulty with blocking operators, e.g., join

[Figure: streams R and S with arriving tuples r(t), r(t+1) and s(t), s(t+1).]

=> sliding windows & sliding window join

[Figure: streams R and S with tuples r(t-w-1), r(t-w), …, r(t-1), r(t) and s(t-w-1), s(t-w), …, s(t-1), s(t); a window of size w is marked.]

• Window size = w; tuple lifetime = m; count-based or time-based

=> approximate query due to limited resources

• E.g., memory size < 2 × m (e.g., w < m): obtain an approximate query result

=> load shedding techniques: evict tuples from the window before they expire

• Random load shedding [Franklin00, Naughton02, Camey02, et al.]: drop tuples randomly
• Semantic load shedding [Das03&05]: drop tuples based on expected match probability; aims to maximize the size of the approximate join result
• Data stream management systems (DSMS)
  – Dedicated system for special techniques => difficult to support joining streams with conventional relations
• Issue: how to support query processing over streams in an existing DBMS

=> new techniques are required

Characteristics comparison: DSMS vs. RDBMS

DSMS                                     RDBMS
Continuous input arrival                 Static data
Frequent updates                         Frequent reads
Non-blocking operators                   Blocking operators
Continuous, incremental evaluation       Non-incremental evaluation
One pass                                 Multiple passes allowed
Real-time                                Optional support
Adaptivity: exact/approximate answers    Exact answers only

Star-Streaming Join

• Join data streams with one (fact) relation

[Figure: star schema with the fact relation F at the center, joined with streams R1, R2, R3, R4, R5, …]

• Data tuple format in streams: <ts, sch, imp>, where ts is the time stamp, sch the schema and imp the importance
• Star-streaming join definition (n=2):

  $RS(t) = R \bowtie F \bowtie S = \bigcup_{j=0}^{t} \bigcup_{k=j_0}^{j} \bigcup_{f \in F} \left[ \{ r(j) \bowtie f \bowtie s(k) \} \cup \{ s(j) \bowtie f \bowtie r(k) \} \right]$

  where $R$, $S$ are data streams, $F$ is the fact relation, and $j_0 = \max\{0, j-m+1\}$
• Approximation: to maximize the total importance of the output result
• Dynamic relations
  – Allows insertions and deletions
  – Tuple format in fact relation:
  – Active-time interval: [begin, end)
  – Output tuple only with active tuples

• $r(j) \bowtie f \bowtie s(k)$ is an output tuple $\Leftrightarrow$ $f.begin \le j < f.end$ and $f.begin \le k < f.end$ and $r(j).a = f.a$ and $f.b = s(k).b$

Offline Optimal Approximation Algorithm

• Assumption: prior knowledge about data streams
• Join memory state graphs
  – Each stream has a graph: R-graph & S-graph
  – One combined graph for all streams: SA-graph
  – Node: a memory state containing one combination of tuples at a time instant
    • Pre-filter unmatched stream tuples
  – Edge: a decision to move from one state to the next
  – Edge weight: total increased importance
  – Decisions:
    • Admitted
    • Replace a non-expired tuple
    • Not admitted
• Algorithm ideas
  – Dynamic programming based
  – Objective: find a path with maximum total importance
• Example:
  – Stream R: <0,1,5>, <1,0,1>, <2,1,4>, <3,0,8>, <4,2,3>, <5,5,2>
  – Stream S: <0,1,1>, <1,3,5>, <2,3,2>, <3,8,6>, <4,3,4>, <5,5,3>
  – Fact relation F: <0,3,[-1,∞)>, <1,5,[-1,∞)>, <0,8,[-1,∞)>, <4,5,[-1,∞)>, <1,3,[-1,5)>, <5,8,[3,∞)>

[Figure: join memory state graphs (R graph, S graph and SA graph) from Start to Stop over time instants t = 0 to t = 5, with weighted edges. The legend distinguishes the optimal approximation path from the semantic approximation path.]

Output Result

Exact Result Output Tuples Importance = 43 Graph R Graph S Graph SA (r(0), s(1)) => <1, 1, 3, 5> (r(2), s(1)) => <2, 1, 3, 4> (r(1), s(1)) => <1, 0, 3, 1> (r(0), s(2)) => <2, 1, 3, 2 > (r(3), s(1)) => <3, 0, 3, 5 > (r(2), s(2)) => <2, 1, 3, 2 > (r(1), s(2)) => <2, 0, 3, 1> (r(3), s(2)) => <3, 0, 3, 2> (r(3), s(3)) => <3, 0, 8, 6> (r(1), s(3)) => <3, 0, 8, 1> (r(5), s(3)) => <5, 5, 8, 2> (r(1), s(4)) => <4, 0, 3, 1> (r(2), s(4)) => <4, 1, 3, 4> (r(3), s(4)) => <4, 0, 3, 4> (r(2), s(5)) => <5, 1, 5, 3>

Optimal Approximation Output Tuples Importance = 38 Graph R Graph S Graph SA (r(0), s(1)) => <1, 1, 3, 5> (r(2), s(1)) => <2, 1, 3, 4> (r(1), s(1)) => <1, 0, 3, 1> (r(0), s(2)) =><2132>> <2, 1, 3, 2> (r(3), s(1)) =><3035>> <3, 0, 3, 5> (r(2), s(2)) =><2132>> <2, 1, 3, 2> (r(3), s(4)) => <4, 0, 3, 4> (r(3), s(2)) => <3, 0, 3, 2> (r(3), s(3)) => <3, 0, 8, 6> (r(2), s(4)) => <4, 1, 3, 4> (r(2), s(5)) => <5, 1, 5, 3>

Semantic Load Shedding Result
Output Tuples: Importance = 34
Graph R | Graph S | Graph SA

(r(0), s(1)) => <1, 1, 3, 5>
(r(2), s(1)) => <2, 1, 3, 4>
(r(1), s(1)) => <1, 0, 3, 1>
(r(0), s(2)) => <2, 1, 3, 2>
(r(3), s(1)) => <3, 0, 3, 5>
(r(2), s(2)) => <2, 1, 3, 2>
(r(1), s(2)) => <2, 0, 3, 1>
(r(3), s(2)) => <3, 0, 3, 2>
(r(3), s(3)) => <3, 0, 8, 6>
(r(1), s(3)) => <3, 0, 8, 1>
(r(1), s(4)) => <4, 0, 3, 1>
(r(3), s(4)) => <4, 0, 3, 4>

Online Approximation Algorithms

• Assumption: no prior knowledge about data streams
• Shed tuples based on priorities P(r):

  P(r(t)) = f(r(t).imp, mn(r(t)), expr(r(t)))

  – r(t).imp: importance
  – mn(r(t)): matching probability with S stream
  – expr(r(t)): expiration time
• Dynamic pre-filtering
  – P(r(i)) = -1 if max{ f.end | f ∈ F and f is a valid match for r(i) } ≤ t
  – Reset at each time instant t
• Heuristic-based algorithms
  – Static Importance Heuristic (SIMP): P(r(t)) = r(t).imp
    • Ties are broken by dropping the oldest
  – Static Importance Probability Heuristic (SIMPROB): P(r(t)) = r(t).imp * mn(r(t))
    • Ties are broken by dropping the lowest importance, then the fewest matches, then the oldest
  – Dynamic Importance Probability Heuristic (DIMPROB)
    • Dynamically re-calculate P(r(t)) at each time instant
    • Ties are broken in the same way as SIMPROB
  – Dynamic Gain Loss Heuristic (DGL)
    • If r(i) produces an output at time t:
        P(r(i)) = P’(r(i)) + [ r(i).imp * mn(r(i)) * (expr(r(i)) – t) / α ]
    • If r(i) does not produce any output at time t:
        P(r(i)) = P’(r(i)) – β
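The SIMPROB and DGL priority rules above can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `TuplePriority`, its field names, and the choice to start P at the SIMPROB value are hypothetical, and α and β are simply passed in as tuning parameters.

```java
/** Hypothetical sketch of the SIMPROB and DGL priority rules; all names are illustrative. */
final class TuplePriority {
    final double imp;        // r.imp: importance of the tuple
    final double matchProb;  // mn(r): matching probability with the other stream
    final int expiry;        // expr(r): expiration time
    double priority;         // P(r): the tuple with the lowest value is shed first

    TuplePriority(double imp, double matchProb, int expiry) {
        this.imp = imp;
        this.matchProb = matchProb;
        this.expiry = expiry;
        // Assumption: the slide does not give the starting value of P,
        // so this sketch starts from the SIMPROB value.
        this.priority = simprob();
    }

    /** SIMPROB: P(r) = r.imp * mn(r). */
    double simprob() {
        return imp * matchProb;
    }

    /** DGL update at time t: gain if the tuple produced output, otherwise lose beta. */
    void updateDgl(boolean producedOutput, int t, double alpha, double beta) {
        if (producedOutput) {
            priority += imp * matchProb * (expiry - t) / alpha;
        } else {
            priority -= beta;
        }
    }
}
```

For example, a tuple with importance 4, matching probability 0.5 and expiration time 10 starts at priority 2. If it produces output at t = 2 with α = 4, its priority rises by 4 * 0.5 * (10 - 2) / 4 = 4.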

    • Ties are broken in the same way as SIMPROB

Experiments

• Setup: C++, Zipf, uniform
• Offline evaluation

[Figure: Result Set Importance vs. Join Memory Size (M) for N=5000, w=10. Series: RAND, SJA, OSSJ, EXACT.]

• Performance effect of pre-filtering

[Figure: Relation Size (D) vs. Running Time (sec) for M=10, w=10. Series: Pre-filtering, No Pre-filtering.]

• Online evaluation

[Figure: Result Set Importance vs. Join Memory Size (M) for N=2000, w=50. Series: STATIMP, STATIMPPROB, DYNIMPPROB, DYNGLS, RAND, EXACT.]

Conclusions

• There is an increasing demand for supporting stream data in a database system
• Joining data streams with relations raises new challenges
• Our strategies for incorporating importance semantics and pre-filtering unmatched tuples are effective
• Our offline optimal algorithm and online heuristics are promising
• Further research is needed

For more details:

• K. Towne, Q. Zhu, C. Zuzarte, W.-C. Hou, “Window Query Processing for Joining Data Streams with Relations”, Proc. of CASCON’07

• A. Ojewole, Q. Zhu, W.-C. Hou, “Window Join Approximation over Data Streams with Importance Semantics”, Proc. of ACM CIKM ’06

Thank you!

Questions? Disclaimer

The IBM Center for Advanced Studies (CAS) regularly publishes technical documents that are aimed at facilitating an exchange of information. These reports are written by individuals or groups of authors and represent their opinions and do not necessarily reflect those of IBM. In the event of questions or concerns regarding individual reports or presentations, email addresses have been provided for most authors. Alternatively, please feel free to contact CAS at [email protected]. CAS is dedicated to providing a forum for IBM employees, research affiliates, and university students to publish results of their work, their views on technical issues, book reviews, and workshop reports. Trademarks

• Company, product, or service names identified in the text may be trademarks or service marks of IBM or other companies. Information on the trademarks of International Business Machines Corporation in the United States, other countries, or both is located at http://www.ibm.com/legal/copytrade.shtml.

• Other company, product or service names may be trademarks or service marks of others. WORKSHOP Best Practices for Developing High Performance Java Applications: A Java Virtual Machine Perspective

Techniques for identifying common performance problems in Java

Nikola Grcevski, IBM Canada Lab [email protected] What’s in this presentation?

• General discussion on performance analysis

• Understanding performance issues of Java code running on IBM J9 Virtual Machine

• Tools and Demos

Algorithms are divine; knowing why code underperforms requires performance analysis.

What is this presentation not about?

• We will discuss system-level performance analysis, not Java source-level analysis techniques
• We will not dig deep into CPU profiling capabilities
• We will not talk about tuning I/O

Why is performance analysis important?

• Faster is more
  – You may need to tune for constraints: limited memory, limited I/O
• Faster is less
  – Watts

• Why should developers care? – The free ride is over for some time now

• When does it matter? Setting up the environment for analysis

• Tune your application environment – the best you can

• Maximize your CPU utilization – If you can’t - limit your CPU resources

• Make sure your results are consistent – Measure throughput or time Performance analysis

• What constitutes performance analysis?

• The three dimensions of system performance – The effect – The work – The time Finding time

• Finding where the application spends time is the key

• Profiling tools help us find where the time is

• JVMTI based profilers may not tell the story right Java profilers and performance analysis

• Java JVMTI based profilers are very useful for understanding source level issues – Java code hotspots, object allocation, call-graphs

• But… the Java code runs on a virtual machine which runs on an OS – With typical JVMTI profilers you can’t see beyond the source code level System level performance analysis

• What makes system level profilers special?

• AMD CodeAnalyst, Intel VTune, IBM Performance Inspector, GPL oprofile (Linux only), IBM tprof on AIX

• Do we really need to use a tool like this? Types of profiling data

• OS Timer based profiles – Helps you find where the time is

• CPU Performance Counter profiles – Advanced performance analysis metrics – CPU Cycles is equivalent to time – Counter data may mean nothing without the time

• OS Event Data

What does the J9 VM look like?

[Figure: J9 VM architecture. Java-based application code calls the class library, which uses one of many possible configurations (JavaSE, Foundation, CDC, MIDP, CLDC), and reaches C libraries and native applications through the Java Native Interface. The virtual machine is built from pluggable components that load dynamically: class loader, interpreter, exception handler, JIT, garbage collector, JCL natives, Java NI, profiler, real-time profiler, and debugger. Beneath them, the thread model and the port library (file I/O, sockets, memory allocation, etc.) make OS-specific calls to the operating system.]

The performance components we’ll be discussing

• JIT compiled code
  – Java code
• JIT compiler module
• The Java Garbage Collector
• The threading library
• The OS

Identifying common problems with VM/System interaction

• Compiled Java code
  – NOT the Java classes, but the dynamically generated machine code for the Java classes
  – Should take 80-90% of your profile if your application is well behaved
    • Identified as Other, Dynamic code, or Unknown, or properly identified if JVMTI is enabled for the system profiler

If you have >80% in the JIT compiled code, JVMTI analysis makes sense Identifying common problems with VM/System interaction

• More than 10% of the time is spent in the JIT compiler module (j9jit23.dll, j9jit24.dll…) – Likely startup related scenario

• If you are benchmarking make sure your benchmark runs longer

• If it’s your workload try running with -Xquickstart Identifying common problems with VM/System interaction

Excessive time in the JIT module Identifying common problems with VM/System interaction

• More than 5% of the time is spent in the GC module (j9gc23.dll, j9gc24.dll…)
  – Heavy object allocation: frequent garbage collections
• Tune the heap options and GC policies
  – Try -Xgcpolicy:gencon
  – Set the heap bounds with -Xms and -Xmx
• Generate object allocation profiles and optimize object allocation

Identifying common problems with VM/System interaction

Excessive time in the GC module Identifying common problems with VM/System interaction

• More than 1% of the time is spent in the THREAD module (j9thr23.dll, j9thr24.dll…) and lots in the kernel modules – Heavy lock contention

• How does JVM locking work?

• Use IBM Java Health Center https://www14.software.ibm.com/iwm/web/cc/earlyprograms/ibm/ibmmdtjhc/ Identifying common problems with VM/System interaction

Excessive time in thread module and system DLLs Identifying common problems with VM/System interaction Locking report by IBM J9 VM Health Center Identifying common problems with VM/System interaction

• More than 5% of the time is spent in the OS kernel (ntdll.dll, ntoskrnl.exe, vmlinux)
  – Lots of I/O activity (sockets)
  – Native code memory allocation
  – Locking and mutexes
• Tools for analyzing OS performance problems are somewhat limited
  – Look for what your application could be doing

Identifying common problems with VM/System interaction

Excessive time in kernel modules Identifying common problems with VM/System interaction

Use OS level performance monitoring tools

Windows “perfmon” How long to profile for?

• System timer based profiling
  – < 10 mins
• Java Lock Analyzer
  – < 2 mins
• JVMTI profilers
  – < 2 mins

• Do you need more than one profile? Summary

• Analyzing Java performance problems goes beyond JVMTI based profilers

• Finding where the time is spent helps you understand the performance issues

• There are many tools (free and for fee) that will help you get the job done right

Trademarks

• Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. • Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. • Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. • UNIX is a registered trademark of The Open Group in the United States and other countries. • Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

• Other company, product or service names may be trademarks or service marks of others. WORKSHOP Best Practices for Developing High Performance Java Applications: A Java Virtual Machine Perspective

Understanding and Tuning the Garbage Collector

Ryan Sciampacone, IBM Canada Lab What’s in this presentation?

• Garbage Collection – IBM JDK 5
• Tooling
• Tuning the IBM JDK Garbage Collector
• Problems and Solutions
• Tips and Best Practices

Garbage Collection – IBM JDK 5

What is Garbage Collection?

Conceptually, garbage collection (GC) creates the illusion of infinite free space
– Java has a create (“new”) but no destroy
– Applications create objects as needed on the heap

In reality, GC reclaims unused memory back to the free lists
– Finds objects that are no longer used
– Makes their storage available for allocation

What is Garbage Collection?

All garbage collectors follow the same formula
• Find all live objects (Mark)
  – Trace the object graph from a set of known starting points (e.g., thread stacks), known as “the root set”
• Recycle objects not found onto the free list (Sweep)
  – Objects not visible in the live set are “dead”
• Optional: Move objects to reduce fragmentation (Compact)
  – Free bits of memory here and there create holes
  – Cannot allocate an object even if total free space is sufficient
  – Converts many small holes into fewer large ones

What is Garbage Collection?

IBM Java GC has a number of selectable policies under which it will recycle objects

Why have many policies? Why not just “the best”?
– Cannot always dynamically determine what tradeoffs the user/application are willing to make
  • Pause time vs. Throughput
  • Footprint vs. Frequency

GC policies in IBM SDK 5.0
How do the policies compare?

-Xgcpolicy:optthruput (and -Xgcpolicy:subpool)

[Figure: timeline of threads 1 to n, showing when each runs Java application code and when it runs GC.]

Default policy. The simplest approach to GC and the best all-around performer. Targets workloads that require good throughput or whose level of heap activity varies dramatically over the course of a run.

GC policies in IBM SDK 5.0
How do the policies compare?

-Xgcpolicy:optavgpause

[Figure: timeline of threads 1 to n, showing Java application code, GC, and concurrent tracing.]

Similar to optthruput, but aims at reducing the large pauses that are normally incurred. Has an associated performance cost (~5%). Aimed at workloads that do vary their heap activity but would prefer lower pauses over raw throughput.

GC policies in IBM SDK 5.0
How do the policies compare?

-Xgcpolicy:gencon

[Figure: timeline of threads 1 to n, showing Java application code, global GC, scavenge GC, and concurrent tracing.]

A different strategy, more closely related to optavgpause, that aims to reduce pause times by focusing on newly created objects during collection activity. An alternative to optavgpause, it targets workloads where the Generational Garbage Collection Hypothesis (“objects die young”) applies. A number of workloads fall under this category.

Motivation
Just why does this matter anyway?

• We’ll focus primarily on the Gencon policy
  – Generational local collection with a partially concurrent global collector
• Why a generational + concurrent solution?
  – For most workloads objects die young
    • Generational allows a better return on investment (less effort, better reward)
  – Performance can be close to or even better than the standard configuration
    • Focusing efforts on the most effective parts of the heap
  – Reduce large pause times

Garbage Collection
How the IBM J9 Generational Garbage Collector Works

JVM Heap

• Young Generation
  – IBM J9: -Xmn (-Xmns/-Xmnx)
  – Sun: -XX:NewSize=nn, -XX:MaxNewSize=nn, -Xmn
• Old Generation
  – IBM J9: -Xmo (-Xmos/-Xmox)
  – Sun: -XX:NewRatio=n
• Permanent Space (Sun JVM only)
  – -XX:MaxPermSize=nn

• Minor Collection: takes place only in the young generation, normally done through direct copying, which is very efficient
• Major Collection: takes place in the new and old generation and uses the normal mark/sweep (+compact) algorithm

Nursery/Young Generation

[Figure: the nursery divided into an Allocate Space and a Survivor Space, shown twice with the two roles swapped.]

• Nursery is split into two spaces (semi-spaces)
  – Only one contains live objects and is available for allocation
  – Minor collections (Scavenges) move objects between spaces
  – Role of spaces is reversed
• Movement results in implicit compaction

Nursery/Young Generation
Things to remember when tuning

• Objects move in minor collections
  – Good thing because it
    • Reduces fragmentation in the heap
    • Improves allocation speed
    • Relocates objects for better co-locality
  – Bad thing because
    • Cache thrashing reduces hardware performance
    • Moving objects is time consuming
• Scavenge collections scale with processors
  – Stop-the-World (STW) collection that runs in parallel

Tooling

GC and Memory Visualizer

• Find the tuning/diagnostic information you need more quickly

• Download – URL: http://www.ibm.com/software/support/isa

• Add the JDK documentation – Select the “Updater” tab – Add the “IBM Developer Kit for Java 5.0” documentation set

• Add additional tools – Extensible Verbose Toolkit (EVTK is the old name for the tool) Using GC and Memory Visualizer

• GCMV consumes verbose GC output
  – Enabled with the -verbose:gc option
  – Direct output to a file with -Xverbosegclog:
    • Log rotation available with -Xverbosegclog:,,
• Verbose GC is XML output describing GC activity
  – Great for machines, terrible for humans
  – Tremendous amount of data
    • Bytes freed, classes unloaded, time taken, allocation size causing failure, etc.
  – GCMV summarizes the data visually

GC and Memory Visualizer

GC and Memory Visualizer
Views on verbose GC

GC and Memory Visualizer
Flexible and Powerful

Compare logs in same plot Zoom in on areas of interest

Tuning recommendations Tuning the IBM JDK Garbage Collector Heap Sizing Basics

• -Xmx and -Xms are the basic building blocks for tuning the GC
  – E.g., -Xmx2g -Xms256m
  – -Xmx: maximum heap that can be made available
  – -Xms: starting and minimum heap available

• #1 rule: Eliminate variables, build on the result
  – Start with optthruput (default), the simplest collection policy
  – Start with -Xmx == -Xms
    • E.g., -Xmx2g -Xms2g
    • Reduces variability of heap resizing during runs
  – Use the resulting verbose GC with GCMV to determine heap consumed post-GC
    • General rule is keeping 30% free post-GC
    • Performance requirements and resource constraints will change this
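As a rough in-process complement to the 30% rule above, the free fraction of the heap can be computed from the standard `java.lang.management` API. This is only a sketch: the class and method names are hypothetical, `System.gc()` is merely a request, so the printed figure is not a true post-GC measurement, and it does not replace the verbose GC plus GCMV workflow the slide recommends.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper for checking heap headroom against a target such as 30%. */
final class HeapHeadroom {
    /** Fraction of the heap that is free, given bytes used and heap capacity in bytes. */
    static double freeFraction(long usedBytes, long capacityBytes) {
        if (capacityBytes <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        return 1.0 - (double) usedBytes / capacityBytes;
    }

    /** True if the free fraction meets the target (0.30 for the rule of thumb). */
    static boolean meetsTarget(long usedBytes, long capacityBytes, double target) {
        return freeFraction(usedBytes, capacityBytes) >= target;
    }

    public static void main(String[] args) {
        System.gc(); // only a hint; the JVM may ignore it
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() can be -1 when no maximum is defined; fall back to the committed size.
        long capacity = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        System.out.printf("free after GC request: %.1f%%%n",
                100 * freeFraction(heap.getUsed(), capacity));
    }
}
```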

• Using -Xmx alone is also a reasonable approach
  – Let the GC decide where the best heap levels are for your application
  – Variability in workload can change performance characteristics in strange ways
    • Resizing at the wrong times can affect your performance!

Bigger isn’t always Better

[Figure: heap laid out across page boundaries.]

• A smaller heap might improve performance
  – More frequent collections but better locality
  – Tradeoff: which do you prefer?
    • And to what degree?

Heap Resizing Basics

• You’ve decided that resizing is for you! – But how can you control the triggers and degree of change?

• Controlled by heap free percentages
  – -Xminf: minimum heap free threshold for expansion
  – -Xmaxf: maximum heap free threshold for contraction
  – E.g., -Xminf0.30 -Xmaxf0.60
• Can also be controlled by time spent in GC
  – -Xmint: minimum % time spent in GC threshold for contraction
  – -Xmaxt: maximum % time spent in GC threshold for expansion
  – E.g., -Xmint0.05 -Xmaxt0.13

Heap Resizing Basics

• In all cases, resizings can have constraints
  – Use -Xmine and -Xmaxe to bound resize calculations
  – E.g., -Xmine1m -Xmaxe128m
    • NOTE: Use of 0 implies no limit
    • Defaults are minimum 1m and maximum unlimited
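The thresholds above can be read as a simple decision rule. The sketch below is a simplified model of what these slides describe, not the J9 resizing algorithm: the class `ResizeModel`, its methods, and the way a proposed expansion is clamped are all assumptions made for illustration.

```java
/** Hypothetical, simplified model of the free-heap resizing thresholds. */
final class ResizeModel {
    enum Action { EXPAND, CONTRACT, NONE }

    /** Expand when free space drops below minf; contract when it exceeds maxf. */
    static Action decide(double freeFraction, double minf, double maxf) {
        if (freeFraction < minf) {
            return Action.EXPAND;
        }
        if (freeFraction > maxf) {
            return Action.CONTRACT;
        }
        return Action.NONE;
    }

    /** Bound a proposed expansion by mine and maxe; a maxe of 0 means no upper limit. */
    static long clampExpansion(long proposedBytes, long mineBytes, long maxeBytes) {
        long amount = Math.max(proposedBytes, mineBytes);
        return maxeBytes == 0 ? amount : Math.min(amount, maxeBytes);
    }
}
```

With -Xminf0.30 and -Xmaxf0.60, a heap that is 20% free after collection would be a candidate for expansion, and one that is 70% free a candidate for contraction.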

• Heap resizings target the min/max free heap ratios

Runtime Performance Tuning
Process for tuning heap settings

[Figure: flowchart. Start → set your performance requirements → give your best estimate for mx and ms → stress test your application → analyze GC behavior (profile objects if needed) → is the GC profile good? If yes, done. If no, adjust mx and/or ms and possibly switch GC policy, then stress test again.]

Tips
• Run your application, analyze heap usage and determine the steady state. Set your heap size to the steady state.
• Make sure your heap never pages. Monitor your paging activities.
• A rule of thumb is to keep 30% of your heap free most of the time.

Young Generation Basics

• -Xmn controls the nursery size
  – Locked in value
    • Use -Xmns and -Xmnx to specify min/max (similar to -Xms and -Xmx)
    • -Xmn is a shortcut to specifying -Xmns == -Xmnx
  – Again: eliminate the variables!

• Default nursery size is 25% of the total heap (-Xmx)
  – Good starting point for most applications
  – Recommend keeping nursery locked in during deployment
    • Resize only if you are convinced you need it
    • Smaller area means more sensitive to slight changes in behavior

• Resizing is available if you insist…
  – Targets nursery GC time as 1-5% of execution time
  – Expands or contracts as appropriate to meet targets
  – Many factors to consider here
    • Occupancy of the new space can change from moment to moment in an application
  – Leaving resizing to chance could leave you out in the cold
    • If the old generation expands first, it could leave the nursery short on memory
    • Or vice versa

Young Generation Basics

• There is such a thing as too much
  – Too large a new space can cause locality issues
  – Worse is if there isn’t enough tenure space
    • Example: make the nursery 80% of the heap
    • What if the long-lived data set takes up 40% of the heap?
  – Keeps long-lived objects in the nursery
  – Inflates GC times: needless copying

Large Object Area (LOA) Basics

• What is the LOA?
  – Area in the heap reserved for large object allocations
    • Objects greater than 64k in size
  – Intention: reduce fragmentation, GC frequency, compaction frequency
  – Defaults to 5% of heap or old generation
    • Will shrink to 0% if not used

• Recommendation: Leave as is
  – Most applications don’t over-allocate large objects

• But sometimes your application just isn’t normal
  – -Xloaminimum: minimum percentage of heap used for LOA
  – -Xloamaximum: maximum percentage of heap used for LOA
  – -Xloainitial: starting percentage of heap used for LOA
  – E.g., -Xloaminimum0.10 -Xloamaximum0.20

Problems and Solutions

Problem: Long Pause Times

• GC pause times are longer than you’d like?
  – Try a different GC policy
    • Optavgpause can help in heaps that have lots of static data
    • Gencon can help if your workload has many objects that die young
  – Compaction times the issue?
    • Increase in LOA?
      – Are there lots of large objects being allocated which need compacts to satisfy?
        » Verbose logs will tell you reasons for compaction
    • Gencon GC policy can help
      – Implicitly compacting collector
    • Increase in heap size
    • -Xnocompactgc
      – Generally not recommended
  – Your heap may be too large; decrease its size
    • Paging: simple matter of watching the drive light blink
    • Locality issues
      – Not only is your application susceptible to this, the GC is as well

Problem: Unexpected Out of Memory

• An OutOfMemory (OOM) exception is thrown but there’s plenty of space on the heap
  – Native memory / resources are exhausted
  – Take a look at the resulting javacore.txt file
    • Contains diagnostic information for your application
      – Number of threads, classes loaded, etc.
    • Are there too many threads in your system?
      – Have you exhausted the platform limits?
    • Is there an excessive number of classes?
      – Classes consume native memory outside the heap
      – Holding links to classes can be both a heap leak and a native memory leak

Problem: Caches causing OOM

• My application keeps many caches that eventually cause OOM exceptions to be thrown
  – Caches consume memory
    • Difficult to manage manually
    • Become too large relative to the rest of your dataset
  – Make use of Weak References (java.lang.ref.*)
    • Object references sensitive to visibility
      – Object is collected if it is not otherwise referenced directly within the heap
  – Make use of Soft References (java.lang.ref.*)
    • Object references sensitive to visibility and collection frequency
      – Stronger than Weak references
      – Sensitive to heap pressure: purged by GC under tight memory conditions

Problem: Gencon Performance Varies

• Gencon performs well, but periodically the GC times are much too high
  – Nursery sized incorrectly for working set
    • Perhaps the young generation is too small?
    • Does a scavenge actually recover memory?
      – Free bytes in nursery is 0 or close to 0
  – Young generation is too large
    • Rather, old generation is too small
    • Can’t keep all long-lived data in old generation (full)
      – 0 or very low free bytes in old generation

Problem: Application Performance

• The GC times are fine, but the application appears to run slowly
  – If allocation rates are extremely high, there can be pressure on the allocation lock mechanism
    • Use -Xcompactgc to force compaction and reduce fragmentation
    • Use Gencon policy to force incremental compaction
    • Use Subpool policy to improve scalability
    • Reduce size of object allocations
      – Larger objects are more time consuming to allocate

Problem: Migrating from 32-bit to 64-bit

• Java object references have doubled in size
  – References are pointers; the transition from 32-bit to 64-bit doubles their cost
• Performance penalties: additional cache and TLB misses, and paging
  – Especially severe on cache-constrained hardware
• Increase in memory footprint
  – Bad for applications that used to fit in a 32-bit address space
  – Heap settings of a particular size don’t work with larger pointer sizes on 64-bit
    • Settings for the JVM are usually locked in
      – Bad for customers who do not like to retune heap settings

• Solution: -Xcompressedrefs
  – Available in Java6 SR1
• Reduces the cost of objects through pointer compression
  – Transparent to the Java developer
  – VM handles the details
  – Can achieve speeds and footprint comparable to 32-bit

Tips and Best Practices

Allocating Objects and You

• Allocating large objects frequently can impact performance
  – Reduce frequency of >64k object allocations (LOA)
  – Less than 512 bytes is best
    • Allocation mechanism is happiest with objects below this threshold
• Don’t alternate allocation of long/short lived objects
  – Causes fragmentation, which requires compactions
  – Grouping allocation sites with similar lifetimes together can help improve GC performance
• Object pooling can work
  – Balance between capacity, use and percentage of heap consumed

Finalization: Your Evil Friend

• Finalization is great: forget about resources!
  – GC cleans them up for you (or makes appropriate call)
• Finalization runs concurrently with your application
  – Sudden slowdowns in application under heavy finalization
    • Finalization thread scheduled
• “Iceberg” objects
  – Small consumption on heap, large native memory
    • E.g., picture
  – Can cause strange OOM conditions in applications
• Gencon: changes the “order” of finalization
  – Young objects finalized before old
    • Detect death of young objects sooner
    • Hidden dependencies in applications that require finalization to have a particular order can cause issues

Questions?

Writing JIT-friendly Java code

Patrick Doyle IBM Toronto Laboratory [email protected]

Outline

1. Big picture
2. Technologies
   – Compilation triggers
   – Inlining
   – Devirtualization
   – Escape analysis
3. Coding guidelines
   – Objects
   – Methods
   – Loops
   – Synchronization

1. The big picture

The big picture

• Compilers thrive on proving properties of code
  – uncertainty makes code slower
• Problem: Java has some inherent uncertainty
  – virtual calls
  – many small methods
  – dynamic class loading
  – exception checks: NULL checks, bound checks
• JIT can speculate to overcome these hurdles

Big picture: Reducing uncertainty

• Locals are faster than globals
  – Fields and statics slow; parameters and locals fast
• Constants are faster than variables
  – final is your friend
• privates are faster than publics
  – protected and “package private” just as slow as public
• Small methods (≤100 bytecodes) are good
• Simple is faster than complex
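A small sketch of the "locals over fields" and "constants over variables" guidelines follows. The class and method names are hypothetical, and whether the two loops differ measurably depends on the JIT and the workload, so measure before relying on it.

```java
/** Hypothetical illustration of copying a field into a local before a loop. */
final class FieldVsLocal {
    private static final int SCALE = 3; // constant: final lets the compiler treat it as known
    private int[] data;                 // non-final field: harder for the JIT to reason about

    FieldVsLocal(int[] data) {
        this.data = data;
    }

    /** Reads the field on every iteration. */
    long sumViaField() {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += (long) data[i] * SCALE;
        }
        return sum;
    }

    /** Copies the field into a final local once, so the loop only touches the local. */
    long sumViaLocal() {
        final int[] local = data;
        long sum = 0;
        for (int i = 0; i < local.length; i++) {
            sum += (long) local[i] * SCALE;
        }
        return sum;
    }
}
```

Both methods return the same result; the second simply gives the compiler less uncertainty to work around.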

The big picture: When and how to tune performance

Tuning Java code is an art
• Don’t optimize without measuring first
  – Do use profiling tools to identify the important code
• Don’t use micro-benchmarks
  – Do measure real applications running on real data
• Don’t hard-code assumptions of what is hot & cold
  – Do write small methods and let the JIT do its job

2. Technologies

How the JIT copes with Java’s challenges

1. Challenge: JIT compilation happens at runtime
   – Solution: JIT focuses optimization effort on important code
2. Challenge: lots of small methods
   – Solution: inlining aggregates small methods into larger units
3. Challenge: lots of virtual method calls
   – Solution: devirtualization converts virtual calls to direct calls
4. Challenge: objects are heap-allocated
   – Solution: escape analysis uses stack allocation if possible

1. Compilation triggers

• Sampling thread periodically checks which method is running
• TR initially compiles a method if:
  – it is called a lot (several hundred times), or
  – sampling indicates it is consuming a lot of time
• Initially, a cheap low-opt compile mode (“warm”)
• New code only executes when method is next called
  – Dynamic loop transfer allows transfer of control to a compiled version of the method as it is being interpreted

1. Compilation triggers: recompilation

• TR recompiles a method if sampling indicates it’s consuming a lot of time:
  – ~1%: “hot” – high-opt
  – ~10%: “scorching” – highest-opt, detailed profiling
• To help scorching, TR first makes a temporary instrumented compilation

2. Inlining

• Replaces a call with the target method bytecodes
  – Increases scope of compilation beyond a single method
• Inlining is a powerful way to reduce uncertainty
• Drawback: can make the code bigger
  – Longer compile time
  – Larger i-cache footprint
• We can’t just inline everything

2. TR inlining heuristics

• Small callee more likely to be inlined
  – ≤ ~100 bytecodes (~10 lines of Java code)
• More likely to inline if it will probably pay off
  – Calls in loops
• More likely to inline if callee can be specialized
  – Constant arguments
  – Arguments with more specific type than formal argument
• TR currently has no “partial inlining”

3. Devirtualization

• Virtual methods pose a problem for the inliner
  – Can’t be sure which target method to inline
• To inline, JIT must devirtualize:
  – Pick a single target
  – Transform the virtual call into a direct call
• Must have a backup path in case we picked the wrong target!

3. Devirtualization pseudocode

    if ( guard )
        C.foo(r,x,y);    ← direct call
    else
        r.foo(x,y);      ← original virtual call

• The guard makes sure the guess was right
• The else branch does the original virtual call in case the devirtualized method is the wrong one
• C.foo can then be inlined
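The pseudocode above can be imitated at the Java source level to show the shape of a guarded call. This is only an analogy: the JIT performs the transformation on compiled code, not on source, and the classes `C` and `D`, the helper `cFooDirect`, and the use of a class test as the guard are assumptions made for this sketch.

```java
class C {
    int foo(int x, int y) {
        return x + y;
    }
}

class D extends C {
    @Override
    int foo(int x, int y) {
        return x * y;
    }
}

/** Hypothetical source-level picture of what a guarded devirtualized call does. */
final class GuardedCall {
    /** Stands in for the direct, non-virtual call to C.foo; the body mirrors C.foo. */
    static int cFooDirect(C r, int x, int y) {
        return x + y;
    }

    /** Conceptual replacement for the virtual call r.foo(x, y). */
    static int call(C r, int x, int y) {
        if (r.getClass() == C.class) {   // guard: the guessed target was right
            return cFooDirect(r, x, y);  // direct call, now a candidate for inlining
        } else {
            return r.foo(x, y);          // backup path: the original virtual call
        }
    }
}
```

Passing a `D` instance takes the backup path, so the result is still correct even when the guess is wrong.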

14 3. Kinds of devirtualization

• Class hierarchy guard – Used when there’s only one possible target method – No overhead – same cost as a call to a static method • Class-test guard – Used when one target class predominates – Same cost as an if statement plus a call to a static method • If neither of these applies, devirtualization fails – Can’t be inlined – Virtual calls slower than static ones – Interface calls slowest of all
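The class-test guard can be illustrated at source level. This is only a sketch: a real JIT applies the transformation to compiled code, and the classes `Shape`, `Circle`, `Square`, and the helper `circleAreaDirect` are hypothetical names introduced for the example.

```java
// Source-level illustration of a class-test guard. All names are hypothetical.
abstract class Shape {
    abstract double area();
}

final class Circle extends Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    @Override double area() { return Math.PI * radius * radius; }
}

final class Square extends Shape {
    final double side;
    Square(double side) { this.side = side; }
    @Override double area() { return side * side; }
}

final class GuardedCallSketch {
    // Body of Circle.area() as a static method: stands in for the direct call target.
    static double circleAreaDirect(Circle c) {
        return Math.PI * c.radius * c.radius;
    }

    static double area(Shape r) {
        if (r.getClass() == Circle.class) {       // guard: was the guess right?
            return circleAreaDirect((Circle) r);  // direct call, a candidate for inlining
        } else {
            return r.area();                      // backup path: original virtual call
        }
    }
}
```

Both paths return the same value; the guard only decides whether the cheaper direct call can be used.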

4. Escape analysis

• Finds objects with limited scope – i.e. never “escape” the method in which they were allocated • Sets aside stack space instead of using heap allocation – Better locality • the stack is usually already in the cache • avoids fragmenting the heap with short-lived objects – Less load on the garbage collector • What makes an object “escape”? – Storing it in a field or static – Returning it from a method unlikely to be inlined – Passing it to a method unlikely to be inlined

Conclusion

• Introduced four JIT technologies 1. Compilation triggers 2. Inlining 3. Devirtualization 4. Escape analysis • JIT can optimize Java aggressively despite its hurdles

3. Java coding guidelines

Focus areas

• Objects • Methods • Loops • Synchronization

Objects: allocation

• Object allocation is not free! – Objects still need to be initialized • Minimize object allocation – Note that some code implicitly allocates objects • Autoboxing • Immutables such as BigInteger and BigDecimal • String concatenation with ‘+’ creates a StringBuilder – Avoid creating objects in a loop • e.g. Create a single StringBuilder outside the loop
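A minimal sketch of the last point, assuming a hypothetical helper class `JoinSketch`: both methods produce the same string, but the second creates its `StringBuilder` once, outside the loop.

```java
final class JoinSketch {
    // Each iteration builds a new String (and, with older compilers,
    // a temporary StringBuilder as well).
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String p : parts) {
            result = result + p + ",";
        }
        return result;
    }

    // One StringBuilder, created outside the loop and reused.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p).append(',');
        }
        return sb.toString();
    }
}
```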

Objects: garbage collection

• More objects means more frequent GC • Choose the right GC policy – optthruput • Low GC overhead • Leaves objects where they were first allocated – gencon • Better with many short-lived objects • Rearranges objects to improve locality • Release references to dead objects to avoid memory leaks – References from statics or long-lasting objects can prevent an object from being collected until the end of the application – Consider nulling out such references when you know you don’t need the target object anymore

Objects: cache locality

• If feasible, move cold fields into a separate class • Compressed references on 64-bit platforms – Eliminates object bloat from 64-bit pointers – Compression overhead depends on max heap size • < 4 GB, very low; < 32 GB, slightly more • Consider arrays instead of linked data structures – Fewer object headers – Array data is contiguous • Allocate objects on the thread that will use them

Objects: stack allocation

• If you want stack allocation, try to avoid: – storing the object to a field or static – returning the object from a nontrivial method – passing the object to a nontrivial method – allocating in loops • Remember: escape analysis is not always performed, so don’t rely on it – Interpreted code does no stack allocation – JIT may not run escape analysis at low opt levels
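The difference between an object that stays local and one that escapes can be sketched as follows. The class and method names are hypothetical, and whether the first allocation is actually placed on the stack depends on the JIT and the optimization level, as noted above.

```java
final class EscapeSketch {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static Point lastPoint; // a static: anything stored here escapes

    // The Point is not stored, not returned, and passed only to its trivial
    // constructor, so it is a candidate for stack allocation.
    static int lengthSquaredLocal(int x, int y) {
        Point p = new Point(x, y);
        return p.x * p.x + p.y * p.y;
    }

    // Same computation, but the Point is stored to a static and therefore escapes.
    static int lengthSquaredEscaping(int x, int y) {
        Point p = new Point(x, y);
        lastPoint = p;
        return p.x * p.x + p.y * p.y;
    }
}
```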

Objects: immutable fields

• Immutable fields can be optimized aggressively • Ways to make fields immutable: – Most important: use final wherever appropriate – At high opt, TR will try to detect other immutables • declare fields private and write to them only in constructors • avoid adding unnecessary “setter” methods • final static is a powerful combination – Aside from constant strings, this is the only time the JIT can optimize based on contents of an object on the heap
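A small sketch of these conventions, using a hypothetical `Rate` class: one `final static` constant, one `final` instance field written only in the constructor, and no setter.

```java
final class Rate {
    // final static: a constant whose value can never change after class initialization.
    private static final double BASIS_POINTS_PER_UNIT = 10_000.0;

    // final instance field: assigned exactly once, in the constructor.
    private final int basisPoints;

    Rate(int basisPoints) {
        this.basisPoints = basisPoints;
    }

    // Read-only accessor; there is deliberately no setter.
    double asFraction() {
        return basisPoints / BASIS_POINTS_PER_UNIT;
    }
}
```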

Methods

• Be aware: methods that are invoked only a few times might not be optimized aggressively • To allow the JIT to perform more aggressive optimization: – Avoid writing big and complex methods • optimization makes methods bigger, and might exceed limits – Break up the task into several small methods – Use final or private methods • Keep data in locals, especially across calls – Accesses of fields and statics are harder to optimize

Methods: polymorphism

• Virtual method calls: – Declare parameters of a final or “leaf” type • i.e. not Object or some abstract/interface class • Interface method calls: – JIT can help if there are only a few frequently used implementations • JIT produces an if-then-else chain for faster dispatch – When highly polymorphic, abstract classes are faster than interfaces

Methods: exceptions and reflection

• Use exceptions only for exceptional cases – Throwing an exception is expensive – Interpreter and JIT strongly assume exception paths are rare

• Use reflection sparingly, especially in loops – Object.getClass is very fast; everything else is slow – Additional levels of abstraction and indirection – Additional objects created implicitly; extra GC overhead

Loops

• JIT aggressively optimizes well-behaved loops – Do not modify the loop bound within loop – Increment the loop index by a single value across all paths – Make loops as compact as possible – Use locals instead of fields or statics where possible

• System.arraycopy is well-optimized by the JIT – Use System.arraycopy instead of rolling your own
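The loop guidelines above can be sketched as follows, assuming a hypothetical class `LoopSketch` that holds its data in a field. The loop reads the field once into a local, keeps a fixed bound, and advances the index by one on every path; the copy uses `System.arraycopy` rather than a hand-written loop.

```java
final class LoopSketch {
    private final int[] data;

    LoopSketch(int[] data) {
        this.data = data;
    }

    // Well-behaved loop: array and bound live in locals, the bound is never
    // modified inside the loop, and the index always advances by one.
    int sum() {
        final int[] local = data;
        final int n = local.length;
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += local[i];
        }
        return total;
    }

    // Bulk copy through System.arraycopy(src, srcPos, dest, destPos, length).
    int[] copy() {
        int[] out = new int[data.length];
        System.arraycopy(data, 0, out, 0, data.length);
        return out;
    }
}
```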

Synchronization

• Costs of synchronization: – Overhead to acquire and release the lock – Serialization • The longer a thread holds a lock, the more likely other threads will have to wait for the lock • Volatiles are expensive, but cheaper than full synchronization – Don’t use them indiscriminately
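One common, narrow use of a volatile is a status flag that tells a loop to stop. The sketch below assumes a hypothetical `StoppableWorker` class; the flag gives visibility between threads but no protection for compound updates.

```java
final class StoppableWorker implements Runnable {
    // volatile: a write by one thread becomes visible to the looping thread.
    private volatile boolean running = true;
    private long iterations; // touched only by the worker thread

    @Override
    public void run() {
        while (running) {
            iterations++;
        }
    }

    void stop() {
        running = false;
    }

    // Starts a worker, asks it to stop, and reports whether it terminated.
    static boolean startAndStop() {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        try {
            Thread.sleep(20);
            worker.stop();
            thread.join(2000);
        } catch (InterruptedException e) {
            worker.stop();
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }
}
```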

Synchronization: minimizing lock contention

• Avoid synchronization if possible • Use java.util.concurrent where possible • Avoid synchronization on the same lock when you can use different locks for accessing different data • Avoid synchronizing static methods if possible • Make synchronized blocks as short as possible – Move thread-safe code out of the synchronized block • Try to restrict data to a single thread or use java.lang.ThreadLocal to maintain per-thread data
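Two of these points, separate locks for separate data and short synchronized blocks, can be sketched together. The class `SplitLockStats` and its fields are hypothetical; the `demo` helper exists only to exercise the class from several threads.

```java
final class SplitLockStats {
    private final Object readsLock = new Object();   // guards readBytes only
    private final Object writesLock = new Object();  // guards writeBytes only
    private long readBytes;
    private long writeBytes;

    void recordRead(String request) {
        int size = request.length();      // works on locals only: kept outside the lock
        synchronized (readsLock) {
            readBytes += size;
        }
    }

    void recordWrite(String request) {
        int size = request.length();
        synchronized (writesLock) {
            writeBytes += size;
        }
    }

    long readBytes() {
        synchronized (readsLock) { return readBytes; }
    }

    long writeBytes() {
        synchronized (writesLock) { return writeBytes; }
    }

    // Drives both counters from several threads and returns the shared object.
    static SplitLockStats demo(int threads, int callsPerThread) {
        SplitLockStats stats = new SplitLockStats();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    stats.recordRead("abcd");
                    stats.recordWrite("xy");
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return stats;
    }
}
```

Readers and writers never wait on each other here, because each counter has its own lock.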

Conclusion

• Keep these techniques in mind when writing high performance code • Focus on hotspots in your application – Too much emphasis on performance can compromise your application’s maintainability – Too much emphasis on maintainability can compromise your application’s performance!

END

Disclaimer

The IBM Center for Advanced Studies (CAS) regularly publishes technical documents that are aimed at facilitating an exchange of information. These reports are written by individuals or groups of authors and represent their opinions and do not necessarily reflect those of IBM. In the event of questions or concerns regarding individual reports or presentations, email addresses have been provided for most authors. Alternatively, please feel free to contact CAS at [email protected]. CAS is dedicated to providing a forum for IBM employees, research affiliates, and university students to publish results of their work, their views on technical issues, book reviews, and workshop reports.

Trademarks

• Company, product, or service names identified in the text may be trademarks or service marks of IBM or other companies. Information on the trademarks of International Business Machines Corporation in the United States, other countries, or both is located at http://www.ibm.com/legal/copytrade.shtml.

• Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. • Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. • Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. • UNIX is a registered trademark of The Open Group in the United States and other countries. • Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

• Other company, product or service names may be trademarks or service marks of others.

WORKSHOP Best Practices for Developing High Performance Java Applications: A Java Virtual Machine Perspective

Developing Scalable Java Applications for Multi-Core Architectures

Daryl Maier, IBM Toronto Lab [email protected]

Outline

• Definition and motivation

• Factors that limit scalability within your application

• External factors that limit scalability

• How do you know when you have a problem?

What is scalability?

• Scalability refers to the ability of a system to increase throughput as additional resources are applied – E.g., an application server able to service more transactions as more cores are added • Attaining scalability is a system-wide effort that requires understanding how the entire software stack (app, JVM, OS, …) and hardware can be tuned for performance

What is a multi-core architecture?

• A multi-core processor combines multiple independent processing cores into a single package – Cores can share caches or have exclusive caches – Share an interconnect with the rest of the system – Typical number of cores is 2, 4, 8 • Multiple multi-core processors can be interconnected to form larger systems • E.g., POWER6, Cell, modern Intel/AMD

Why should you care if your application scales?

• Current hardware design trends say so – Traditional frequency scaling is running into power and heat dissipation problems – Future processor designs are scaling horizontally by adding more cores rather than making individual cores faster • Your application may be re-used • Benefits of writing scalable code – Better leverage the hardware as more cores are added – Better performance

Factors That Limit Scalability Within Your Application

Software Stack

• Application is written serially with no opportunity for parallelism

• Load imbalance

• Locks and synchronizations

• External influences – E.g., database logging on a perfectly scalable system

• Tracing and console I/O

• Strain on the JVM

Create more parallelism in your application

• Goal is to eliminate serial bottlenecks • Organize your application into parallel tasks – Leverage the Executor framework • Change algorithms to increase parallelism • Do not rely on the JVM to discover opportunities – The JVM does not do automatic parallelization or vectorization of user code (yet) – Java class libraries do not exploit vector processor features (e.g., VMX on PowerPC or SSE on x86)

Use java.util.concurrent classes
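The advice to organize an application into parallel tasks can be sketched with the Executor framework from java.util.concurrent. This is a minimal illustration, not a tuned implementation: the class `ParallelSumSketch`, the chunking scheme, and the choice of one pool thread per task are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelSumSketch {
    // Splits the array into `tasks` chunks (tasks >= 1) and sums them in parallel.
    static long parallelSum(int[] values, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Callable<Long>> work = new ArrayList<>();
            int chunk = (values.length + tasks - 1) / tasks;
            for (int t = 0; t < tasks; t++) {
                final int from = Math.min(values.length, t * chunk);
                final int to = Math.min(values.length, from + chunk);
                work.add(() -> {
                    long partial = 0;              // each task keeps its own local total
                    for (int i = from; i < to; i++) {
                        partial += values[i];
                    }
                    return partial;
                });
            }
            long total = 0;
            for (Future<Long> f : pool.invokeAll(work)) {  // waits for all tasks
                total += f.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each task works on a disjoint slice and shares no mutable state, so no locking is needed until the partial results are combined.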

• Building scalable data structures is HARD – Need to deeply understand concurrency and thread models – Low level hardware details dramatically affect performance – Bugs are typically hard to shake out and irreproducible • Java now provides very solid building blocks – Uses state-of-the-art non-blocking synchronization algorithms – Task scheduling (Executor framework) – Concurrent collections (fast and scalable implementations of Map, List, Queue) – Atomic variables (atomic math ops such as increment, test-and-set) – More variety in locking operations (Lock interface, multiple Conditions) • JVM support for Sun unsafe classes improves performance – More low-level JIT improvements coming in Java 5 and 6 to leverage architectural features to implement atomic operations (e.g., getAndDecrement)

Other ways to avoid synchronization
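One way to avoid explicit synchronization is to lean on the atomic variables and concurrent collections listed among the java.util.concurrent building blocks. The sketch below is an illustration only; `HitCounterSketch` and its `demo` helper are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-page hit counter; no synchronized blocks are used.
final class HitCounterSketch {
    private final ConcurrentHashMap<String, AtomicLong> hits = new ConcurrentHashMap<>();

    void record(String page) {
        // computeIfAbsent installs the counter at most once per key;
        // incrementAndGet is an atomic read-modify-write.
        hits.computeIfAbsent(page, k -> new AtomicLong()).incrementAndGet();
    }

    long count(String page) {
        AtomicLong c = hits.get(page);
        return c == null ? 0L : c.get();
    }

    // Drives record() from several threads and returns the shared counter.
    static HitCounterSketch demo(int threads, int hitsPerThread) {
        HitCounterSketch counter = new HitCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < hitsPerThread; j++) {
                    counter.record("home");
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter;
    }
}
```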

• Do not use synchronized container classes (e.g., Hashtable, Vector) – Even performance of single threaded apps affected • Synchronization in static methods is more expensive than instance methods • Consider using volatiles for simple communication, but BEWARE: – No protection from concurrent updates – Performance depends on the target architecture • E.g., long volatiles particularly bad on 32-bit PPC – A good use case is a timestamp counter implemented with long volatile, or a status flag to terminate loops

Understand your application’s memory footprint

• Tune your heap settings (GC policy and space geometry) – Many threads using a flat heap will begin to show contention on the heap lock • Eliminate excessive object allocations – Increased frequency/duration of GC – Allocating an object requires touching memory, which has cache implications • Data bound applications that touch a diverse number of objects may benefit from large data pages – Characterized by TLB miss rate – JVM will allocate the heap using large data pages to alleviate TLB misses – OS setup required to support; user must specify -Xlp

Consider deploying on a 64-bit JVM

• Extends heap addressability beyond 32 bits (> 3.5 GB) – Important for applications that demand a large working set in memory (e.g., databases, object caches) • On x86, allows access to a larger register file, wider registers, and efficient 64-bit arithmetic

Important 64-bit JVM Caveats

• Requires a 64-bit OS • Less efficient representation than 32-bit – Cache & TLB effects and stress on hardware more pronounced • Increased collection times for larger heaps – Bigger heaps == longer pause times • 31/32-bit JNI natives cannot be called from 64-bit code – Will have an impact on customers migrating to 64-bit – Source code of natives not always available to recompile

Compressed References: Motivation

Pointer Size          Space         Max Heap                      Efficiency
31 bits               2 GB          1.3 GB (z/OS)                 100%
32 bits               4 GB          1.7 GB (Win), 3.2 GB (AIX)    100%
64 bits compressed    4 to 32 GB    4 to 32 GB                    ~70-100%
64 bits               16 EB         16 EB                         50-70%

• Solution: Build a 64-bit VM with near 32-bit efficiency – Use 32-bit values (offsets) to represent object fields – With scaling, between 4 GB and 32 GB can be addressed (for example, a 32-bit offset scaled by 8 bytes reaches 2^32 × 8 bytes = 32 GB) • Use the -Xcompressedrefs option in 64-bit Java 6 builds

Other programming practices that strain the JVM

• Transitions between VM <-> native code – Ultimately releasing and re-acquiring a VM access lock – JNI call-out dispatch is optimized – JNI call-in dispatch is being improved post Java 6

• Excessive class loading – To produce highly optimized code the JIT maintains a representation of the class hierarchy which must be updated each time a class is loaded

External Factors That Limit Scalability

Operating System

• Native memory allocation – JNI code, Linux malloc behaves better than Windows – Common when sending and receiving buffers through socket layer

• Windows socket layer – Performs worse than Linux

• Lack of NUMA awareness

Hardware

• Shared bus contention accessing memory – Saturation is particularly bad on Intel > 8 cores

• Cache line thrashing – E.g., stores to shared objects

• Insufficient hardware to drive application workload – E.g., lack of GB ethernet

• Lack of physical memory – Excessive paging

• Storage bottlenecks – Lack of RAID or RAM disks – Leverage object caching frameworks (e.g., ObjectGrid)

• Over-committing the hardware – E.g., more threads than cores

Understand your host hardware

• Java’s intent is to be write-once, run anywhere

• Significant performance gains can be realized by understanding the capabilities and limitations of the underlying processor and OS – Each deployment may need to be customized for optimal performance

• Tuning can be difficult, and JVM developers are investigating ways of making this transparent to the user.

Affinitizing JVMs

• Consider affinitizing a JVM to a subset of cores if: – You want to exploit the cache hierarchy available on those cores for performance, typically if the working set fits within the cache memory available. – You’re running on a NUMA system and your JVM memory working set can fit within the physical memory available to the processor node.

• Can affinitize using ‘taskset’, ‘numactl’ on Linux; ‘start’ on Windows

Performance in virtualized environments

• Growing trend toward server consolidation for economic and environmental benefits – JVM is supported when run in a VMWare environment

• Important to study your application to ensure it scales and achieves acceptable performance under a virtual machine – Configuration, memory utilization, VMWare processor spread may all impact performance • You may not get the same scaling and performance characteristics as running it natively – JVM performance teams studying how the JVM needs to adapt in such an environment to make this easier for the customer

• IMPACT 2008 presentation – 1557 - IBM WebSphere and virtualization: Performance, scalability and best practices

Hints that you may have a scaling problem

• You know your application is not multithreaded

• Contention on locks – Can be found with Java Lock Analyzer

• Low CPU utilization despite driving it as hard as you can

• If the time spent in the kernel is > 20%

End

Requirements-Driven Business Process Modelling and Performance Management

CASCON Workshop October 29, 2008

Business Processes

• Coordinated chain of activities intended to produce a business result or reach a business goal

• Potential for improvements in organizations through the utilization of Business Process Management (BPM) methodologies and tools

• Still, inadequacies remain. Among others: – Alignment between business goals and business processes – Compliance with legislation

Requirements-Driven BPM and Performance Management, October 29, 2008

Workshop Goals

• Propose many elements of an integrated BPM framework – Takes advantage of languages, methodologies and tools recently developed in the requirements engineering community. – Business process monitoring, performance management, and compliance capabilities integrated across the BPM lifecycle. – Integrates different technologies and tools originating from IBM (Eclipse, Cognos 8 BI, Telelogic DOORS RMS) and open source (jUCMNav and OpenOME). • Participants are expected to help identify the opportunities and limitations of such notations, methodologies, and tools for business process modelling and performance management in organizations.

Agenda

• 1:00 - 1:15: Introductions
• 1:15 - 1:45 – ITU-T's User Requirements Notation (URN) and jUCMNav
• 1:45 - 2:30 – Business Process Modelling, Analysis and Monitoring with URN, jUCMNav and Cognos 8
• 2:30 - 3:15 – Strategic Business Modeling with i*
• 3:15 - 3:30: Break
• 3:30 - 4:00 – Compliance Management with URN, jUCMNav, and Telelogic DOORS
• 4:00 - 4:30 – Requirements-Driven Design, Configuration, and Adaptation of Business Processes
• 4:30 - 4:45: Discussion/conclusions

Daniel Amyot, SITE, uOttawa

• Daniel Amyot is Associate Professor at the University of Ottawa, which he joined in 2002 after working for Mitel Networks as a senior researcher in software engineering. His research interests include scenario-based software engineering, requirements engineering, business process modeling, aspect-oriented modeling, and feature interactions in emerging applications. Daniel is Rapporteur for requirements languages at the International Telecommunication Union, where he leads the development of the User Requirements Notation. He has a Ph.D. and a M.Sc. from the University of Ottawa (2001 and 1994), as well as a B.Sc. from Laval University (1992).

Alireza Pourshahid, Cognos / IBM

• Alireza Pourshahid received his M.Sc. degree in E-Business Technologies from the University of Ottawa in 2008 and is now working at IBM. Ali also recently started his Ph.D. in Computer Science at the University of Ottawa. His main research interests are Business Process and Performance Management, Process Modeling, Trust Modeling, and Software Development Methodologies.

Eric Yu, Faculty of Information, UofT

• Eric Yu is Associate Professor at the Faculty of Information, UofT. He received his Ph.D. in C.S. from UofT in 1995. His interests are in the areas of information systems design, requirements eng., knowledge management, enterprise architecture, software eng., and business modeling. His research emphasizes concepts and techniques for modelling and systematically analyzing strategic relationships among social actors. He serves on the editorial boards of the Int. Journal of Agent Oriented Software Engineering, IET Software and the Journal of Data Semantics. He is Program Co-chair for the 27th Int. Conf. on Conceptual Modeling (ER’08). Earlier, he held positions in research and development labs at Bell and Nortel Networks in Ottawa.

Liam Peyton, SITE, uOttawa

• Liam Peyton, Ph.D., P.Eng., is the principal investigator for the Intelligent Data Warehouse laboratory and Associate Professor at the University of Ottawa which he joined in 2002 after spending 10 years as an industry consultant specializing in business process automation, performance management, and software development methodologies. His current research focus is the securing, monitoring and enabling of data sharing within business to business networks based on model-driven, service oriented architecture in compliance with government regulations. He has degrees from Aalborg Universitet (Ph.D. 1996), Stanford University (M.Sc. 1989), and McGill University (B.Sc. 1984).

Alexei Lapouchnian, Dept. C.S., UofT

• Alexei Lapouchnian is a Ph.D. student in the Dept. of Computer Science at UofT, under the supervision of Professor John Mylopoulos. His research interests include software engineering for highly customizable, adaptable and adaptive systems, autonomic computing, business process modeling, as well as multiagent systems and requirements engineering. Most of his research is on using intentional (goal) and social (i*) models for modeling, analysis, and design of systems. He holds an IBM CAS fellowship to study adaptive business processes. Previously, he completed his M.Sc. in the Dept. of C.S. at York University. He worked on requirements engineering for multiagent systems, particularly on integrating the i* modeling framework with the Cognitive Agents Specification Language for requirements engineering under the supervision of Prof. Yves Lespérance.

Discussion and Conclusions…

• Opportunities? • Limitations? • Way forward?

4th Int. MCETECH Conference on eTechnologies 4-6 May 2009, Ottawa, Canada

Topics
• Inter-organizational processes
• Service-Oriented Architecture
• Security and trust
• Middleware and infrastructure services
• Applications
• Open source and open environments

Important Dates
• Dec. 19: Abstracts
• Jan. 9: Full papers
• Jan. 30: Tutorials/Workshop
• Feb. 13: Notifications
• Mar. 13: Camera-ready
• May 4-6: Conference

http://mcetech.org/

ITU-T's User Requirements Notation (URN) and jUCMNav

Daniel Amyot [email protected]

CASCON Workshop, October 29, 2008

Agenda

• What is the User Requirements Notation (URN)?

• jUCMNav

• Demo

• Applications

URN and jUCMNav, CASCON Workshop, 2008

User Requirements Notation - URN

• URN is a semi-formal, lightweight graphical language for modeling and analyzing requirements in the form of goals and scenarios • Combines two existing notations – Goal-oriented Requirement Language (GRL) – Use Case Map (UCM) • URN models can be used to specify and analyze various types of reactive systems, telecommunications standards, and business processes.

ITU-T Z.151: URN - Language Definition

• First standardization effort to address explicitly, in a graphical way and in one unified language, goals and scenarios, and the links between them • Part of the ITU family of languages – SDL, MSC, TTCN-3, ASN.1… • Allows systems, software, and requirements engineers to discover and specify requirements for a proposed system or an evolving system • Definition of URN in Rec. Z.151 (under approval) – Metamodel, abstract/concrete syntaxes, semantics…

Combination of Goals and Scenarios

• Goal-oriented Requirement Language (GRL) – For modelling goals and other intentional concepts – Mainly for non-functional requirements, quality attributes, rationale documentation, and reasoning about alternatives and tradeoffs – Based on i* and on the NFR Framework • Use Case Map (UCM) notation – For modelling scenario concepts – Mainly for operational requirements, functional requirements, and performance and architectural reasoning

• Can such a combination be useful for business process analysts?

URN for BPM Example - Context

• URN model that addresses privacy protection in a hospital environment: – Researchers want access to patient data but the Health Information Custodian (HIC – i.e., the hospital) needs to protect patient privacy, as required by law (PHIPA in Ontario). – The process of accessing databases must ensure privacy. As required by law, a Research Ethics Board (REB) is usually involved in assessing privacy risks for the research protocol proposed by a researcher. – DB administrators also want to ensure that DB users are accountable for their acts.

GRL: Elements

[Figure: (a) GRL Elements: Goal, Softgoal, Task, Resource, Belief, Actor with Boundary, Collapsed Actor]

GRL: Links

[Figure: (b) GRL Links: Contribution, Dependency, Decomposition, Correlation, Means-End]

GRL: Contribution Types

[Figure: (d) GRL Contribution Types: Make, Help, Some Positive, Unknown, Some Negative, Break, Hurt]

[Figure: (e) Representations of Qualitative and Quantitative Contributions: i) Icon only, ii) Text only, iii) Icon and text, iv) Number only, v) Icon and number]

One Model, Many Diagrams

Qualitative Model Evaluation

• Strategy: set of initial satisfaction levels for some of the intentional elements.
• These levels are propagated to the other elements, and to actors.

[Figure: (c) GRL Satisfaction Levels: Denied, Weakly Denied, Weakly Satisfied, Satisfied, Conflict, Unknown, None]

Quantitative Model Evaluation

Strategies and evaluations can also be quantitative ([-100, 100] scale). Hybrid algorithms can also be defined.

UCM: Path Nodes

[Figure: UCM path nodes: Path with Start Point with Precondition CS and End Point with Postcondition CE; Responsibility; Empty Point; Direction Arrow; Or-Fork with Conditions; Or-Join; And-Fork; And-Join; Waiting Place with Condition and Asynchronous Trigger; Timer with Timeout Path, Conditions, and Synchronous Release; Dynamic Stub with In-Path ID and Out-Path ID]

UCM: Components

[Figure: UCM component kinds: Team, Process, Object, Agent, Actor; Protected Component; Context-dependent Component (parent:)]

UCM: Stubs and Plug-ins

[Figure: UCM stub kinds, each with in-path IN1 and out-path OUT1: Static Stub, Dynamic Stub, Synchronizing Stub, Blocking Stub]

UCM: Scenario Definitions and Path Traversal

• Scenario definition: set of initial values for the variables used in conditions and responsibilities, combined with start points triggered, and possibly pre/post conditions.

Visualization of Scenarios as Message Sequence Charts

URN (Typed) Links and Metadata

• URN links connect any pair of URN model elements • Most frequently, URN links are used to trace: – Actors in GRL models to components in UCM models – Tasks in GRL models to maps or responsibilities in UCM models • Metadata can also be attached to any URN model element

[Figure: URN links among Actor, Intentional Element, Map, Component, and Responsibility]

URN for Business Process Modelling?

• For BPM, we need to answer the W5 questions. – URN can answer Where, What, Who, When and Why. • Goal-oriented Requirement Language (GRL) – Business or system goals and rationales (Why) – Tasks (What) – Actors (Who and Where) • Use Case Maps (UCM) – Responsibilities (What) – Components (Who and Where) – Scenarios and causal sequences (When) • GRL & UCM – Link processes to business goals, for traceability, completeness, alignment and compliance

URN and jUCMNav, CASCON Workshop, 2008 p. 19

Agenda

• What is the User Requirements Notation (URN)?

• jUCMNav

• Demo

• Applications

URN Tool: jUCMNav

• Free (EPL), open-source plug-in for Eclipse
• Supports most of the GRL and UCM notation elements
• 4 GRL evaluation algorithms, with colour highlighting
• 1 path traversal mechanism, with export to flat UCMs and MSCs
• Also:
  – Integration with DOORS for Requirements Management
  – Integrated MSC viewer
  – Performance modelling and export to Core Scenario Model
  – Extensions for Business Process Modelling and monitoring
  – Verification of user-defined (OCL) rules
  – Report generation (RTF, PDF, HTML)
  – Export of diagrams to various formats
  – …

jUCMNav Views and GRL Diagrams

Other jUCMNav Views and UCM Diagrams

MSC Viewer

Typical Usage of URN

• Modelling and documentation
  – User and system requirements, rationales
• Analysis of business goals
  – Evaluations of alternative requirements or solutions
  – Discovery of tradeoffs that can optimize the stakeholders' degree of satisfaction for conflicting goals
• Architecture analysis
  – Based on NFRs and design constraints
  – Performance analysis
• Generation of individual scenarios
  – Training, documentation
  – Detection of conflicts
  – Transformation to MSC and test cases
• Reverse-engineering
  – Abstract dynamic view of existing system

Some Application Domains Studied…

– Telecommunication and telephony services
– Wireless systems
– Object-oriented software
– Multi-agent systems
– Web applications and Web services
– Railway control systems
– Embedded systems
– User interfaces
– Access control procedures
– Network protocols
– e-Business applications
– Supply chain management
– e-Health applications
– Software product lines
– Operating systems
– Information retrieval systems
– Vehicle communication systems
– …

Outlook: What's Next for URN?

• Support for business process modelling concepts
  – Key Performance Indicators
• Support for additional workflow patterns
  – Exceptions
• Integration of strategies and scenario definitions
  – For adaptive systems
  – For business process monitoring
• URN-based aspect-oriented modeling
• Further formalization of the language
• Specification of transformations (MSC, UML, performance…)
• Integration with UML (profiles and tools)

Conclusions

• Modelling with GRL
  – Intentions, objectives, functional / non-functional requirements, rationales
• Modelling with UCM
  – Scenarios, processes, architectures
• While modelling with URN as a whole, goals are operationalized into tasks, and tasks are elaborated in (mapped to) UCM path elements and scenarios
  – Can guide the selection of an architecture or processes
• URN supports the modelling of business processes, their analysis from various angles, and their transformations.

For More Information

• Virtual Library:
  – http://www.UseCaseMaps.org/pub/
  – Over 210 papers and theses

• jUCMNav:
  – http://jucmnav.softwareengineering.ca/jucmnav/

Basic Elements of the GRL Notation

[Figure: legend of the GRL notation, in five panels.]

(a) GRL Elements: Goal, Softgoal, Task, Resource, Belief, Actor, Actor with Boundary, Collapsed Actor

(b) GRL Links: Contribution, Correlation, Dependency, Decomposition, Means-End

(c) GRL Satisfaction Levels: Denied, Weakly Denied, Weakly Satisfied, Satisfied, Conflict, Unknown, None

(d) GRL Contribution Types: Make, Help, Some Positive, Unknown, Some Negative, Hurt, Break

(e) Representations of Qualitative and Quantitative Contributions: i) icon only, ii) text only (e.g. Make), iii) icon and text, iv) number only (e.g. 100), v) icon and number

Basic Elements of the UCM Notation

[Figure: legend of the UCM notation, in three groups.]

Paths: path with start point (precondition [CS]) and end point (postcondition [CE]); responsibility; direction arrow; empty point; or-fork with conditions [CO1], [CO2], [CO3]; or-join; and-fork; and-join; waiting place with condition [CW] and asynchronous trigger; timer with conditions ([CT], [CTO]), timeout path, and synchronous release.

Components: Team, Process, Object, Agent, Actor; Protected Component; Context-dependent Component (labelled "parent:").

Stubs: static stub with in-path ID (IN1) and out-path ID (OUT1); dynamic stub with in-path ID and out-path ID; synchronizing stub (S) with in-path ID, out-path ID, and synchronization threshold [ST]; blocking stub (SB) with in-path ID, out-path ID, synchronization threshold [ST], and replication indicator (X).

Disclaimer

The IBM Center for Advanced Studies (CAS) regularly publishes technical documents that are aimed at facilitating an exchange of information. These reports are written by individuals or groups of authors and represent their opinions and do not necessarily reflect those of IBM. In the event of questions or concerns regarding individual reports or presentations, email addresses have been provided for most authors. Alternatively, please feel free to contact CAS at [email protected]. CAS is dedicated to providing a forum for IBM employees, research affiliates, and university students to publish results of their work, their views on technical issues, book reviews, and workshop reports.

Trademarks

• Company, product, or service names identified in the text may be trademarks or service marks of IBM or other companies. Information on the trademarks of International Business Machines Corporation in the United States, other countries, or both is located at http://www.ibm.com/legal/copytrade.shtml.

• Other company, product or service names may be trademarks or service marks of others.

Business Process Modeling, Analysis and Monitoring with URN, jUCMNav and Cognos 8

Alireza Pourshahid Pengfei Chen Daniel Amyot Michael Weiss Liam Peyton

CASCON Workshop, October 29, 2008

BPM Life Cycle

BPM and jUCMNav, CASCON Workshop, 2008

Performance Management

Core Models

Framework Components

[Figure: framework components relating the goal model, the process model, and the hierarchical KPI performance model; the remaining diagram labels are not recoverable from the extraction.]

Problems

How to tackle the problem?

Problem Elaboration – Discharge Process

Discharge Process Top Level Business Goals

Process Hierarchies

KPI model in URN

URN BPM

Extending URN Meta-model for Performance Monitoring

Architecture

• The architecture depends on what we want to achieve and on which data sources we need to use.

• The simplest setup uses only jUCMNav, with the KPI values initialized manually.

• Alternatively, we can use only a simple relational data source and generate our dimensional data sources from it.

• Using RAAS, we might be able to omit the monitoring services.

Demo

• See online version:
• http://jucmnav.softwareengineering.ca/twiki/bin/view/ProjetSEG/KPIDemo

Future work – Process Redesign

[Table: effect of each redesign pattern on Time, Cost, Quality, and Flexibility. The per-cell effect symbols were lost in extraction; only the pattern names and the "N/A" entries survive. The patterns, grouped by category, are listed below.]

• Task Patterns: Task Elimination, Task Composition, Task Automation
• Routing Patterns: Resequencing, Knockout, Control Relocation, Parallelism, Triage
• Allocation Patterns: Case Manager, Case Assignment, Customer Teams, Flexible Assignment, Resource Centralization, Split Responsibilities
• Resource Patterns: Numerical Involvement, Extra Resource, Specialist-Generalist / Generalist-Specialist, Empower
• External Party Patterns: Integration, Outsourcing, Interfacing, Contact Reduction, Buffering, Trusted Party
• Integral Business Process Patterns: Case Types, Technology, Exception, Case-based Work

Reference: H. A. Reijers, Process-Aware Information Systems. John Wiley & Sons, Inc., 2005

Scenario-based process analysis

• Analyze only a scenario or a part of the process we are interested in.
• Observe the impact of the highlighted part of the process on the related business goals.

[Figure: goal models, process models, and performance models connected by model links; legend: disabled KPI, enabled KPI, goal, sub-process, model link, scenario.]

Process portfolio analysis

• Drill down to find the root of the problem.
• The performance value is computed from the performance model.
• The importance value is the average satisfaction level of the top-level business goals when a process performs at 100% of its capacity.
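The importance value described above is a plain average. The sketch below shows that arithmetic in Python under stated assumptions: the goal names and satisfaction levels are invented, and the satisfaction levels are taken as given (in practice they would come from evaluating the goal model with the process set to 100% of its capacity).

```python
from statistics import mean


def importance(top_level_satisfaction):
    """Average satisfaction level of the top-level business goals.

    `top_level_satisfaction` maps each top-level goal to the satisfaction
    level obtained when the process under study performs at 100% capacity.
    """
    return mean(top_level_satisfaction.values())


# Hypothetical satisfaction levels for three top-level goals.
levels = {"Goal A": 40, "Goal B": 30, "Goal C": 50}
process_importance = importance(levels)
```

Here `process_importance` is (40 + 30 + 50) / 3 = 40. Comparing this value with the performance value of the same process is what allows a portfolio view to single out processes that matter a lot but perform poorly.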

Suggested Steps for the methodology

Approval process

[Figure: the approval process shown as linked process, performance, and goal models.]

Technical Review sub-process

[Figure: process model of the Technical Review sub-process with its KPI groups and KPI details, linked to the goal and performance models.]

KPI values in two steps of the scenario

* (10 DW Performance related complaints)

Technical Review sub-process - Revised

Revised Goal Model

Case study: process portfolio model, before and after

[Figure: process portfolio charts before and after the revision; labels include REB Approval, CPO Review, and Technical Review.]

                      Approval   CPO      REB      Tech
                      Process    Review   Review   Review
Before   Importance   37.6       36.7     37.6     32.67
         Performance  71         78       60       89
After    Importance   28.25      27.5     57.75    40
         Performance  50         76       49       55

Future Work

Strategic Business Modeling with i*

Eric Yu University of Toronto

Presented at CASCON Workshop on Requirements-Driven Business Process Modelling and Performance Management Markham, Ontario, Canada October 29, 2008

Outline

1. Goal-Oriented and Agent-Oriented Requirements Engineering

2. i* modelling

3. Business modelling for services

4. Strategic business modelling

5. References, i* wiki

© Eric Yu 2008

We need deeper understanding than in traditional modelling

Automobile insurance claims example

… a deeper understanding about processes

• Car owner wants car to be repaired
• Insurance company wants to minimize claims payout
• Car owner wants fair appraisal of repairs
• Insurance agent wants to maintain good customer relations

Why is Early RE important?

• Before defining the system to be built
• Complex relationships among stakeholders
  – What they want (e.g., security, privacy, trust, profitability, market positioning, strategic alliances, intellectual property, …)
  – How they can achieve what they want
• Need systematic method, bring into RE process
  – Modelling and reasoning support, tools, traceability, …
• Consider:
  – E-business; E-learning; E-health; E-government
  – Energy, environment, transportation

Modelling Strategic Actor Relationships and Rationales - the i* modelling framework

• Actors
  – have goals, beliefs, abilities, commitments
  – are semi-autonomous
    • freedom of action, constrained by relationships with others
    • not fully knowable or controllable
    • has knowledge to guide action, but only partially explicit
  – depend on each other
    • for goals to be achieved, tasks to be performed, resources to be furnished

Approach: model social relationships for analysis and design

• Strategic actors
• What do I want?
• How can I achieve what I want?
• Who do I depend on to achieve what I want?

Strategic Dependency Relationship

[Figure: strategic dependency between two actors: Actor A ("I want …") depends on Actor B ("I can …") for the goal "Car Be Repaired".]

Let's model systems and organizations in terms of Strategic Dependencies among actors

Strategic Rationales about alternative configurations of relationships with other actors – Why? How? How else?

[Yu AOSE01]

An Example Meeting Scheduler

From: E. Yu, "Towards Modelling and Reasoning Support for Early-Phase Requirements Engineering", 3rd IEEE Int. Symp. on Requirements Engineering (RE'97), Jan. 6-8, 1997, Washington D.C., USA, pp. 226-235.

Strategic Dependency (SD) model

[Figure: SD model of the meeting scheduling example [Yu RE97]. Actors: Meeting Initiator, Meeting Participant, and Important Participant (ISA Meeting Participant). Dependums: Attends Meeting (p,m), Exclusion Dates (p), Preferred Dates (p), Proposed Date (m), Agreement (m,p), Attends Meeting (ip,m), Assured (Attends Meeting (ip,m)).]

Strategic Rationale (SR) model
• Ask "Why", "How", "How else"

[Figure: SR model of the Meeting Initiator and Meeting Participant. Element labels include Organize Meeting, Meeting Be Scheduled, Schedule Meeting, Quick, Low Effort, Obtain Avail Dates, Obtain Agreement, Merge AvailDates, Participate In Meeting, Attend Meeting, Arrange Meeting, Convenient (Meeting, Date), Quality (Proposed Date), Agreeable (Meeting, Date), User Friendly, Min Interruption, Find Agreeable Date, Find Suitable Slot, and Agree To Date; dependums include Attends Meeting, Exclusion Dates, Preferred Dates, Proposed Date, and Agreement.]

Scheduling meeting … with meeting scheduler

[Figure: SD model with the Meeting Scheduler added as an actor next to the Meeting Initiator, Meeting Participant, and Important Participant (ISA Meeting Participant). Dependums: Attends Meeting (p,m), MeetingBeScheduled (m), EnterAvailDates (m), Proposed Date (m), EnterDateRange (m), Agreement (m,p), Attends Meeting (ip,m), Assured (Attends Meeting (ip,m)).]

"Strategic Rationale" Model with Meeting Scheduler

[Figure: SR model of the Meeting Initiator, Meeting Participant, and Meeting Scheduler. It contrasts the alternatives Schedule Meeting and Let Scheduler Schedule Meeting for the initiator, and Find Agreeable Date Using Scheduler and Find Agreeable Date By Talking To Initiator for the participant, with contributions to softgoals such as Quick, Low Effort, Convenient (Meeting, Date), Quality (ProposedDate), Richer Medium, and User Friendly.]

Strategic Rationale Model

[Figure: a Development-World model (Strategic Rationale Model) refers to and reasons about Operational-World models (Strategic Dependency Models): As-is, To-be, Alt-1, Alt-2.]

Analysis and Design Support

• opportunities and vulnerabilities
  – ability, workability, viability, believability
  – insurance, assurance, enforceability
  – node and loop analysis

• design support
  – raising issues
  – exploring alternatives
  – evaluating, making tradeoffs
  – justifying, settling
  – based on qualitative reasoning

Analysis/Evaluation of i* Models [Jennifer Horkoff]

• To what extent are stakeholder goals satisfied or denied, given a particular situation or design option?

Example:

• Evaluation based on an analysis question:
  – If the Application implements Restrict Structure of Password, but not Ask for Secret Question, what effect will this have on Attract Users?
• Place initial labels reflecting the analysis question

[Figure: i* model of the Application with the elements Attract Users, Security, Usability, Implement Password System, Restrict Structure of Password, and Ask for Secret Question, connected by Help, Hurt, and Make contribution links.]

Example:

• Propagate labels
• Resolve labels
• Iterate on the above steps until all labels have been propagated

Human intervention (shown as two callouts on the model):
– Usability receives the following labels: partially denied from Restrict Structure of Password; partially denied from Ask for Secret Question.
– Attract Users receives the following labels: partially denied from Usability; partially satisfied from Security.
– In each case the evaluator selects a label; the callouts show "Select denied" and "Select partially denied".
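The propagate / resolve / iterate loop above can be sketched in a few lines of Python. This is a simplified illustration, not the published evaluation procedure: the propagation table, the link types assigned to the example model, and the scripted `ask_human` answers are all assumptions, since the slides show the links only as a figure. The one rule taken from the slides is that an element receiving several labels is handed to a human for resolution.

```python
S, PS, PD, D = "satisfied", "partially satisfied", "partially denied", "denied"

# Assumed, simplified rules: label of the source -> label delivered over the link.
RULES = {
    "make":  {S: S,  PS: PS, PD: PD, D: D},
    "help":  {S: PS, PS: PS, PD: PD, D: PD},
    "hurt":  {S: PD, PS: PD, PD: PS, D: PS},
    "break": {S: D,  PS: PD, PD: PS, D: PS},
}


def incoming_labels(target, links, labels):
    """Labels that reach `target` over its contribution links."""
    return {src: RULES[kind][labels[src]]
            for (src, kind, dst) in links if dst == target and src in labels}


def resolve(target, incoming, ask_human):
    """A single incoming label is adopted; several labels need human judgement."""
    if len(incoming) == 1:
        return next(iter(incoming.values()))
    return ask_human(target, incoming)


def propagate(links, initial, ask_human):
    """Repeat propagation and resolution until no new label can be assigned."""
    labels = dict(initial)
    targets = {dst for (_, _, dst) in links}
    changed = True
    while changed:
        changed = False
        for target in sorted(targets):
            sources = [s for (s, _, d) in links if d == target]
            if target not in labels and all(s in labels for s in sources):
                labels[target] = resolve(
                    target, incoming_labels(target, links, labels), ask_human)
                changed = True
    return labels


# Example model; the link types are assumptions made for this sketch.
links = [
    ("Restrict Structure of Password", "make", "Security"),
    ("Restrict Structure of Password", "hurt", "Usability"),
    ("Ask for Secret Question", "help", "Security"),
    ("Ask for Secret Question", "help", "Usability"),
    ("Security", "help", "Attract Users"),
    ("Usability", "help", "Attract Users"),
]
initial = {"Restrict Structure of Password": S, "Ask for Secret Question": D}

# Scripted stand-in for the evaluator's choices.
decisions = {"Security": PS, "Usability": D, "Attract Users": PD}


def ask_human(target, incoming):
    return decisions[target]


result = propagate(links, initial, ask_human)
```

With these assumed links, Usability receives "partially denied" from both tasks and Attract Users receives "partially denied" from Usability and "partially satisfied" from Security, which matches the two callouts; the final labels depend entirely on the evaluator's selections.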

From Business Models to Service-Oriented Design: a Reference Catalog Approach

Amy Lo, Department of Computer Science, University of Toronto
Eric Yu, Faculty of Information Studies, University of Toronto

Lo, A. and Yu, E., "From Business Models to Service-Oriented Design: A Reference Catalog Approach", ER 2007: Int. Conf. on Conceptual Modeling, LNCS 4801, Springer, 87-101, 2008.
Lo, A., "From Business Models to Service-Oriented Design: A Reference Catalog Approach", M.Sc. thesis, Dept. of Comp. Sci., Univ. of Toronto, 2006.

Services at the Business level…

[Figure: i* model of services at the business level. Labels include Supplier, Wholesaler, Retailer, Transport Consolidator, Online Store, and Online Shopper; the services Ordering/Delivery Services, Customer Services, Ordering/Shipping Services, Customer Transport Services, and Online Shopping/Shipping Services; and the softgoals Low Price, Convenience, Sustainable Growth, and Profit.]

Motivations

Service-Oriented Architecture (SOA)

• Better business/IT alignment → Are business needs properly captured?

• Rapid increase of design options → How to choose among them?

• Open architecture → Why? What are the motivations and rationales behind the design?

i* as a Business Modeling Technique

Joint value creation

i* Strategic Rationale (SR) Diagram

Are all stakeholders' goals achieved? Evaluation of goal model [Horkoff06]

SR model refined from business model pattern

i*'s Analytical and Reasoning Capabilities

• Goal analysis
• Task decomposition
• Means-ends reasoning
• Alternatives exploration and evaluation
• Feasibility analysis

Strategic Reasoning About Business Models: A Conceptual Modeling Approach

Reza Samavi Eric Yu Thodoros Topaloglou

Samavi, R., Yu, E., and Topaloglou, T., "Strategic reasoning about business models: a conceptual modeling approach", Information Systems and E-Business Management, Springer, 2008. DOI:10.1007/s10257-008-0079-z
Samavi, R., Yu, E., and Topaloglou, T., "Applying Strategic Business Modeling to Understand Disruptive Innovation", Proc. Int. Conf. on E-Commerce, Innsbruck, Austria, August 2008.
Samavi, R., "Strategic reasoning about business models: a conceptual modeling approach", M.Eng. Project, Dept. of Mech. & Ind. Eng., Univ. of Toronto, 2006.

What is missing in this business model representation?

• Other stakeholders
• Their goals
• Motivations
• Intentions
• …

Image taken from: Weill, P. and M. R. Vitale (2001), Place to Space: Migrating to eBusiness Models. Boston: Harvard Business School Press

Use i* to model strategic relationships

[Figure: i* notation legend: actors ("I want …", "I can …"); elements Goal, Softgoal, Task, Resource; links Decomposition, Means-ends, Dependency, Contribution/Correlation.]

Strategic Business Model Ontology

[Figure: a strategy layer (i* + NFR, with Contribution/Correlation and Decomposition links) connected through a layer interface to an operational layer (i*).]

An Integrated Framework

[Figure: strategy layer and operational layer plotted over time from t0 to t1, from the state of the business model (BM) [as-is], through transitional states of the BM, to the state of the BM [to-be]; elements from the old strategy are carried along.]

Case Study

• Telco is a telecommunication company
• The arrival of cellular voice services was a disruptive technological innovation for wired voice providers
• Despite the mobility of cellular phones,
  – the quality of early wireless voice services was relatively poor,
  – battery life for cellular units was inadequate,
  – phones were relatively expensive
• What circumstances caused co-option in wireless and wired technology?
• What other strategic moves could either incumbents or new entrants have made in the wireless case?

Case study from: Christensen, C. (2004), Seeing What's Next: Using the theories of innovation to predict industry change. HBS Press Book.

[Figure: i* model of the incumbent, with the regions Incumbent.as-is Business Model, Incumbent.Signals of change, and Incumbent.Strategy. Labels include Fit in the market, Novelty, Efficiency, Lock-in, Corporate Image, Brand Image, Product image, Technology Leadership, Time to market, Coolness Image, High quality Image, Low Price Image, Convenience & Responsiveness, High Performance, High Quality, Simpler functionality, Lower Price [Product], Incremental innovation, Disruptive innovation, co-option, Disruptive Engine, Capability, Integrated Process, Modular Process, and Value Configuration, connected by And, Or, Make, Help, and Hurt links.]

Is Incumbent capable of Disruptive Innovation?

• Disruptive Innovation Theory:
  – Established firms often retreat upmarket in the face of a disruptive attack
  – They give up their least valued customers in pursuit of higher margins
  – Asymmetry of motivation
  – Disruptive attacker develops skills and continues upmarket

[Figure: i* model annotated with Make, Help, Hurt, and And links supporting the analysis above; individual element labels are not recoverable from the extraction.]

Incumbent.Strategic Choice

[Figure: i* model of the incumbent's strategic choice, with dependency (D), Make, Help, Hurt, Break, And, and Plays links; individual element labels are not recoverable from the extraction.]

How does i* modeling help?

• To systematically analyze the business model
• Investigate the roles of multiple stakeholders in the business model (e.g. Rivals, Non-Market Players, etc.)
• Bring into account intentional dimensions, motivations and goals of participants in a business model
• Make a firm's strategy explicit
• Demonstrate the implications of a strategy

the i* Wiki Fostering Investigation, Collaboration, and Evaluation

http://istar.rwth-aachen.de/

– The i* Quick Guide
– i* Usage Guidelines
– An Overview and a Comparison of i* Tools
– Publications listings
– Who is Who
– Events

i* Wiki Team: Gemma Grau, Jennifer Horkoff, Dominik Schmitz, Samer Abdulhadi, Eric Yu

References

See http://istar.rwth-aachen.de and http://www.ischool.utoronto.ca/~yu

Tracking Compliance for Business Processes Based on URN

Sepideh Ghanavati Daniel Amyot Liam Peyton

CASCON October 29, 2008

Overview

• Problem
  – Complexity of documenting and managing compliance as legislation or business processes change.
• Target audience
  – (Privacy) compliance managers, auditors, lawyers, business process modellers, requirements engineers…
• Contributions
  – Requirements-oriented framework to model legislative compliance for business processes
  – A meta-model (based on URN) that provides a set of compliance links
  – A systematic method for tracking and managing compliance as legislation or business processes evolve
  – Enhancements to existing modelling and traceability tools to support and validate these contributions
  – Healthcare case study involving an Ontario hospital and privacy law

Tracking Compliance for Business Processes Based on URN CASCON Workshop 2008

Motivation

• Compliance with different regulations is of primary concern for any organization when defining its business processes. – $30B compliance business in 2007 [AMR Research, Feb’07]

• Many organizations, especially in healthcare, use a document- based method to track compliance.

• Document-based methods require much effort to document compliance and manage change, and yet they are usually incomplete.

• Model-based approaches have much potential for change management but are often separated from their source documents, which provide the final authority.


Three Wishes…

• A framework that can model organizational policies, procedures and legislative documents in the same notation

• Support for useful links: – within views of a model (goals and processes) – between two models (organization and legislation) – between models and legislation and other documents

• A way to manage the evolution of any part (legislation, business processes, etc.) in order to assess the global impact and ensure compliance in the new context

Related Work

• Not all wishes are granted in existing frameworks!

• Darimont et al. use KAOS to model regulations with goals
  – No real traceability between processes and legal model
• Rifaut et al. apply goal-based models for the compliance of financial systems to Basel II regulations
  – Does not really provide any kind of traceability
• He et al. use ReCAPS to ensure policy- and requirements-compliant systems
  – Does not include business processes
• Breaux et al. use semantic parameterization to extract rights and obligations from the HIPAA privacy rules
  – No links to organization policies and procedures

Leveraging Models in Designing for Compliance and Performance Management

jUCMNav

DOORS

Healthcare Case Study

• Policies and procedures for accessing a healthcare data warehouse in a major teaching hospital in Ontario, Canada
  – Focus on researchers as main information users
• Compliance with privacy legislation
• PHIPA: Personal Health Information Protection Act (Ontario)
  – Aims to protect the privacy and confidentiality of personal health information while facilitating the provision of healthcare.
  – Set of rules for the collection, use and disclosure of personal health information.
  – 75 sections, amended five times since 2004.

Example: GRL Model of the Law

Legislation Document: "A hospital shall not use the personal information of an individual unless a) it has the individual's consent and b) the information is necessary for a lawful purpose."

[Figure: GRL model of the law for the actor Hospital, with the elements Prevent Unauthorized Use, Have Individual Consent, and Have Legal Purpose; source links connect the GRL model to the legislation document.]

Example: URN Model for an Organization

Completeness issues and inconsistencies could be detected during modelling…

[Figure: URN model of the organization for the actor Hospital, with the GRL elements Prevent Unauthorized Use, Limit Use to Authorized User, Have Username and Password, and Have Individual Consent; resp links connect GRL elements to a UCM component and responsibility.]

EXAMPLE: GRL Model of Organization

Compliance Management Framework

• Provides a set of links to connect the policy and procedure documents of an organization to legislation documents
• Other links/models provide little return on investment

Requirements Management Framework

• Each model includes some internal links

  – Source Links:
    • Organization GRL and UCM models → Policy and procedure documents
    • Legislation GRL model → Legislation documents

  – Responsibility Links:
    • UCM model → GRL model (of the healthcare organization)

• Between the two models are 3 link sets used to establish and track compliance:

  – Traceability Links:
    • GRL model of organization → GRL model of legislation

  – Compliance Links:
    • GRL model of organization → the text document of the law

  – Responsibility Links:
    • UCM model of organization → GRL model of legislation
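The three link sets between the organization and legislation models lend themselves to a simple completeness check. The sketch below is a hypothetical Python illustration, not the framework's implementation: the `Link` tuple, the `uncovered` helper, and the example links are assumptions; only the link set names and the element names come from the slides.

```python
from collections import namedtuple

Link = namedtuple("Link", "kind source target")

# Link sets between the two models, as named on the slides.
TRACEABILITY = "traceability"      # organization GRL -> legislation GRL
COMPLIANCE = "compliance"          # organization GRL -> text of the law
RESPONSIBILITY = "responsibility"  # organization UCM -> legislation GRL


def uncovered(legislation_elements, links):
    """Legislation GRL elements reached by no traceability or responsibility link."""
    reached = {link.target for link in links
               if link.kind in (TRACEABILITY, RESPONSIBILITY)}
    return [e for e in legislation_elements if e not in reached]


# Element names follow the example slides; the links themselves are invented.
legislation_goals = [
    "Prevent Unauthorized Use",
    "Have Individual Consent",
    "Have Legal Purpose",
]
links = [
    Link(TRACEABILITY, "Limit Use to Authorized User", "Prevent Unauthorized Use"),
    Link(TRACEABILITY, "Have Individual Consent", "Have Individual Consent"),
    Link(COMPLIANCE, "Have Username and Password", "legislation text, clause (a)"),
]

gaps = uncovered(legislation_goals, links)
```

In this made-up example `gaps` contains "Have Legal Purpose": nothing in the organization model traces to it, which is the kind of completeness issue the modelling step is meant to surface.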

Evaluation of Link Types

Criteria | Traceability Link | Compliance Link | Responsibility Link
Granularity | Softgoals, Goals, Tasks & Actors | Legislative Text | Responsibilities, Components (Actors), Maps (Operational Processes)
Functionality | Handle Traceability of Non-Functional Requirements and Tasks | Handle Exceptions and Constraints | Handle Traceability of Business Processes
Quantity of Manual Links | Many | Small | Small
Precision | Precise | Very Precise | Very Precise
Difficulty | Moderate | Difficult | Moderate
Importance of Completeness | Very Important | Not Important | Very Important

Tool Support

• jUCMNav (a Java-based URN editor)
  – Allows users to create, load, modify and save a UCM model.
  – Supports the definition of GRL models.
  – Supports the creation of links between GRL elements and a UCM model.

• Telelogic DOORS
  – Supports compliance links (links between two URN models)
  – Supports changes to business processes or legislative documents

Traceability with Telelogic DOORS

[Figure: annotated Telelogic DOORS screenshot showing a project, a folder, a formal module, and a link module, with object headings and object text, a link indicator, and a suspect link. Change bars: green = "No change since baseline"; yellow = "Changed since baseline" (saved); red = "Changed this session" (unsaved).]

Compliance Tool Support (In DOORS)

• New Plug-in to the DXL library in Telelogic DOORS

– Setup Compliance Module

– URN Auto Complete Links

– Compliance Auto Complete Links

– Report Changes


Compliance Tool Support (In DOORS) User Defined URN Menu

Setup Compliance Module

• Add New Formal and Link Modules in Legislation Folder

Setup Compliance Module

• Add New Formal and Link Modules in Organization Folder

Framework Metamodel

• A metamodel is used to define links between URN models and between each URN model and its source document in the requirements management system (e.g. DOORS)

• It helps to identify which elements of the legislation model are connected to which elements of the organization model.

• It helps to determine which links need to be created manually and which ones can be inferred automatically.

[Figure: UML class diagram of the framework metamodel. The Organization Metamodel (Organization Model) contains PoliciyProcedureDocument, GRLdiagram, UCMmap, IntentionalElement, Actor, Association, Responsibility, Component and Stub, with their references (IntentionalElementRef, ActorRef, RespRef, ComponentRef) and refines, boundTo, ref, resp and source associations. The Law Metamodel (Legislation Model) contains LawDocument, Clause, Definition, GRLdiagram, IntentionalElement, Actor and Association, with IntentionalElementRef and ActorRef. Three numbered links connect the two metamodels: 1 traces, 2 complies, 3 resp. Link legend: Manual-jUCMNav, Manual-DOORS, Automated-jUCMNav, Manual and automated-DOORS.]

Compliance Management Framework

• Provides a set of links to connect the policy and procedure documents of an organization to legislation documents
• Other links/models provide little return on investment

[Figure: the Organization Model (Policies and Procedure Documents; GRL: Softgoals, Goals, Tasks and Actors; UCM: Business Processes) beside the Legislation Model (Law and Legislation Documents; GRL: Softgoals, Goals, Tasks and Actors). Source Links connect each GRL model to its documents. Between the two models: 1- Traceability Link, 2- Compliance Link, 3- Responsibility Link.]


Auto-completion Mechanism

• Responsibility and compliance links (via DXL scripts), e.g.:

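– Organization-Actor traces Legislation-Actor, and Legislation-Actor sources Legislation-Definition; automated: Organization-Actor complies Legislation-Definition

– Organization-Intentional Element traces Legislation-Intentional Element, and Legislation-Intentional Element sources Legislation-Clause; automated: Organization-Intentional Element complies Legislation-Clause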

Auto-Completion Mechanism

• Responsibility Link

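– Organization-Component resps Organization-Actor, and Organization-Actor traces Legislation-Actor; automated: Organization-Component resps Legislation-Actor

– Organization-Responsibility resps Organization-Intentional Element, and Organization-Intentional Element traces Legislation-Intentional Element; automated: Organization-Responsibility resps Legislation-Intentional Element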
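The auto-completion rules for compliance and responsibility links can be read as composing two existing links into a third. The sketch below illustrates that reading in Python; it is not the DXL plug-in itself, and the `Link` type, the `compose` helper and the element names are hypothetical.

```python
from typing import NamedTuple


class Link(NamedTuple):
    kind: str    # "traces", "sources", "resps" or "complies"
    source: str  # element the link starts from
    target: str  # element the link points to


def compose(links, first_kind, second_kind, derived_kind):
    """Derive A -derived_kind-> C wherever A -first_kind-> B and B -second_kind-> C."""
    existing = set(links)
    derived = []
    for a in links:
        if a.kind != first_kind:
            continue
        for b in links:
            if b.kind == second_kind and b.source == a.target:
                candidate = Link(derived_kind, a.source, b.target)
                if candidate not in existing:
                    existing.add(candidate)
                    derived.append(candidate)
    return derived


def auto_complete(links):
    """Apply the two composition rules: traces+sources => complies, resps+traces => resps."""
    compliance = compose(links, "traces", "sources", "complies")
    responsibility = compose(links, "resps", "traces", "resps")
    return compliance + responsibility


# Hypothetical manual links, named after the kinds of elements on the slides.
manual_links = [
    Link("traces", "Org:Actor:DW Administrator", "Law:Actor:HIC"),
    Link("sources", "Law:Actor:HIC", "Law:Definition:HIC"),
    Link("resps", "Org:Component:Privacy Officer", "Org:Actor:DW Administrator"),
]
inferred = auto_complete(manual_links)
for link in inferred:
    print(link.kind, link.source, "->", link.target)
```

Only the links that are not already present are returned, so manually created links are never duplicated by the automated pass.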


Case Study – PHIPA Compliance at Ontario Hospital

[Figure: the PHIPA document and the hospital policy document, each linked to its model.

PHIPA Document:
- HIC: Person or organization who has custody of PHI.
- A HIC may disclose PHI to a researcher if he/she, (a) submits: (i) an application, (ii) a research plan, (iii) a copy of REB approval; (b) enters into the agreement.

Hospital Policy Document:
- All requests for data from data warehouse will be evaluated based on technical feasibility, data availability, resource availability and REB approval for research.
- Policy 2…

The hospital GRL model (goals such as Protect Privacy and Confidentiality of Hospital Data, Ensure Accountability of Data User, Prevent Unauthorized Use and Disclosure; actors such as DW Administrator, Privacy Officer and REB), the GRL model of PHIPA (goals such as Satisfy Privacy Regulations, Protect Confidentiality, Prevent Unauthorized Disclosure, Limit Disclosure of Data; actor HIC) and the UCM model of the hospital (components Researcher, Hospital and REB Committee; responsibilities such as requestForPHI, reviewRequest, getToAnAgreement, amendDocuments, getRejection) are connected by source, traces, complies and resp links.

Note on the figure: Discrepancies could be detected during modelling…]

Evolution of Privacy Legislation or Business Processes

• The compliance links defined in the Requirement Management Framework help to manage the impact of different types of changes and help ensure that compliance is maintained.

Legislation Evolution Business Process Evolution


Evolution of Privacy Legislation

• Different scenarios by which legislation documents can be amended:

– Addition of a New Clause
  • The clause refers to an existing actor, softgoal, goal or task
  • It introduces a new actor, softgoal, goal or task

– Modify a Clause with Links

– Delete a Clause with Links

– Modify a Clause without Links
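The modify and delete scenarios above can be illustrated with a small impact-analysis sketch. It assumes compliance links are available as a mapping from clause identifiers to linked model elements; the `ComplianceLinks` class, the clause identifiers and the element names are hypothetical, and in the framework itself this bookkeeping is done in DOORS (for example through suspect links).

```python
from dataclasses import dataclass, field


@dataclass
class ComplianceLinks:
    """Hypothetical in-memory stand-in for clause-to-element links kept in DOORS."""
    by_clause: dict = field(default_factory=dict)  # clause id -> set of model elements
    suspect: set = field(default_factory=set)      # (clause id, element) pairs to re-check

    def link(self, clause, element):
        self.by_clause.setdefault(clause, set()).add(element)

    def modify_clause(self, clause):
        """Flag every link of a modified clause as suspect; no links means no impact."""
        affected = sorted(self.by_clause.get(clause, set()))
        self.suspect.update((clause, element) for element in affected)
        return affected

    def delete_clause(self, clause):
        """Drop a clause and return the elements that were linked to it."""
        orphaned = sorted(self.by_clause.pop(clause, set()))
        self.suspect = {pair for pair in self.suspect if pair[0] != clause}
        return orphaned


links = ComplianceLinks()
links.link("clause-A", "Ask for Research Plan")
links.link("clause-A", "Check REB Approval")

affected = links.modify_clause("clause-A")   # clause with links: elements need review
untouched = links.modify_clause("clause-B")  # clause without links: nothing to review
orphaned = links.delete_clause("clause-A")   # deleted clause: elements lose their link
```

A clause without links yields an empty list, which matches the intuition that such a change has no traceable impact on the organization model.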

Example – Amendment to PHIPA

[Figure: the Privacy Legislation Document and Privacy Legislation Model beside the Health Information Custodian Model, connected by source, traces, complies and resp links.

Privacy Legislation Document:
- HIC: Person or organization who has custody of PHI.
- A research plan must be in writing and must set out,
  a) the affiliation of each person involved in the research
  b) the nature and objectives of the research
  c) the safeguards that the researcher will impose to protect the confidentiality and security of the PHI
  d) a description of the reasonably foreseeable harms and benefits that may arise from the use of the PHI

Model elements shown include Protect Privacy and Confidentiality of Hospital Data, Prevent Unauthorized Use and Disclosure, Ensure Accountability of Data User, Check Ethical Issues, Review User's Request Form, Check The Technical Competency, Check Users Safeguards, Check with Privacy and Confidentiality Legislations, Satisfy Privacy Regulations, Protect Confidentiality, Prevent Unauthorized Disclosure, Limit Disclosure of Data, Ask for REB Approval, Ask for Research Plan, Check Compliance Agreement; actors DW Administrator, REB, Privacy Officer and HIC; UCM components Researcher, Hospital and REB Committee with responsibilities such as submitApplication, getTheForm, filloutTheForm, receiveDocuments and submitREBDocuments.]


Managing Evolving Business Processes or Policies

• A policy or business process can evolve in 3 ways:

– Modification of an existing process or policy
  • The existing process or policy has links to its GRL model and to the legislation GRL model
  • The existing process or policy does not have links to its GRL model or legislation GRL model

– Addition of a new process or policy element

– Removal of a process or policy element

Example – Hospital Business Process Changed

(modification of a UCM responsibility)

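Analysis of the Framework

• The Requirement Management Framework requires less effort to document compliance and manage evolution.
• It also provides almost full coverage.

Approaches compared (Definition):
– Document-based: No Model and No Link
– Model-based: Models but No Link
– Full Compliance Framework: Models, Documents and Links

• Effort
– Modeling: Document-based Zero; Model-based High; Full Compliance Framework High
– Documenting Compliance: Document-based Highest; Model-based High; Full Compliance Framework Low
– Managing Evolution: Document-based Highest; Model-based High; Full Compliance Framework Low

• Coverage
– Modeling: Document-based Complete; Model-based Almost Complete; Full Compliance Framework Complete
– Documenting Compliance: Document-based Almost Zero; Model-based Incomplete; Full Compliance Framework Complete
– Managing Evolution: Document-based Almost Zero; Model-based Low; Full Compliance Framework Almost Complete

• Comprehensibility: Document-based Low; Model-based High; Full Compliance Framework High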

Conclusions and Future Work

• Conclusions
– The Full Compliance Framework is the best of the compared solutions for documenting compliance and responding to change.
• Future Work
– How best to support a variable number of legislation and regulation documents, which could contradict each other?
– How can we build legislation models so they are easily reusable across different areas?
– How can we integrate compliance into a business process framework (audit trail, monitoring, metrics)?


Disclaimer

The IBM Center for Advanced Studies (CAS) regularly publishes technical documents that are aimed at facilitating an exchange of information. These reports are written by individuals or groups of authors and represent their opinions and do not necessarily reflect those of IBM. In the event of questions or concerns regarding individual reports or presentations, email addresses have been provided for most authors. Alternatively, please feel free to contact CAS at [email protected]. CAS is dedicated to providing a forum for IBM employees, research affiliates, and university students to publish results of their work, their views on technical issues, book reviews, and workshop reports.

Trademarks

• Company, product, or service names identified in the text may be trademarks or service marks of IBM or other companies. Information on the trademarks of International Business Machines Corporation in the United States, other countries, or both is located at http://www.ibm.com/legal/copytrade.shtml.

• Other company, product or service names may be trademarks or service marks of others.

Requirements-Driven Design and Configuration Management of BPs

Alexei Lapouchnian (with Yijun Yu, John Mylopoulos)

Department of Computer Science University of Toronto, Canada

RDBPMPM 29.10.2008

Our Aim (1)

• Develop a method for requirements-driven specification and configuration of business processes
• At design time:
– Ability to explicitly specify, refine and analyze goals of business processes
– Ability to model quality constraints related to BPs
– Ability to capture alternative ways of achieving the goals of BPs and to model their effects on quality constraints – high-variability models
– Generation of high-variability executable BPs based on goal models

Our Aim (2)

• Fine-grained modeling of BPs based on varying environment characteristics
• At runtime:
– Traceability to requirements
– Configuration of BPs based on current business priorities
  • Specified at high level in user-oriented terms
– Support for BP adaptation based on dynamically changing domain properties
– BP adaptation based on BAM

Talk Overview

• Requirements-driven configuration of BPs
• Using contexts to capture domain variability and its effects on BPs
• Towards adaptive BPs

Motivation

• A Business Process (BP) is seen as a collection of activities that achieves a certain purpose
• However, BPs are usually represented as:
– workflows (e.g., BPMN), executable workflows (e.g., WS-BPEL)
– Can capture and analyze certain aspects of BPs, but…
• Problems exist:
– Purpose is rarely explicitly represented
– Workflows do not usually capture:
  • The choices in achieving the goals of BPs
  • The rationale for sequencing and other decisions, non-functional aspects
  • Domain properties and how they influence the process
– Hard to:
  • Select process configuration with desired quality characteristics
  • Configure a process based on the characteristics of the domain
  • Configure processes at high level, w/o extensive knowledge of modeling notation
– Maintenance and adaptation are difficult

Supply Customer BP Example

[Figure: goal model of the Supply Customer BP with callouts: Root Goal, Quality Goal (Softgoal), Functional Goal, Control Flow Annotations, Variation Point, and Softgoal Contribution labels Help (+), Make (++), Hurt (–), Break (– –).]

Increasing Precision with Annotations

• Goal models do not capture enough information to be useful in semi-automatic generation of workflows
• Annotations to increase precision
• Annotations
– Control flow (sequence, concurrency, loops, conditions)
– External events
– Input/output can also be modeled
• Annotations – capture domain properties, not design decisions!

[Figure: the annotated goal model with callouts: Sequence, Loops, Concurrency, Conditions, External Events.]

High-Level BP Configuration

• Want to configure BPs based on preferences over quality attributes (softgoals)
• Main mechanism to capture variability and to configure goal models: preference-driven OR decompositions, or Variation Points
– (vs. data-driven or event-driven process variations)
– E.g., Ship Order: either Ship Standard or Ship Express
• Alternatives achieve the same thing
• Difference in softgoal contributions
• Softgoals act as selection criteria for choosing BP configurations
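To make the role of softgoals as selection criteria concrete, here is a minimal sketch that picks one alternative per variation point from ranked preferences. The slides describe a qualitative analysis; the numeric encoding of Help/Make/Hurt/Break, the preference weights, and the contributions assigned to Ship Standard and Ship Express are all assumptions made for illustration.

```python
# Assumed numeric encoding of the qualitative contribution labels.
CONTRIBUTION = {"make": 2, "help": 1, "hurt": -1, "break": -2}


def score(contributions, preferences):
    """Weighted sum of an alternative's softgoal contributions."""
    return sum(
        preferences.get(softgoal, 0) * CONTRIBUTION[label]
        for softgoal, label in contributions.items()
    )


def configure(variation_points, preferences):
    """Choose, for each variation point, the alternative with the highest score."""
    return {
        vp: max(alternatives, key=lambda name: score(alternatives[name], preferences))
        for vp, alternatives in variation_points.items()
    }


# Hypothetical contributions for the Ship Order variation point.
variation_points = {
    "Ship Order": {
        "Ship Standard": {"Cost": "help", "Customer Satisfaction": "hurt"},
        "Ship Express": {"Cost": "hurt", "Customer Satisfaction": "help"},
    }
}

cost_first = configure(variation_points, {"Cost": 3, "Customer Satisfaction": 1})
satisfaction_first = configure(variation_points, {"Cost": 1, "Customer Satisfaction": 3})
print(cost_first, satisfaction_first)
```

Both alternatives achieve the same functional goal; only the preference weights decide which one ends up in the configuration.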


Variation Points

Configuration Option 1

Configuration Option 2

Generating Executable BP Specifications

• Generate WS-BPEL processes from goal models
– Semi-automatic
– Structurally similar to the source goal models
– Variability preservation

• Mapping Rules
– Leaf-level goals → WS invocations (or BPEL process invocations)
– Higher-level goals → Not explicitly mapped; they help in the generation of BPEL control flow constructs
– Conditions/Loops → BPEL conditions, loops
– Event-driven OR → pick with multiple onMessage constructs
– Softgoals → not mapped
– Appropriate portTypes, messages (if I/O is specified), etc. are generated
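A rough sketch of the first two mapping rules follows: leaf goals become `invoke` activities, and annotated higher-level goals only contribute a control-flow container (`sequence` or `flow`). The `Goal` class, its `flow` annotation field and the sub-goal names are hypothetical, and the output is an un-namespaced skeleton rather than a deployable WS-BPEL process.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    children: list = field(default_factory=list)
    flow: str = "sequence"  # control-flow annotation: "sequence" or "parallel"


def to_bpel(goal):
    """Leaf goal -> <invoke>; AND-decomposed goal -> <sequence> or <flow> of its children."""
    if not goal.children:
        return ET.Element("invoke", {"name": goal.name, "operation": goal.name})
    container = ET.Element("sequence" if goal.flow == "sequence" else "flow")
    for child in goal.children:
        container.append(to_bpel(child))
    return container


# Hypothetical fragment of the Supply Customer goal model.
root = Goal("SupplyCustomer", [
    Goal("GetOrder"),
    Goal("CheckOrder", [Goal("CheckStock"), Goal("CheckCredit")], flow="parallel"),
    Goal("ShipOrder"),
])

process = ET.Element("process", {"name": root.name})
process.append(to_bpel(root))
tags = [element.tag for element in process.iter()]
print(ET.tostring(process, encoding="unicode"))
```

Because the generated tree mirrors the goal tree, the result stays structurally similar to the source goal model, as the slide requires.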

Mapping Variation Points

• They are a tool to configure executable (and, possibly, deployed) BPs
• Mapping to WS-BPEL
– Generate switch (or if-elseif for BPEL 2.0)
– Each switch branch corresponds to an alternative
– Branch condition (XPath) checks if it is the current choice for the variation point
– Activities in each branch are automatically generated based on the particular alternative in the goal model
– All choices are to be implemented!
• BPEL process skeleton is generated. It needs to be completed.
• The result: High-Variability Executable Model (HVEM) of the business process
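The variation-point mapping can be sketched as generating a BPEL 2.0 style `if`/`elseif` block with one branch per alternative. The helper name, the `$config` variable and the form of the XPath condition are assumptions for illustration; the slides do not show the actual generated conditions.

```python
import xml.etree.ElementTree as ET


def variation_point_to_if(vp_name, alternatives, config_var="$config"):
    """Build an <if> whose branches test which alternative is currently selected.

    alternatives: list of (choice_name, activity_element) pairs.
    """
    if_element = ET.Element("if", {"name": vp_name})
    for index, (choice, activity) in enumerate(alternatives):
        # The first alternative lives directly in <if>; the others in <elseif>.
        branch = if_element if index == 0 else ET.SubElement(if_element, "elseif")
        condition = ET.SubElement(branch, "condition")
        condition.text = f"{config_var}/{vp_name} = '{choice}'"
        branch.append(activity)
    return if_element


alternatives = [
    ("ShipStandard", ET.Element("invoke", {"name": "ShipStandard"})),
    ("ShipExpress", ET.Element("invoke", {"name": "ShipExpress"})),
]
if_block = variation_point_to_if("ShipOrder", alternatives)
print(ET.tostring(if_block, encoding="unicode"))
```

Every alternative gets a branch, so the deployed process can be switched between choices by changing only the configuration value the conditions read.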


Configuring HVEMs at Runtime

• Prototype infrastructure
– GUI for preference input
– Configurator WS
  • Can come up with goal model config based on prefs
  • Can provide the configuration to deployed BPs
– XML Representation for BP configuration

[Figure: screenshot of the Process Model Configurator GUI. Process model: Supply_Customer; profile: Cost_Performance; quality attribute preferences set with min/max sliders and ranks (Cost: rank 1, Customer Satisfaction: rank 3); button: Launch Process Instance.]

BP Configuration

[Figure: a goal model (G, G1 to G6, with conditions c1 and c2) and numbered steps 1 to 6 leading to a configured process; labels include ShipOrder and ShipStandard.]

• Given high-level preferences, the appropriate configuration is selected
• A deployed generic process gets the appropriate configuration.

(Slide footer: BPM 2007, Brisbane, Australia, 26 September, 2007)

Discussion

• Benefits
– Model BPs at high level
– Explicit representation of quality attributes (softgoals)
– Analysis of alternatives w.r.t. softgoals
– Traceability to requirements
– BP configuration at high-level, in user-oriented terms
– High-level visualization of BPs using goal models is possible
• Drawbacks
– Explicit representation of BP alternatives
– Qualitative analysis

Talk Overview

• Requirements-driven configuration of BPs
• Using contexts to capture domain variability and its effects on BPs
• Towards adaptive BPs

Beyond Intentional Variability

• Intentional variability cannot capture all the details needed for flexible, adaptable systems. Too high-level.

• How can we model the fact that automatic approval for large orders is very risky, while for others, it's just risky?

Taking the environment into consideration

• Environment (domain) characteristics and their dynamics affect the requirements and must be taken into consideration! We take the intentional view.
• Domain characteristics influence the requirements and the behaviour of the system. In particular, they can:
– Affect the requirements themselves: what is required of the system, i.e., system goals.
– Affect/constrain goal refinement, e.g., change the available alternatives to achieve system goals.
– Change how alternatives are evaluated w.r.t. softgoals
– Change priorities over quality constraints

Our view of context

• Properties of the environment that affect requirements are captured using contexts.
• Basic definition of a context:
– Abstraction over domain assumptions and/or properties.
• Many things can be considered to be parts of a context:
– Hardware/software configuration
– CPU/bandwidth utilization, current system load, etc.
– History of user interactions
– History of attempted adaptations
– Current and recent system/process metric values
– …

Capturing contexts

• Domain
• Context Entities
– Things that the process works on (documents/data, supplies, etc.) or with (users, business partners, legacy systems)
– Influence the process (management, owners, government regulators)
– Other things (date, time, weather conditions, etc.)
• Context entities have context dimensions
– Properties that affect the requirements (e.g., size, location, importance, utilization).

Context refinement hierarchies

• Contexts can be arranged in hierarchies
– Structure the domain
– Simplify the modeling of the effects that contexts have on the model
• Leaf-level contexts are defined to determine when they are active.
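One way to picture a context refinement hierarchy is as a tree in which only the leaves carry a definition (a predicate over domain properties) and a parent is considered active when any of its refinements is active. That activation rule, the `Context` class and the order-size thresholds below are assumptions for illustration, not definitions taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Context:
    name: str
    definition: Optional[Callable[[dict], bool]] = None  # set on leaf contexts only
    children: list = field(default_factory=list)

    def active(self, domain):
        """Leaf: evaluate its definition. Parent: active if any refinement is active."""
        if self.definition is not None:
            return self.definition(domain)
        return any(child.active(domain) for child in self.children)


# Hypothetical refinement of an order-size context with assumed thresholds.
large = Context("Size(order, large)", lambda d: d["order_amount"] >= 10_000)
medium = Context("Size(order, medium)", lambda d: 1_000 <= d["order_amount"] < 10_000)
substantial = Context("Size(order, substantial)", children=[large, medium])

print(substantial.active({"order_amount": 2_500}))  # a medium order is substantial
print(large.active({"order_amount": 2_500}))
print(substantial.active({"order_amount": 50}))
```

Effects can then be attached to `substantial` once instead of being repeated for every leaf, which is the simplification the hierarchy is meant to provide.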

Adding contexts to goal models

EASS 2008 October 28, 2008 25

More Examples of Context

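[Figure: goal model fragments with context annotations. The context Location(order, international) is attached to the goal Get Customs Clearance, and the context Size(order, substantial) to another AND-decomposed goal. Other node labels in the figure: Prepare, Apply, Paperwork, Customs, Discount (their grouping is not recoverable from the extracted text).]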

Using contexts

• Several ways context can be used:
– Process instance configuration to suit the current context. E.g. (the BP example), look at current date/time, order/customer properties, product characteristics, shipping destination, etc.
– Dynamic environment results in dynamic context. Thus, context can be monitored and process instances can be adapted to changing contexts.

Supporting Dynamic Contexts

• Adaptive systems change in dynamic environments – i.e., when context changes
• Need to understand which contexts are static and which are dynamic within a process instance
– Sometimes it's not clear. E.g., customer risk in the presence of multiple process instances.
• Change in context entity indicates change in context
• Context monitoring using context definitions

Talk Overview

• Requirements-driven configuration of BPs
• Using contexts to capture domain variability and its effects on BPs
• Towards adaptive BPs

BP Reconfiguration/Adaptation

• Adaptation is done through BP reconfiguration – switching to a better alternative.
• Anticipated adaptations
• Analysis and reconfiguration is done at high level in terms of goal model alternatives.
• Causes for reconfiguration:
– Change in requirements (e.g., business user priorities)
– As a response to BP monitoring
– Changing environment (context)

[Figure: an Autonomic Manager loop (Monitor, Analyze, Plan, Execute) that takes Preferences, External input and Feedback, works on the HVGM, and passes a Configuration to the Deployed HVEM.]

Adaptive BPs

• Must have measurable metrics
– Lead/lag indicators
• From qualitative to quantitative analysis
– Operationalization of softgoals
  • Mapping to metrics/KPIs
  • Softgoal desired satisfaction levels => target values or ranges
• Monitoring
– Can be done in the infrastructure
– Sometimes needs to be exposed – e.g., customer satisfaction
• Evaluation
– Quantitative
– Qualitative is still useful!
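The step from softgoals to metrics with target ranges can be sketched as follows. The `Kpi` class, the metric names and the target ranges are hypothetical; the point is only that a monitored value outside its range marks the corresponding softgoal as a candidate trigger for reconfiguration.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    softgoal: str  # softgoal being operationalized
    metric: str    # name of the monitored metric
    low: float     # lower bound of the target range
    high: float    # upper bound of the target range

    def satisfied(self, value):
        return self.low <= value <= self.high


def violated_softgoals(kpis, measurements):
    """Return softgoals whose monitored metric falls outside its target range."""
    return [
        kpi.softgoal
        for kpi in kpis
        if kpi.metric in measurements and not kpi.satisfied(measurements[kpi.metric])
    ]


# Hypothetical operationalization of two softgoals.
kpis = [
    Kpi("Cost", "avg_shipping_cost", 0.0, 12.0),
    Kpi("Customer Satisfaction", "survey_score", 4.0, 5.0),
]
measurements = {"avg_shipping_cost": 9.5, "survey_score": 3.2}
violations = violated_softgoals(kpis, measurements)
print(violations)
```

Metrics that are not reported are skipped rather than treated as violations, which reflects the slide's remark that some measurements, such as customer satisfaction, must be exposed explicitly.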

BP Reconfiguration

[Figure: numbered interaction steps (1 to 9) among a Stakeholder, the Process Server/WfMS, the Monitoring Infrastructure, the Analyzer/Decision Maker and the Autonomic Manager. Step labels: Create Process Instance, Instantiate, Deploy, Configure, Post Events, Provide Monitoring Results, Update Preferences, Select Alternative, Reconfigure.]

Current/Future Work

• Runtime optimization of BPs
– From softgoals to metrics/KPIs
– Runtime evaluation of BP performance
• Context-based adaptation of BPs
– Performance contexts
  • Contexts based on how well targets are being met
– Modeling mitigation techniques
• From contexts to BRs
• Integration of preference-based, context-based, performance-based adaptation

[email protected]


Registration opens at 7:30 am.

Time Scheduled Events / Activities

8:30 a.m. - 10:00 a.m.
Opening Remarks, Best Paper Awards
Keynote: John MacDonald, Robarts Institute & Igor Jurisica, Ontario Cancer Institute
Joint Topic: Medical Research Breakthroughs - Enabled by Technology
Location: Grand Richmond Ballroom

10:15 a.m. - 11:45 a.m.
Technical Papers Sessions
Locations: Grand Richmond Ballrooms A/B, C, D

11:45 a.m. - 1:00 p.m.
Technology Showcase and Lunch
Location: York Ballroom A
Showcase continues until 5:00 p.m.

12:00 p.m. - 12:45 p.m.
FoSP Plenary
Speaker: Joanna Ng, IBM
Location: York Ballroom C
Topic: CAS - From Research Program to Research Commercialization - the complete cycle

1:00 p.m. - 4:45 p.m.
Workshops
See Workshops section for details
15 minute break

5:00 p.m. - 6:00 p.m.
FoSP Plenary
Speaker: John Ponzo
Location: York Ballroom B
Topic: TBA

From 6:00 p.m.
CASCONCamp
Location: York Ballroom C


Registration opens at 8:00 am.

Time Scheduled Events / Activities

9:00 a.m. - 10:00 a.m.
'Pat Selinger PhD Fellowship' award presentation
Keynote: Alex Nicolaou, Google
Topic: Chrome and chromium.org: Google's contribution to browsers
Location: Grand Richmond Ballroom

10:15 a.m. - 11:45 a.m.
Technical Papers Sessions
Locations: Grand Richmond Ballrooms A/B, C, D

11:30 a.m. - 1:00 p.m.
Women in Technology Luncheon
Location: York Ballroom B

11:45 a.m. - 1:00 p.m.
Technology Showcase and Lunch
Location: York Ballroom A
Showcase continues until 5:00 p.m.

1:00 p.m. - 4:45 p.m.
Workshops
See Workshops section for details
15 minute break

5:00 p.m. - 6:00 p.m.
FoSP Plenary
Speaker: Brent Hailpern
Location: York Ballroom B
Topic: Technology and Social Trends in Software Development


Registration opens at 8:00 am.

Time Scheduled Events / Activities

9:00 a.m. - 10:00 a.m.
CAS Awards presentation
Keynote: Paul Kedrosky, Kauffman Foundation
Topic: Commercialization on a dollar a day (almost)
Location: Grand Richmond Ballroom

10:15 a.m. - 11:45 a.m.
Technical Papers Sessions
Locations: Grand Richmond Ballrooms A/B, C, D

11:45 a.m. - 1:00 p.m.
Technology Showcase and Lunch
Location: York Ballroom A
Showcase continues until 2:30 p.m.

12:00 p.m. - 12:45 p.m.
FoSP Plenary
Speaker: Nagui Halim, IBM (Topic TBA)
Location: York Ballroom C

1:00 p.m. - 4:45 p.m.
Workshops
See Workshops section for details
15 minute break


Location: Grand Richmond Ballroom

9:00 a.m. - Keynote presentation
Alex Nicolaou
Mobile Engineering Manager, Google
Topic: Chrome and chromium.org: Google's contribution to browsers

Abstract
All of us at Google spend much of our time working inside a browser. We search, chat, email and collaborate in a browser. And in our spare time, we shop, bank, read news and keep in touch with friends -- all using a browser. Because we spend so much time online, we began seriously thinking about what kind of browser could exist if we started from scratch and built on the best elements out there. What we really needed was not just a browser, but also a modern platform for web pages and applications, and that's what we set out to build. In this talk we'll overview the features and their implementations, and the challenges they are meant to address. This is just the beginning -- Google Chrome is far from done. We're releasing the source to start the broader discussion and hear from you as quickly as possible.

5:00 p.m. - Frontiers of Software Practice Speaker
Brent Hailpern
Director, Programming Models & Tools, IBM
Topic: Technology and Social Trends in Software Development

Abstract
One view of the software development space categorizes tools into single-developer environments, small team collaboration, support for worldwide organizations, and tools to deliver business value. Some of these categories are mature, while others are much more exploratory. This categorization, however, does not rest on bedrock. There are technological and societal trends that will challenge the underlying assumptions and value propositions associated with each of these categories. This presentation will explore those trends and try to understand their implications.


Location: Grand Richmond Ballroom

9:00 a.m. - Keynote presentation Paul Kedrosky Registration is now closed. CASCON 2009 will take place Kauffman Foundation on November 2-5. Topic: Commercialization on a dollar a day (almost) CASCON 2008 Highlights Highlights and video on Abstract ITWorldCanada How to test technologies, people, and markets without breaking the bank or having to deal (too often) with venture capitalists. More info Levon Stepanian Java JIT Compiler Development, IBM Canada Lab Read the highlights article Topic: TBA View (28kb) Get Adobe® Reader® Abstract TBA

12:00 p.m. - Frontiers of Software Practice Speaker Nagui Halim CASCON 2008 Videos Director, Event and Streaming Systems, IBM TJ Watson Research Center Watch full-length videos from Topic: Data Streaming - Real-World, Real-Time the conference Location: York Ballroom C Click here Abstract A 2007 IDC study estimates that the world generated 161 billion gigabytes of digital information, and that the pace of increase in the information we deal with will outstrip our capacity to store it by 2010. All this data -- conversations, television programs, music, CASCON Proceedings movies, stock trades, commodities values, medical images, shopping lists, and test results -- isn't just a statistical artifact. It is the CASCON Proceedings are stuff that drives the scientific, economic, and social engines of our society. The fundamental difference between the computing available on the that most of us do every day, and stream computing. In traditional computing the machine dictates the pace at which things gets ACM Digital Library done. In stream computing, the machine's job is to figure out what's going on in the real world in real time.

CASCON 2008 Speakers

Igor Jurisica, Ontario Cancer Institute (OCI) of the University Health Network

Igor Jurisica is a Senior Scientist at the Ontario Cancer Institute, Associate Professor in the Departments of Computer Science and Medical Biophysics at University of Toronto and Visiting Scientist at IBM's Centre for Advanced Studies. He also holds the position of Canada Research Chair in Integrative Computational Biology. He earned M.Sc. and Ph.D degrees in computer science from the University of Toronto.

His research focuses on computational biology and the representation, analysis and visualization of high-dimensional data from high-throughput biology experiments in the cancer informatics context. Interests include comparative analysis for mining different integrated data sets (e.g., protein-protein interactions, gene expression profiling, high-throughput screens for protein crystallization).


John MacDonald, Robarts Research Institute

John MacDonald is the Scientific Director of the Robarts Research Institute at the University of Western Ontario, a Professor of Physiology and Pharmacology, and a Fellow of the Royal Society of Canada.

His research is in the field of the regulation of excitatory synaptic transmission and synaptic plasticity by signal transduction pathways. One of Canada's preeminent stroke researchers, Dr. MacDonald joined Robarts from the University of Toronto, where he served as Chair of the Department of Physiology from 2001 to 2008 and ran a research laboratory investigating the cellular basis of neurological conditions such as stroke, pain, Alzheimer's disease, and schizophrenia.

Dr. MacDonald has published over 175 scientific papers and articles, and has trained more than 40 graduate students and post-doctoral trainees.


Paul Kedrosky, Kauffman Foundation

Paul Kedrosky is a senior fellow at the Kauffman Foundation, a private, nonpartisan foundation that works with partners to advance entrepreneurship in America and improve the education of children and youth. He shares his experience as a technology entrepreneur, venture capitalist, and academic.

Kedrosky is a venture capitalist, media personality, and entrepreneur. He is a sought-after speaker; an analyst for CNBC television; a columnist for TheStreet/RealMoney; the editor of Infectious Greed, one of the best-known business blogs on the Internet; and he is frequently quoted in major publications around the world.

Kedrosky obtained his undergraduate degree in engineering from Carleton University, his MBA from Queen's University, and his Ph.D from the University of Western Ontario.

Alex Nicolaou, Google

Alex Nicolaou joined Google in 2006, shortly after Google opened an office in Waterloo, Ontario, where he is involved in the development of mobile search and email products as a Mobile Engineering Manager. Until 2006, Nicolaou was president and board member of aruna.ca Inc., a startup developing a unique RDBMS based on text-search algorithms and data structures. Prior to that, Nicolaou was part of LiquiMedia Inc., a startup developing a real time kernel extension and Java Virtual Machine.

In 1996, Nicolaou and Jay Steele won the games category of the Java Cup Competition run by Sun Microsystems. In the early '90s, Nicolaou was principal investigator for six algorithm patents. He holds an Honours BMath in Computer Science and Combinatorics and Optimization and an MMath in Computer Graphics, both from the University of Waterloo.


Brent Hailpern, IBM

Brent Hailpern joined the IBM T. J. Watson Research Center as a Research Staff Member in 1980. In 2004, he became the Department Group Manager for Software Technology (IBM Research/Watson), where he managed departments researching Programming Technology, Software Engineering, and Tools for Non-Programmers. In 2006, he became the Director of Programming Models and Tools for IBM Research.

Dr. Hailpern has authored 15 journal publications and 16 United States patents, along with numerous conference papers and book chapters. He received his B.S. degree, summa cum laude, in Mathematics from the University of Denver in 1976, and his M.S. and Ph.D. degrees in Computer Science from Stanford University in 1978 and 1980 respectively.


Joanna Ng, CAS Toronto, IBM

Joanna Ng is the Program Director of the Center for Advanced Studies at IBM Toronto Software Laboratory, Canada and a Senior Technical Staff Member with IBM Software Group. She leads the strategy for a team which manages innovative projects between university researchers and IBM software developers.

The Center for Advanced Studies (CAS) provides fellowships for graduate students from Canadian and international universities. Ng has held various senior management and senior architect positions within IBM Software Group. She has also conceived and led various incubation projects in collaboration with IBM Research and nurtured them into commercialized products in the areas of compilers, mobile commerce, voice-enabled portal, commerce portal, retail industry solutions and service-oriented architecture (SOA) asset repository. Ng has been granted fifteen patents from different countries. She is also an IBM Master Inventor.


John Ponzo, Distinguished Engineer, Research Client Technologies, IBM T.J. Watson R.C.

John Ponzo is an IBM Distinguished Engineer at the IBM T.J. Watson Research Center located in Hawthorne, New York. John has over fifteen years of experience in software development. His major areas of interest are Web programming models, middleware, and tools. John has authored Web-based software that has been released in several Lotus, AIM, and Rational products. He is the lead architect for the IBM Research "Collaboration Services and Infrastructure" initiative focusing on Software as a Service innovations for Lotus Bluehouse.

John currently leads the Sametime Web project focusing on a highly customizable AJAX client and SDK for Sametime instant messaging services. John is the co-author of the 2008 Research Global Technology Outlook topic on Enterprise Mobile and is actively working on collaborative applications for emerging smartphone platforms.

Nagui Halim, Director, Event and Streaming Systems, IBM TJ Watson Research Center

Nagui Halim, director of event and streaming systems at IBM Research, will discuss IBM's stream computing efforts and where he sees the field going.


Levon Stepanian, Java JIT Compiler Development, IBM Canada Lab

Stepanian was a volunteer member of the Vietnam 1 IBM Corporate Service Corps team that worked with Subject Matter Experts belonging to the Danang Chapter of the Vietnam Chamber of Commerce and Industry in September 2008. The main outputs of his assignment were training and education sessions surrounding the topics of Software Development Processes, focusing on requirements engineering and testing, as well as CMMI Certification, Power Meter Reading technologies, and Object Oriented Modelling and Design using IBM Rational products.

Levon Stepanian is a software developer with the Java Just-In-Time Compiler team at the IBM Toronto Lab. Prior to working at IBM, Levon completed his M.Sc. degree, working with Kevin Stoodley, Allan Kielstra and Gita Koblents from IBM, and Professor Angela Demke Brown from the University of Toronto.



CASCON 2008 Call for Papers

The 18th Annual International Conference on Computer Science and Software Engineering

October 27-30, 2008

Sheraton Parkway Toronto North Hotel and Conference Centre, Ontario, Canada

http://www.ibm.com/ibm/cas/cascon

Theme: Computing for Next Generation Systems

Sponsored by

IBM Toronto Software Laboratory, IBM Centers for Advanced Studies

in partnership with

National Research Council Canada

CASCON 2008 is the 18th Annual International Conference hosted by the IBM Centers for Advanced Studies. This "Meeting of Minds" provides an exciting forum for exchanging ideas and experiences in the ever-expanding and critical fields of software development and computing. The CASCON 2008 program will include keynote presentations, technical papers, workshops, and a technology showcase. The technical papers program will feature experience reports and original research papers. The technology showcase will feature poster presentations and demonstrations on research in progress. As such, CASCON 2008 will be an excellent venue for presenting original work, exchanging new ideas, sharing results and experiences, and networking with researchers and practitioners from academia, industry, and government.

Important Dates

May 5, 2008: Abstracts for technical papers due
May 26, 2008: Full technical papers due
June 21, 2008: Acceptance notification to technical paper authors
August 11, 2008: Camera-ready technical papers due
October 27-30, 2008: Conference dates

Separate Calls for Workshop and Technical Showcase submissions will be sent out to those subscribed to the CASCON email notification list.

To subscribe, register your email address at: https://www-927.ibm.com/ibm/cas/cascon_main/subscribe.shtml

July 11, 2008: Workshop proposals due
August 15, 2008: Technology showcase submissions due

Topics

CASCON 2008 invites authors to submit original papers addressing any of the following topics:

Software Development Processes: development methods and techniques; measurement and metrics; agile methods; open-source and global software development; COTS based systems; requirements elicitation
Software Design & Comprehension: software architecture; object-oriented techniques; program understanding and reengineering
Reliability: verification and validation; model checking; software testing; reliability engineering; quality assurance
Software Development Tools: compilers; integrated development environments; testing frameworks; supporting distributed and parallel computation; supporting collaboration and awareness
Databases: query processing; information integration; semi-structured data; data mining; knowledge discovery; schema migration; digital libraries; data warehousing
Security and Privacy: cryptography; access control; security in domain-specific applications; privacy and trust
Services Science: service-oriented architectures; new computation and business models; legal, economic and societal issues
User Experience: interaction design; multimodal interfaces; privacy and trust; web-based interface design; development tool adoption
Web-based Systems: communities of practice; wikis and blogs; scripting languages; privacy and trust models; mobile web; semantic web
Autonomic Computing: provisioning and allocation of resources; policy-based management
Grid Computing: data management; scheduling; resource management; information services

Best Paper Awards

Two awards, Best Paper and Best Student Paper, will be presented to recognize the best technical contributions of the event in terms of originality, clarity and potential impact.

To be eligible for the "Best Student Paper" award, a paper must have been primarily authored by a student and the work that the paper reports must have been undertaken by the student authors. Only the student authors of the Best Student Paper award receive a prize. The recipients must have been students at the time the work was undertaken.

Submission Instructions

The deadline for paper submissions is May 26, 2008. However, please submit an abstract by May 5, 2008 to assist us with planning.

To submit a technical paper, please register at the submission site http://witanweb.ca/cascon2008/. Once you have registered, instructions will be provided on how to submit your abstract and paper. The paper submission web site will open on March 16, 2008.

Technical papers should be at most 15 two-column pages long, using 10pt font. Detailed style descriptions and templates are available at the submission web site.

CASCON 2008 will NOT accept submissions that have been previously published, are in press or have been submitted elsewhere.

The proceedings from CASCON 2008 will be published by IBM, and will be included in the ACM Digital Library.

To submit a workshop proposal, poster presentation, or demonstration, please refer to the conference site http://www.ibm.com/ibm/cas/cascon.

Please help us publicize CASCON 2008 by linking to the conference web site http://www.ibm.com/ibm/cas/cascon from your own Web site and by bringing this announcement to the attention of your colleagues.

We look forward to receiving your submissions and meeting you at CASCON in October.

General Chairs

Joanna Ng

Head, IBM Toronto Lab Center for Advanced Studies,

IBM Canada Ltd.

Christian Couturier

Director General, Institute for Information Technology,

National Research Council Canada

Program Chairs

Mark Vigder

National Research Council Canada

Marsha Chechik

University of Toronto


CASCON High School Programming Competition 2008 Download Instructions

Installation instructions

1. Download "I Have a Code" (Right-click and choose 'Save Target As...').
2. Make sure you have the newest version of Eclipse (3.4 Ganymede). Get Eclipse.
3. Unzip the file into the [Eclipse install dir]/eclipse directory, where [Eclipse install dir] is the directory in which Eclipse is installed. This action should create one or more files with names such as [Eclipse install dir]/eclipse/plugins/com.ibm.*.jar.
4. Start Eclipse. Make sure you have JRE 6.0 installed: Get JRE 6.0.
5. Go to File->New->Project...
6. Select Java->Java Challenge Project. Click on Next.
7. Select CodeColony from the dropdown menu and enter a project name. Click on Finish.
8. Select Window->Show View->Other... from the menu. Click Java->Game.
9. In the bottom Game panel, click Run a match.
10. Now you may add your own colony and sample colonies to play a match!
11. You may view the manual from Eclipse via Help -> Help Contents -> CodeColony.

Running a tournament

1. Set up one computer as the "host server".
2. With the host server, go to Window -> Preferences -> Java Challenge. Select the current CodeColony project under Project. Under server, click Start. The default 1234 server is fine.
3. With all of the computers (including the server host), go to the Game view. On the top right corner, click the small button.
4. There will be a window prompting for the Host. On the host server, localhost is correct. For other computers, type in the host server's IP address. The IP address can be found by running 'cmd' (command prompt) and typing 'ipconfig'.
5. At the bottom of the Game view, click submit code. You may have to stretch the Game view.
6. Once code has been submitted, that computer can see all of the currently submitted teams in the Test view.
7. You may now select these teams to play a match!
8. With the host server, under Window -> Preferences -> Java Challenge, you may also click Start Tournament and select teams to compete. Click Run Tournament to view the tournament.
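Step 4 of the tournament instructions finds the host server's IP address with 'ipconfig'. As an alternative, the following minimal Java sketch lists the IPv4 addresses of the machine it runs on. It is not part of "I Have a Code"; the class name HostAddressSketch and the decision to skip loopback and IPv6 addresses are assumptions made for illustration.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of the contest software.
public class HostAddressSketch {

    /** Returns the IPv4 addresses of all active, non-loopback network interfaces. */
    public static List<String> localIpv4Addresses() {
        List<String> found = new ArrayList<>();
        try {
            Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
            if (interfaces == null) {
                return found; // no network interfaces on this machine
            }
            for (NetworkInterface ni : Collections.list(interfaces)) {
                // Skip interfaces that are down and the loopback interface (127.0.0.1).
                if (!ni.isUp() || ni.isLoopback()) {
                    continue;
                }
                for (InetAddress addr : Collections.list(ni.getInetAddresses())) {
                    if (addr instanceof Inet4Address) {
                        found.add(addr.getHostAddress());
                    }
                }
            }
        } catch (SocketException e) {
            // Interfaces could not be queried; return whatever was collected so far.
        }
        return found;
    }

    public static void main(String[] args) {
        for (String address : localIpv4Addresses()) {
            System.out.println(address);
        }
    }
}
```

If the host server has more than one active interface, several addresses are printed; the one on the same network as the other computers is the one to type into the Host prompt.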


Competition Results

Here are this year's champions for "I Have a Code":

First place
Western Technical Commercial School
David Benjamin and David Hu
Teacher: Janina Rackoff
IBM: Victor Zhang

Second place
The Academy for Gifted Children (PACE)
Scot Brown and Cory Snider
Teacher: Lynda Yearwood
IBM: Victor Zhang

Third place
A. Y. Jackson
Jeff Wang and Yang Zhou
Teacher: Aloysius Tam
IBM: Victor Zhang

33 teams of two students participated this year. Many schools sent two teams, providing enough teams to require two rooms for the competition.

Participating schools:

A. Y. Jackson Secondary School, Toronto, ON
The Abelard School, Toronto, ON
The Academy for Gifted Children (PACE), Richmond Hill, ON
Anderson Collegiate, Whitby, ON
Crescent School, Toronto, ON
Crestwood Preparatory College, Toronto, ON
Heritage Christian School, Jordan Station, ON
Lester B. Pearson Collegiate Institute, Toronto, ON
Michael Power/St. Joseph H. S., Toronto, ON
Northern Secondary School, Toronto, ON
Pickering High School, North Ajax, ON
Pine Ridge Secondary School, Pickering, ON
Richmond Hill High School, Richmond Hill, ON
Saint Joan of Arc, Maple, ON
Saint Robert CHS, Thornhill, ON
Saint Thomas Aquinas Secondary School, Oakville, ON
Sir William Mulock Secondary School, Newmarket, ON
Thornhill Secondary School, Thornhill, ON
Thornlea Secondary School, Thornhill, ON
, Toronto, ON
Western Technical Commercial School, Toronto, ON
William Lyon Mackenzie CI, Toronto, ON

Interesting Notes:

0.999... == 1 / Western Tech and Weakened Random Colony / Richmond Hill H.S. consistently beat Super Colony / University of Toronto (the creator's colony)
Weakened Random Colony / Richmond Hill H.S. defeated all opponents in its first round
System.out.println("win!"); / P.A.C.E. >:3 amassed the most points in a single round: 143897
In Rainbows / AYJ actually used a single Kamikaze cell
Some matches did not play because they timed out.

The full scores can be found here. Blue highlighting indicates the game was replayed.


Registration Process

1. E-mail Notification. Only teachers should subscribe to the CASCON High School Programming Competition e-mail notification, and should do so as soon as possible to receive details about the competition and its associated workshop. In September, only teachers on this mailing list will be sent an e-mail regarding registration.

2. Register. An e-mail will be sent to you once the CASCON High School Programming Competition registration site opens in early September 2008. At this time, you must complete the online registration form, on which you will also register your 2 students.

Space for the competition is limited, and is available on a first-come, first-served basis, with only one student team eligible to participate per school. Teachers are advised to arrange one 2-person team before the registration date (participants must be in grade 12 or under), as participant information will be required at the time of registration. You will be able to change this information if a substitution is needed. Attendance confirmation, guidelines for the competition, and all communication will take place via e-mail to teachers by the end of September 2008. One teacher must accompany each team while at the event.

3. CASCON Workshops (optional). All teachers are welcome to participate, whether or not they have registered a team. Also, as teachers will not be able to advise students during the competition, participation in a CASCON workshop is highly recommended. An e-mail will be sent to you once the CASCON High School Programming Competition registration opens in September 2008. The workshops reserved for teachers are private; therefore, further instructions on workshop registration are available by signing up to the mailing list.

Note: If you would like to participate in an alternate workshop, you may do so. However, this is not recommended due to time conflicts.

On Wednesday, October 29, 2008, all registered participants are required to arrive at the Sheraton Parkway Hotel to pick up their badges between 8:30am-9:00am. The program will begin promptly at 9:00am.

Sheraton Parkway Toronto North Hotel and Convention Centre is located at:
600 Highway 7 East (at Highways 404 & 7)
Richmond Hill, Ontario L4B 1B2
Canada
Phone: (905) 881 2121 or 1-800-668-0101

A detailed map and instructions on how to get to the location will be provided upon registration.


CASCON High School Programming Competition & Teachers' Technology Workshop Itinerary

Wednesday, October 29, 2008

8:30am: Registration
9:00am: Keynote Speech
9:30am: Introduction and Q & A Session
10:30am: Competition / CASCON Workshops
12:00pm: Lunch/Technology Showcase
1:30pm: Competition Resumes / CASCON Workshops
2:30pm: Technology & Innovation (Guest Speaker)
3:30pm: Tournament and Awards Presentation


Question and Answer Session

After the keynote speech, participants in the CASCON High School Programming Competition will proceed to the competition room for a short question and answer session. While "I Have a Code" will be explained briefly, this session is meant primarily for students who may have questions for the competition director regarding the coding process.

We assume that all participants have seen the material already, as it has been posted since September 2008. Therefore, it is to your advantage to take some time to prepare.

Note: you are not permitted to bring any written or typed code into the competition, with the exception of one reference book per team. If written code or printed pages of code are found within the reference book, your team will be disqualified immediately.


Teachers technology workshop 2008

The IBM Center for Advanced Studies (CAS) will host its fourth educational workshop for high school computer science teachers in conjunction with the CASCON High School Programming Competition on Wednesday, October 29, 2008.

The goal of the workshops is to provide ideas and share insights on how to make computer science and software development more appealing to students, and to suggest ways to promote interest in these very important fields of study.

Statistics from many universities have shown that there has been a decline in computer science enrolment over the past several years. More and more students are moving away from a career in IT simply because they do not know how exciting a career in technology can be, or because they feel that there will not be jobs for them when they graduate from university. IBM wants to help provide a truer picture of these wonderful opportunities.

This year, the workshop will be focusing on specific tools and skills that teachers can introduce to enrich the classroom experience. More information is available by signing up for e-mail notification. If you have already done so, information regarding the workshops will be sent in September.

This year's workshop topics and presenters are as follows:

Workshop: Images in Java
Presenter: Steve Engels, Lecturer, University of Toronto

Photoshop is not magic, it's just applied computer science. In this workshop, you'll learn how simple it can be to do some fun image manipulations, and how it can be a cool way to teach iteration and 2-dimensional arrays.

More about Steve Engels:
Steve Engels is a Senior Lecturer in Computer Science at the University of Toronto. Steve has a reputation for lively and engaging classes, using props, demonstrations and audience participation to keep his lectures fun and interesting. He has won teaching awards at both the University of Toronto and the University of Waterloo and has also been a contestant in TV Ontario's Best Lecturer Competition.

His students have been sought after by many reputable companies such as Microsoft and EA Games, and have worked with Steve on a video game art exhibit at this year's Nuit Blanche art event.
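The "Images in Java" description above mentions image manipulation as a way to teach iteration and 2-dimensional arrays. The following is a minimal sketch of that idea, not material from the workshop itself. It assumes a grayscale image stored as rows of pixel values from 0 to 255; the class and method names are hypothetical.

```java
// Hypothetical classroom example; not taken from the workshop files.
public class ImageArraySketch {

    /** Returns a mirrored copy of the image: each row is reversed left to right. */
    public static int[][] flipHorizontal(int[][] pixels) {
        int[][] out = new int[pixels.length][];
        for (int row = 0; row < pixels.length; row++) {
            int width = pixels[row].length;
            out[row] = new int[width];
            for (int col = 0; col < width; col++) {
                // The pixel at the right edge moves to the left edge, and so on.
                out[row][col] = pixels[row][width - 1 - col];
            }
        }
        return out;
    }

    /** Returns the negative of the image: 0 (black) becomes 255 (white) and vice versa. */
    public static int[][] invert(int[][] pixels) {
        int[][] out = new int[pixels.length][];
        for (int row = 0; row < pixels.length; row++) {
            out[row] = new int[pixels[row].length];
            for (int col = 0; col < pixels[row].length; col++) {
                out[row][col] = 255 - pixels[row][col];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] image = { { 0, 100, 255 }, { 10, 20, 30 } };
        int[][] mirrored = flipHorizontal(image);
        System.out.println(java.util.Arrays.deepToString(mirrored));
    }
}
```

Both methods return a new array and leave the input untouched, which makes it easy for students to compare the image before and after.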

Workshop: Teaching computer science: a search for fundamentals
Presenter: Troy Vasiga, Lecturer, University of Waterloo

This talk will focus on finding fundamental concepts for what computer science is, and how educators can go about focusing on these fundamental concepts to improve the educational experience for their students.

Workshop: Scratch: Introducing computer science using a fun, dynamic programming environment
Presenters: Deirdre Athaide (IBM), Jennifer Schachter (IBM), Lisa Rubini (Toronto District School Board)

This hands-on workshop introduces and secondary school teachers to the wonders of 'Scratch'. Scratch is a dynamic programming environment developed by the Lifelong Kindergarten Group at M.I.T.'s Media Lab to engage students of all ages in computer programming. It has been successfully used to excite middle school students as part of IBM's EX.I.T.E. camp and as the introduction to programming in a Grade 10 computer science classroom. During the workshop creative techniques for teaching basic computer science concepts will be demonstrated using problem sets and discussion. Participants are encouraged to explore ways to incorporate Scratch into their teaching practice.


This year your Java challenge is "I Have a Code"! "I Have a Code" is a gaming simulator challenge, created by university students here at Programming Contest Central. "I Have a Code" is loosely based on the trials and tribulations of cell colonies in the human body. Code super-cold cells to work in unison and form a colony that will grow, expand, split, specialize, and attack other cell colonies. You will be strategizing and programming in Java to compete for the highest score.

You will be competing in teams of two, one pair from each participating school. The "I Have a Code" Challenge allows direct, real-time competition between teams. After three rounds of a round-robin tournament, the final winner will be determined through a tournament consisting of several rounds and eliminations.
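The round-robin stage described above can be pictured with a small scheduling sketch. This is only an illustration using the standard circle method; how the contest software actually matches colonies is not described here. The class name, the assumption of two teams per match, and the use of null to mark a bye are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the contest's own scheduling code.
public class RoundRobinSketch {

    /**
     * Builds pairings for the requested number of rounds using the circle method.
     * Each pairing is a two-element array; a null entry means that team has a bye.
     * No pairing repeats as long as rounds is less than the number of slots.
     */
    public static List<List<String[]>> schedule(List<String> teams, int rounds) {
        List<String> slots = new ArrayList<>(teams);
        if (slots.size() % 2 != 0) {
            slots.add(null); // odd number of teams: add a bye slot
        }
        int n = slots.size();
        List<List<String[]>> result = new ArrayList<>();
        for (int r = 0; r < rounds; r++) {
            List<String[]> round = new ArrayList<>();
            for (int i = 0; i < n / 2; i++) {
                // Pair the i-th slot from the front with the i-th slot from the back.
                round.add(new String[] { slots.get(i), slots.get(n - 1 - i) });
            }
            result.add(round);
            // Keep the first slot fixed and rotate the rest by one position.
            slots.add(1, slots.remove(n - 1));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> teams = new ArrayList<>();
        teams.add("A");
        teams.add("B");
        teams.add("C");
        teams.add("D");
        for (List<String[]> round : schedule(teams, 3)) {
            for (String[] match : round) {
                System.out.print(match[0] + " vs " + match[1] + "  ");
            }
            System.out.println();
        }
    }
}
```

With an odd number of teams, one team sits out each round under this scheme.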

Details are as follows:

Date: Wednesday, October 29, 2008
Place: Sheraton Parkway Toronto North Hotel and Convention Centre, 600 Highway 7 East (at Highways 404 & 7), Richmond Hill, Ontario
Time: 8:30 am to 4:00 pm
What To Bring: One piece of printed resource material (like a textbook, NO PRINTED OR WRITTEN PAPERS ARE ALLOWED).

Breakfast and lunch will be provided at no cost.

At the end of the day, your team's colony will be matched against the others in a final showdown. Awards will be given to the top 3 teams. If you are interested, talk to your teacher. Starting September 19, teachers can register the school teams online.

As for "I Have a Code", there are a couple of things you will need to do to get it running on your computer.

Instructions

I. Download a Java Runtime Environment (JRE). Version 6.0 or higher is recommended to run "I Have a Code" (if you see 1.6.0 on the "I Have a Code" site, don't worry, it's the same thing).

Get JRE 6.0

II. Download Eclipse. Version 3.4 or higher is recommended. Extract into an accessible folder (like on your desktop).

Get Eclipse IDE

III. Download the latest version of "I Have a Code". Installation instructions are on the site.

Get "I Have a Code"

IV. Read the Explanation. It's quite long, but it is useful and really helps to explain the game, so take advantage of it! (Right-click and choose 'Save Target As...')

Download manual
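If you are not sure which JRE you installed in step I, a small Java program can tell you. The sketch below is only an illustration and is not part of "I Have a Code": the class name JreCheck and the method majorVersion are made up for this example. It reads the standard java.version system property and treats the older "1.6.0" style as version 6, which is why the two numbers in step I mean the same thing.

```java
// Hypothetical helper for illustration only; it is not part of "I Have a Code".
class JreCheck {

    // Returns the major Java version from a java.version string.
    // Older releases report "1.6.0_07" (meaning Java 6); newer ones report "17.0.2".
    // Assumes the string starts with a digit, as java.version normally does.
    static int majorVersion(String version) {
        String[] parts = version.split("[^0-9]+");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            // Legacy "1.x" numbering: the second number is the real major version.
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        int major = majorVersion(version);
        System.out.println("Installed Java version: " + version);
        if (major >= 6) {
            System.out.println("This meets the recommended version 6.0 or higher.");
        } else {
            System.out.println("This is older than the recommended version 6.0.");
        }
    }
}
```

Save it as JreCheck.java, compile and run it from Eclipse or the command line, and compare the printed version with the recommendation in step I.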

Prepare as much as possible beforehand, but be warned: on the day of the contest, you will be able to use one book or other published resource material, but no other programs or written material of any kind will be permitted. Be careful while preparing, though. There is a multi-part scoring system in place (found in the "I Have a Code" manual). When developing your strategy, make sure you take this into account!

If installed properly, a running game should look something like this:

The Start of an "I Have a Code" Game

And as the game progresses...

The Middle of an "I Have a Code" Game

Number of cells, types of cells, and elimination of other cells contribute to your final score. Do what you will, but the colony with the highest score after a full day (approximately a minute in real time) will be declared the winner.

Your next steps: A guide to information about registration.
Competition Itinerary: A schedule of what will happen on the competition day.
Subscribe to e-mail notification: Make sure you register to receive registration information and competition updates!
CASCON Programming Competition 2008: The student's competition page. Pass this link on to any students interested in participating in the competition.
