Masaryk University Faculty of Informatics

Case Management Task Assignment Using OptaPlanner

Master’s Thesis

Bc. Marián Macik

Brno, Fall 2016

Declaration

Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source.

Bc. Marián Macik

Advisor: Mgr. Marek Grác, Ph.D.

Acknowledgement

Here I would like to thank my family, friends and colleagues for their support during the work on this thesis. Moreover, I thank my consultant, Mgr. Ivo Bek, for guiding me during my work and for helping me overcome issues experienced when writing this thesis. I would also like to thank my advisor, Mgr. Marek Grác, Ph.D., for his help regarding the text of the thesis and for his advice in the initial stages of the work. Finally, I would especially like to thank Maciej Swiderski for his help with the configuration of the jBPM engine and Geoffrey De Smet for his help with OptaPlanner.

Abstract

The aim of the thesis is to analyse, design and implement a module for automated task assignment by integrating the jBPM engine and OptaPlanner. The thesis describes case management, its difference from business process management and their notations. After that, the jBPM engine and OptaPlanner are explained. In the second half of the thesis, the actual implementation and prototype application are presented, including the performance tests in different OptaPlanner configurations and scenarios.

Keywords

jBPM, OptaPlanner, Case Management, Task Assigning, Planning

Contents

1 Introduction

2 Resource and Case Management
  2.1 What is Resource and Case Management?
  2.2 Case Management vs. Business Process Management
  2.3 Notations
    2.3.1 CMMN
    2.3.2 BPMN
  2.4 Focus on Case Management Task Assignment

3 jBPM Engine
  3.1 Introduction
  3.2 Implementation of Case Management in jBPM Engine
  3.3 Task Assignment inside Case Management
  3.4 Task Lifecycle
  3.5 Alternatives
    3.5.1 IBM
    3.5.2 Oracle
    3.5.3 Camunda

4 OptaPlanner
  4.1 Introduction
  4.2 Constraints and Score Calculation
    4.2.1 Constraints
    4.2.2 Score Calculation
  4.3 Solver Lifecycle
    4.3.1 Solver Phase
    4.3.2 Steps and Moves
  4.4 Construction Heuristic
  4.5 Local Search
    4.5.1 Hill Climbing (Simple Local Search)
    4.5.2 Tabu Search
    4.5.3 Simulated Annealing
    4.5.4 Late Acceptance
  4.6 Alternatives
    4.6.1 LocalSolver
    4.6.2 IBM ILOG CPLEX Optimization Studio
    4.6.3 Google Optimization Tools

5 Resource Management Module
  5.1 Analysis
    5.1.1 Users
    5.1.2 Tasks
  5.2 Configuration of jBPM Engine and its Domain
    5.2.1 Persistence and Persistence entities
    5.2.2 Inserting and Retrieving Users to/from DB
    5.2.3 Listener for Pushing Tasks from jBPM Engine
  5.3 Configuration of OptaPlanner and its Domain
    5.3.1 Planning Entities, Planning Values and their Relationships
    5.3.2 TaskAssigningSolution
    5.3.3 Score rules

6 Prototype Application
  6.1 Purpose and Used Technologies
  6.2 Implementation
    6.2.1 RootLayout and SolutionController
    6.2.2 CaseOverview and CaseController
    6.2.3 Dialogs and Dialog Controllers
  6.3 Scenarios
    6.3.1 Small Scenario
    6.3.2 Medium Scenario
    6.3.3 Large Scenario
  6.4 User Guide

7 Benchmarks
  7.1 OptaPlanner Benchmark
  7.2 Construction Heuristics Benchmark
  7.3 Benchmark

8 Issues Experienced and Further Improvements
  8.1 Issues Experienced during Implementation
  8.2 Further Improvements

9 Conclusion

A Zip Archive

Bibliography

List of Tables

5.1 TaskPlanningEntity attributes
6.1 Small scenario tasks
6.2 Medium scenario tasks
6.3 Large scenario tasks

List of Figures

2.1 Design-time Phase and Runtime Phase (Source: CMMN Specification [3])
2.2 Case File Item Shape (Source: CMMN Specification [3])
2.3 Collapsed Stage and Expanded Stage Shapes (Source: CMMN Specification [3])
2.4 Task Shape (Source: CMMN Specification [3])
2.5 Timer Event and User Event Listener Shape (Source: CMMN Specification [3])
2.6 Entry and Exit Criterion Shape (Source: CMMN Specification [3])
2.7 Connector between two Tasks (Source: CMMN Specification [3])
2.8 An Example of CMMN Notation (Source: CMMN Presentation [15])
2.9 Gateway Shape (Source: BPMN Specification [12])
2.10 Sequence Flow (Source: BPMN Specification [12])
2.11 An Example of BPMN Notation (Source: Author)
4.1 Scope Overview (Source: OptaPlanner Documentation [30])
4.2 An Example of OptaPlanner Phase Sequence (Source: OptaPlanner Documentation [30])
4.3 Move (Source: OptaPlanner Documentation [30])
4.4 Steps (Source: OptaPlanner Documentation [30])
4.5 Hill Climbing (Source: OptaPlanner Documentation [30])
4.6 Tabu Search (Source: OptaPlanner Documentation [30])
4.7 Simulated Annealing (Source: OptaPlanner Documentation [30])
4.8 Late Acceptance (Source: OptaPlanner Documentation [30])
5.1 Persistence Entities (Source: Author)
5.2 Users JSON (Source: Author)
5.3 OptaPlanner Relationships (Source: Author)
5.4 Examples of Chains (Source: OptaPlanner Documentation [30])
5.5 Role Requirements Score Rule (Source: Author)
5.6 High Priority Score Rule (Source: Author)
5.7 Minimize Makespan Score Rule (Source: Author)
6.1 Prototype Application (Source: Author)
7.1 Results of Construction Heuristic Benchmarks (Source: Author)
7.2 Results of Metaheuristics Benchmarks (Source: Author)
7.3 Score Progress During Time (Source: Author)

1 Introduction

Our contemporary society is constantly improving and developing. Work is increasingly automated, and its amount is rapidly increasing too. A larger amount of work requires better management and organization, better and faster planning, and quick and efficient handling of unexpected situations. In the past, usually only a small number of employees from high management managed and planned the work for all employees in the organization. However, with today's amounts of work, it is very difficult for a couple of persons to manage, organize and plan it. This often means that planning decisions need to be made by everybody, which takes a lot of time and is therefore inefficient. Because of that, efficient planning and organization, which lead to the maximal utilization of employees' potential, are becoming more and more valued. With the help of communication and information technologies, this efficient planning is becoming very powerful, cost-efficient and easier to handle.

Therefore, this thesis tries to find such a solution by integrating a system which is used daily in practice with a system which is designed to optimize not only the issues mentioned but other issues as well. The aim of this thesis is to analyze, design and implement a module for automated task assignment using the resource planning tool OptaPlanner. This implementation should then be visualized by a prototype application so that a user may easily see differences between different heuristics and strategies. Once done, a performance comparison in different scenarios should be made.

This thesis acts as research focused on the possibilities of integrating case management in the jBPM engine [1] with OptaPlanner [2], since there is a high probability that these two tools will be integrated officially by the community. Therefore, this thesis may serve as a starting point, which might be enhanced, extended and then widely used.

An initial analysis is done in the second chapter. Since case management is still quite new (the initial specification is from 2014 [3]), the chapter compares case management and business process management, lists the main differences and provides a quick look at their particular notations. Finally, it explains the reason why case management was primarily chosen for the thesis.

The third chapter presents the jBPM engine, explains why it was chosen and explains some of its specifics, e.g. the task lifecycle. In the end, the engine is compared with alternative software suites which address the same issues.

OptaPlanner is explained in the fourth chapter, which describes the mechanisms OptaPlanner uses and the problems it may be used for. As in the third chapter, alternatives are presented too.

The fifth chapter consists of the analysis, design and implementation of the resource management module. The module's domain model is presented together with the configuration of both tools, including the scoring rules of OptaPlanner.

The sixth chapter briefly describes the prototype (demo) application, the technologies used and its implementation. Furthermore, example data in the form of scenarios is explained. In the end, the user guide is presented with the steps to start the application.

The seventh chapter provides benchmarks with different OptaPlanner configurations. At first, it briefly describes the OptaPlanner benchmark, its features and configuration. After that, it separately analyses construction heuristics and metaheuristics (local searches) and lists benchmark tables and graphs.

The last part lists issues experienced during the implementation and suggests possible enhancements to the application and the module itself. Finally, the outcome and possible impact of the thesis are presented.

2 Resource and Case Management

2.1 What is Resource and Case Management?

To better understand the content of the thesis, it is necessary to explain what Resource and Case Management (RCM) exactly is and what its purpose is. First, it needs to be pointed out that the term “resource” is usually omitted, because Resource Management is frequently perceived as a necessary part of Case Management. Therefore, the term Case Management (CM) is used in this thesis in the meaning of both Resource and Case Management.

According to the Case Management Society of America [4] and the Commission for Case Manager Certification [5], case management is defined as “a collaborative process that assesses, plans, implements, coordinates, monitors, and evaluates the options and services required to meet the client’s health and human service needs. It is characterized by advocacy, communication, and resource management and promotes quality and cost-effective interventions and outcomes.” As the definition suggests, it can be used in a wide range of fields where various tasks are completed in a collaborative way. However, it is nowadays primarily applied in the following areas:

∙ Health system (Medical Case Management)

∙ Law (Legal Case Management)

∙ Employment

∙ Social work

Medical Case Management is used to assure that the recommended and most appropriate medical care is provided to disabled, ill or injured individuals. For example, it makes it easier to track the visits of one particular patient to different doctors in the case of a more complex medical treatment. It might also be helpful when a patient needs to visit special departments of a hospital, such as X-ray or magnetic resonance imaging, to obtain individual images of an injury. In such cases these images can be directly attached to the patient’s case file in electronic form, so doctors can see them almost immediately. Case Management can also prevent these images from being exposed to the public when the case is properly secured and only authorized doctors are allowed to access these assets. Moreover, these practices also help to lower the total costs of such treatment.

Legal Case Management is widely adopted by law firms and courts to manage the lifecycle of a legal case more effectively. As law firms compete for clients on the market, they try to deliver services at the lowest possible costs and with the highest possible efficiency. Therefore, they often use Case Management for sharing documents obtained during research or data mining, for securing these documents and for keeping an accessible archive of legal cases for the future. With the help of CM, firms can easily browse completed cases and reuse the same practices in similar cases. They can even avoid mistakes made in the past, which optimizes future work on cases. Finally, when lawyers work in a team, CM helps them improve the routine processes used during their work, so it enhances the productivity of the whole team.

Apart from the health system and law, new applications are emerging nowadays, mostly concentrated in employment and social work. In these fields the individual approach toward every client is crucial, so CM is a natural fit. Moreover, these fields are document-oriented as well, so the organization of all documents which form the core of each client’s case is another reason why one might want to use CM.

2.2 Case Management vs. Business Process Management

Case Management (CM) and Business Process Management (BPM) are two distinct approaches which are often mentioned together when one talks about management practices; sometimes people even tend to use these terms interchangeably. However, each of these practices is suitable for quite different scenarios. Therefore, a clear distinction between CM and BPM has to be made [6].

The main difference between these two techniques is in the type of processes (inside a company or an organization) which they are used for. Hence, in order to properly determine which technique is the best for a particular kind of work, it is necessary to know as much information about the organization and this work as possible. When there is enough information, the decision on which technique to choose is quite simple in most cases.

Case Management is an approach which is used for tasks inside an organization which are unique or rarely repeated. This kind of work is called a case in CM. A case can even be viewed as a situation which does not necessarily require a particular flow of activities. CM is often used in situations which are hardly predictable or unclear, so that the individual activities or tasks cannot be described in advance. Therefore, CM provides tools for managing milestones and goals and tools for organizing all information of a case, such as documents. Every tool in CM is used on demand, so there is no point in optimizing a whole flow of activities. In fact, such an optimization might cause more harm than good: it consumes many resources, and because the optimized flow changes quickly, the result does not outweigh the overall cost of the optimization. In conclusion, CM should be used when there is no flow for the situation we try to describe, or when the flow changes so fast that a previous optimization would be useless.

On the other hand, there is Business Process Management, which is defined by The Workflow Management Coalition [7], BPM.com [8] and many other sources [9] as “a discipline involving any combination of modelling, automation, execution, control, measurement and optimization of business activity flows, in support of enterprise goals, spanning systems, employees, customers and partners within and beyond the enterprise boundaries.” As this definition denotes, BPM is mostly used in situations where a business activity flow is present. Within BPM, these flows are called business processes (see 2.3.2). Because there is always a particular flow present, it is predictable and repeatable many times.
In such a case, the process can be monitored and optimized constantly throughout its lifetime. Therefore, the lifetime of the process should be long enough for the optimization to pay off. Because of that, business processes are mostly described and used in mass production environments where many process instances are expected to be run. In these environments even a small optimization can have a huge effect on a company or an organization. As said before, after the optimization the process instances can be constantly monitored to provide additional data as feedback. This data might help in further optimization or can reveal potential problems in the process. To sum up, BPM should be used for modelling, monitoring and optimization of business processes which are rather strict, predictable and repeatable many times.

These two approaches are sometimes considered to be competing. But as said before, each of them is suitable for different scenarios, and they should in fact be considered complementary. In practice, it is common that a business process calls a case and vice versa. Some workflow engines even use the same notation, i.e. BPMN, for both CM and BPM. This unified representation may provide benefits in the design and the speed of the engine.

2.3 Notations

The differences between Case Management (CM) and Business Process Management (BPM) are even more visible in the notation each of them uses as a standard. Both specifications are now maintained by the OMG (Object Management Group) [10] consortium, whose members [11] include, for example, Hewlett-Packard, IBM, Microsoft and Oracle. CM uses CMMN (Case Management Model and Notation) [3] as its default notation. On the other hand, BPM uses BPMN (Business Process Model and Notation) [12]. While the default specification for CM is CMMN, it is not uncommon to see Case Management tools which work with BPMN notation, e.g. jBPM. Despite that, many CM principles can be used in the same way as if CMMN were used.

2.3.1 CMMN

CMMN notation is quite new. Its first version, 1.0, was released in May 2014, while version 1.1 was released in December 2016 [3]. Because presenting the whole specification is beyond the scope of this thesis, only the basic notation elements are presented and described. Further information is available in the documentation.

Case

A Case is the main part of CM. It represents the organization’s work or activity we want to describe and is mostly unpredictable, as said before. However, there are many other definitions of the term case, depending on the field it is used in. For example, in the health system, the case is defined as “an instance of disease with its attendant circumstances” [13]. This definition suggests that even though there is one concrete disease, its own instance behaves uniquely for each patient. Therefore, each instance needs individual treatment, so the case representation is appropriate in this situation.

Another example of the definition comes from the legal system. It defines the case as “a general term for any action, cause of action, lawsuit, or controversy. All the evidence and testimony compiled and organized by one party in a lawsuit to prove that party’s version of the controversy at a trial in court” [14]. This definition actually makes two points. On one hand, it describes the case in an abstract form as any action or lawsuit that might happen. On the other hand, it specifies that the case might have a physical representation in the form of all documents or items which were collected to serve as evidence in a lawsuit.

Finally, the CMMN specification describes it as “a proceeding that involves actions taken regarding a subject in a particular situation to achieve a desired outcome. ...The subject of a Case may be a person, a legal action, a business transaction, or some other focal point around which actions are taken to achieve an objective” [3].

A Case has two phases, the design-time phase and the runtime phase. In the design-time phase, business analysts model the case by defining tasks that are always part of the case model and tasks that are discretionary. Discretionary tasks are then available to a case worker to be applied if needed. In the runtime phase, the case worker runs the plan by executing predefined tasks. However, as the plan may continuously change throughout the execution, discretionary tasks may be added and executed at runtime to reflect these changes. Figure 2.1 describes these two phases.
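The split between the two phases can be illustrated with a minimal sketch. It is not taken from the CMMN specification or from any engine; the class names `CaseModel` and `CaseInstance`, the method `add_discretionary` and the task names are hypothetical and serve only to show that the design-time model offers discretionary tasks, while the runtime plan starts with the required tasks and may be extended by the case worker.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CaseModel:
    """Design-time artifact (hypothetical): what the business analyst defines."""
    required_tasks: List[str]
    discretionary_tasks: List[str]


@dataclass
class CaseInstance:
    """Runtime artifact (hypothetical): the plan a case worker executes and adapts."""
    model: CaseModel
    plan: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The initial plan contains only the tasks that are always part of the case.
        self.plan = list(self.model.required_tasks)

    def add_discretionary(self, task: str) -> None:
        # Only tasks offered by the design-time model may be planned at runtime.
        if task not in self.model.discretionary_tasks:
            raise ValueError(f"{task!r} is not a discretionary task of this model")
        if task not in self.plan:
            self.plan.append(task)


# Illustrative task names only.
model = CaseModel(
    required_tasks=["Examine Patient"],
    discretionary_tasks=["Prescribe Medication"],
)
case = CaseInstance(model)
case.add_discretionary("Prescribe Medication")
```

After the call, `case.plan` holds both the required task and the discretionary task that the case worker decided to add, while the model itself is unchanged.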

Case File and Case File Item

A Case File Item represents any information, structured or unstruc- tured, which is used in a Case. It can be an XML document, a folder or a document in a Content Management System (CMS), or even a whole folder hierarchy [3].


Figure 2.1: Design-time Phase and Runtime Phase Source: CMMN Specification [3]

Figure 2.2: Case File Item Shape Source: CMMN Specification [3]

All Case File Items of a particular Case together form a Case File. Each Case is linked with exactly one Case File. Therefore, all the information of the Case is represented by such a Case File. The Case File Items can then be used in various expressions or as inputs and outputs of Tasks in the Case Plan Model. For instance, a typical Case File Item is a patient’s medical record. The shape of the Case File Item is presented in figure 2.2.
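The relationship between a Case, its single Case File and the Case File Items can be sketched as a small data structure. This is only an illustration under assumed names (`CaseFile`, `Case`, `examine_patient` and the item keys are hypothetical and not part of CMMN or any engine API); it shows one Case File per Case and a task that uses one item as input and produces another as output.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class CaseFile:
    """Holds all Case File Items of exactly one Case, keyed by name (hypothetical)."""
    items: Dict[str, Any] = field(default_factory=dict)

    def put(self, name: str, content: Any) -> None:
        self.items[name] = content

    def get(self, name: str) -> Any:
        return self.items[name]


@dataclass
class Case:
    name: str
    # Each Case owns exactly one Case File.
    case_file: CaseFile = field(default_factory=CaseFile)


def examine_patient(case_file: CaseFile) -> None:
    """Illustrative task: reads one Case File Item and writes another."""
    record = case_file.get("medical_record")
    case_file.put("diagnosis", f"examined: {record['patient']}")


treatment = Case("Fracture Treatment")
treatment.case_file.put("medical_record", {"patient": "John Doe"})
examine_patient(treatment.case_file)
```

Items may be structured (a dictionary, as here) or unstructured (for example a reference to a scanned document); the sketch does not restrict the content type.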

Plan Items

Plan Items are the main elements which represent the initial plan of the Case, and they support the evolution of this plan via runtime planning by case workers. There are four basic types of Plan Items:

∙ Plan Fragments and Stages

∙ Tasks

∙ Milestones


∙ Event Listeners

Plan Fragments and Stages

A Plan Fragment can be described as a simple container for Plan Items which usually depend on each other and thus represent a pattern in the Case. These dependencies are defined as Sentries (see 2.3.1). A Plan Fragment cannot be tracked, as it does not have its own representation at runtime, so there is no concept of lifecycle tracking for it in the Case instance context.

A Stage is a specialization of a Plan Fragment which has its own runtime representation and its own lifecycle and therefore can be tracked. The Stage simply represents a building block for a Case. Each Stage represents a Case episode, a subset of activities which can be grouped together. It can be perceived as a subcase too. An example of such a Stage is a patient’s rehabilitation after treatment, e.g. a surgery. The CMMN notation for a Stage is displayed in figure 2.3.

Moreover, Stages as well as Plan Fragments can be nested, i.e. a Stage can be put inside another Stage, thus generating a hierarchy of Stages. The Case Plan Model can be perceived as the outermost Stage of the whole Case. A Stage might also be discretionary, which means it may not be executed in every Case instance. This is one of the constructs which allow CM to model unpredictable situations.

Figure 2.3: Collapsed Stage and Expanded Stage Shapes Source: CMMN Specification [3]


Tasks

A Task is an atomic unit of work. Its representation in CMMN notation is shown in figure 2.4. The Task is a base class for all Tasks in CMMN. Every Task can be blocking or non-blocking. If it is blocking, the Task always waits until the work associated with it is completed; on the other hand, if it is non-blocking, the Task does not wait and completes immediately after instantiation. There are three main types of Tasks in CMMN:

∙ Human Task

∙ Process Task

∙ Case Task

Figure 2.4: Task Shape Source: CMMN Specification [3]

Human Task

A Human Task is a Task which is performed by a human, usually a case worker. When a Human Task is non-blocking, it is considered a manual Task, thus the CM system does not track its lifecycle. An example of a blocking Human Task is a patient’s initial examination performed by a doctor.

Process Task

A Process Task is a Task which is used to call a Business Process in the Case. When a Process Task is blocking, it waits until the called Business Process is completed. However, if it is non-blocking, the Process Task does not wait for the Business Process to complete, and the Task is completed immediately after its instantiation and the calling of this Business Process.


Case Task

A Case Task is used to call another Case in the CM system. This Task creates a new instance of the specified Case. The main difference between the Case Task and a Stage is that the Case Task has its own context, i.e. its own Case File, whereas the Stage shares the same context with the outer Stage, which is the Case Plan Model in the outermost layer. As with a Process Task, a blocking Case Task waits until the associated Case is completed, while a non-blocking Case Task is completed right after its instantiation and the invocation of the associated Case.
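The blocking and non-blocking behaviour described for all three Task types can be summarised in a short sketch. It assumes a deliberately simplified lifecycle with only two states, and the names `Task`, `TaskState`, `start` and `complete_work` are hypothetical; the real CMMN lifecycle has more states and transitions.

```python
from enum import Enum
from typing import Optional


class TaskState(Enum):
    ACTIVE = "active"
    COMPLETED = "completed"


class Task:
    """Simplified, hypothetical model of a Task with a blocking flag."""

    def __init__(self, name: str, is_blocking: bool = True) -> None:
        self.name = name
        self.is_blocking = is_blocking
        self.state: Optional[TaskState] = None  # not yet instantiated

    def start(self) -> None:
        # A blocking Task stays active until its work is reported as done;
        # a non-blocking Task completes immediately after instantiation.
        self.state = TaskState.ACTIVE if self.is_blocking else TaskState.COMPLETED

    def complete_work(self) -> None:
        # Only an active (i.e. blocking and started) Task can be completed by work.
        if self.state is TaskState.ACTIVE:
            self.state = TaskState.COMPLETED


examination = Task("Initial Examination", is_blocking=True)
examination.start()
state_while_waiting = examination.state  # still ACTIVE, waiting for the doctor
examination.complete_work()

notification = Task("Call Billing Process", is_blocking=False)
notification.start()  # completed right away, the called work is not awaited
```

The same flag applies whether the Task stands for human work, a called Business Process or a called Case; only the thing being waited for differs.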

Milestones

A Milestone can be described as an achievable target in the lifecycle of a Case. It is often used for evaluating the progress of a Case. No work is directly associated with the Milestone. Instead, the completion of a set of tasks or the availability of some information is typically the prerequisite for achieving the Milestone. An example of a Milestone is a patient’s full recovery, which is considered a successful treatment.

Event Listeners

In CMMN, an Event is considered to be something which happens during the Case progression. Events trigger the activation, enabling or termination of Tasks, Stages or even Milestones. In CMMN, anything that happens to information in the Case File is denoted by transitions in the lifecycle of Case File Items, while anything that happens to Tasks, Stages or Milestones is denoted by transitions in their lifecycles.

However, some Events cannot be modelled via these “standard events”, e.g. an elapse of time or user events. If they were modelled as “standard events”, they would have to be captured through an impact on information in the Case File or through transitions in Stages, Tasks or Milestones, which is a very indirect method.

To solve this problem, CMMN introduces the concept of Event Listeners, which have their own lifecycle, so every elapse of time and every user event can be captured as an event with its own transitions in its own lifecycle. This enables CMMN to handle all kinds of Events uniformly. These Events are then handled via Sentries (see 2.3.1).

There are two types of Event Listeners: the Timer Event Listener and the User Event Listener. The Timer Event Listener is used to catch a predefined elapse of time, while the User Event Listener is used to catch events that are raised by a user. The User Event Listener enables the user to interact with the Case directly, instead of interacting indirectly by changing information in the Case File. Figure 2.5 shows these Event Listeners in CMMN notation.

Figure 2.5: Timer Event and User Event Listener Shape Source: CMMN Specification [3]

Sentry

A Sentry observes important situations, i.e. Events, which might occur during a Case and influence its progress. Each Sentry is a combination of an event and/or a condition. When the specified event is observed, the condition is applied over the Case File to evaluate the effect of the event. A Sentry which is satisfied triggers the Plan Item that refers to it.

Every Plan Item can refer to a Sentry as an Entry Criterion or an Exit Criterion. When a Plan Item refers to a Sentry as an Entry Criterion and this Sentry is satisfied, the Plan Item is enabled in the case of a Task or a Stage, or achieved in the case of a Milestone. With an Exit Criterion, the Plan Item, in this case a Task or a Stage, is terminated. An Exit Criterion is not applicable to a Milestone, since a Milestone is viewed only as a target which is met, thus its completion is instant. The criteria are presented in figure 2.6.
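The event-plus-condition semantics of a Sentry and its two roles can be sketched as follows. This is an illustrative simplification, not the CMMN execution semantics: the classes `Sentry` and `PlanItem`, the string states and the event names are assumptions made for the example, and the Case File is reduced to a plain dictionary.

```python
from typing import Callable, Optional


class Sentry:
    """Hypothetical Sentry: an optional event combined with an optional condition."""

    def __init__(self, on_event: Optional[str] = None,
                 condition: Optional[Callable[[dict], bool]] = None) -> None:
        self.on_event = on_event
        self.condition = condition

    def is_satisfied(self, event: str, case_file: dict) -> bool:
        # If an event is specified, it must be the one that was observed.
        if self.on_event is not None and event != self.on_event:
            return False
        # If a condition is specified, it is evaluated over the Case File.
        return self.condition is None or self.condition(case_file)


class PlanItem:
    """Hypothetical Plan Item referring to Sentries as Entry/Exit Criteria."""

    def __init__(self, name: str, kind: str,
                 entry_criterion: Optional[Sentry] = None,
                 exit_criterion: Optional[Sentry] = None) -> None:
        if kind == "milestone" and exit_criterion is not None:
            raise ValueError("an Exit Criterion is not applicable to a Milestone")
        self.name, self.kind = name, kind
        self.entry_criterion, self.exit_criterion = entry_criterion, exit_criterion
        self.state = "available"

    def notify(self, event: str, case_file: dict) -> None:
        entry, exit_ = self.entry_criterion, self.exit_criterion
        if self.state == "available" and entry and entry.is_satisfied(event, case_file):
            # Tasks and Stages become enabled, Milestones are achieved at once.
            self.state = "achieved" if self.kind == "milestone" else "enabled"
        elif self.state == "enabled" and exit_ and exit_.is_satisfied(event, case_file):
            self.state = "terminated"


xray = PlanItem(
    "Visit X-ray", "task",
    entry_criterion=Sentry("examination_completed", lambda cf: cf["injury"] == "serious"),
    exit_criterion=Sentry("patient_healed"),
)
xray.notify("examination_completed", {"injury": "serious"})
state_after_entry = xray.state
xray.notify("patient_healed", {"injury": "serious"})
```

With a Case File describing a minor injury, the same event would leave the task in the `available` state, because the condition part of the Entry Criterion is not met.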

Figure 2.6: Entry and Exit Criterion Shape Source: CMMN Specification [3]

Connectors

Connectors serve as dependencies between elements which are shown in expanded stages or plan fragments. The shape of a connector is a dotted line, which must not have arrowheads. This requirement might be understood as a way of emphasizing that connectors are not considered a flow but, as said before, a dependency.

Figure 2.7: Connector between two Tasks Source: CMMN Specification [3]

Example of a Case in CMMN

As the last part of the CMMN notation section, an example of a case model is presented in figure 2.8. The case model represents the treatment of a patient’s fracture. Since this case is from the health system, the patient’s file acts as the main case file item. The case starts with the patient’s examination, where the doctor decides on further actions. If the injury is minor, only a sling is prescribed and the doctor waits for the patient’s full recovery. In the case of a more serious injury, the patient has to undergo an X-ray. After that, the complex treatment begins, as depicted by the stage called Treatment. When the treatment is over, the rehabilitation of the patient is started (stage Rehabilitation). Moreover, during the whole case, the doctor is able to prescribe a medication (discretionary task Prescribe Medication) for the patient.

The whole case may terminate in two ways. First, the patient has fully recovered from the injury, thus the milestone is achieved and the case terminates by the exit criterion. Secondly, although this is rather for illustration purposes, the patient may be healed miraculously without any treatment during the case progress. In this case, the event listener is triggered and the whole case terminates with the second exit criterion. In reality, this situation is not that common for fractures, but it may be useful for diseases which are not well documented yet. Therefore, this second way of termination might provide helpful feedback for the doctors too, especially in research.

Figure 2.8: An Example of CMMN Notation Source: CMMN Presentation [15]

2.3.2 BPMN

BPMN notation is much older than CMMN, considering that its first version was released in 2004. Since then, it has been revised constantly, and the current major version is 2.0, with the latest minor version from December 2013 listed as 2.0.2 [12]. Because of that, this specification is even more complex than CMMN and, as before, only the basic notation is presented. More information may be found in the BPMN specification document.

Process

A process, often called a business process, is the main part of BPM. It represents a particular organization’s flow of work which is usually predictable and repeatable many times, as said before. However, the very first description of a process was given by the economist Adam Smith in 1776, when he presented his ideas of labour division. Since then, many other definitions of a business process have been presented, but the basic idea behind them is similar.

For instance, Thomas H. Davenport [16], an American academic and author specializing in business process innovation, describes a process as “a structured, measured set of activities designed to produce a specific output for a particular customer or market.” Moreover, James A. Champy and Michael M. Hammer [17], known for their work in the field of business process reengineering, define a process as “a collection of activities that takes one or more kinds of input and creates an output that is of value to the customer. A business process has a goal and is affected by events occurring in the external world or in other processes.” Finally, the BPMN specification itself describes the process as “a sequence or flow of activities in an organization with the objective of carrying out work” [12].

All these definitions emphasize that the goal or objective of each process is to produce some output which is valuable for the organization, a customer or a market. This output is usually considered a product or a service which is delivered to a customer or a market.

Business processes might be very large in the number of activities and inputs, since they might model the work of big organizations. Therefore, to make a particular business process easier to understand, BPMN defines a graphical representation of it. Every process in BPMN is presented as a graph of flow elements. The basic flow elements (activities, gateways and sequence flows) are briefly presented. Other, more comprehensive elements, such as events, are documented in the specification.


Activities

An activity is a piece of work which is done during the business process. There are three main types of activities:

∙ Task

∙ Sub-Process

∙ Call Activity

Task

As in CMMN, a task is an atomic unit of work, and its shape in BPMN notation is the same as in CMMN (see figure 2.4). It is used when the work in the process cannot be broken down any further. There are many types of tasks, which differ in the kind of work they handle. In this section, three different tasks are briefly described:

∙ Service Task

∙ Human Task

∙ Script Task

A service task is used in situations when some sort of service needs to be called. In the real world, it might be a REST service, a Web service or any other automated application, for example one which operates industrial machines.

A human task is useful in situations when a person has to perform work with the help of some software. This work is usually tracked or logged in this software too. An example is an order confirmation in ordering software. This is different from a manual task, which is used when a person does the work manually, e.g. loads a truck with goods. In this case, the person does not use any software, therefore the manual task should be used.

A script task contains a script in a programming language which the BPM engine understands. When such a task is ready to be started, the BPM engine executes this script. After the execution, the task is completed.


Sub-Process and Call Activity

A sub-process is basically a process which is modelled, displayed and executed inside a parent process. It is often used to make a BPMN diagram clearer while executing the nested process within the same context as the parent process. A call activity, on the other hand, represents a point in a process where a global process or a global task can be called. In this case, control is transferred to the called process or task. More information about these complex activities is available in the specification [12].

Gateways

Gateways are used to control the flow of a business process. They allow a business process to diverge and converge throughout its execution. They can model various behaviours, including branching, forking, merging and joining. In case of a divergence, the work of the business process might be executed in multiple branches in parallel or exclusively in one branch. In case of a convergence, the execution of the process either waits for the other branches to finish or continues immediately. The gateway shape is presented in figure 2.9.

Figure 2.9: Gateway Shape Source: BPMN Specification [12]

Sequence Flow

A sequence flow determines the order of the other flow elements used in a process. Unlike the connectors used in CMMN, a sequence flow must have an arrow, since the ordering of the elements is stated explicitly in BPMN (see figure 2.10). A sequence flow can contain an expression too, which is used as a condition for a gateway.


Figure 2.10: Sequence Flow Source: BPMN Specification [12]

Example of a Business Process in BPMN

In the last part of the BPMN section, a simple business process example is presented in figure 2.11. It is a BPMN model from the financial sector and it represents a loan application. This is a typical use case for BPM, as the loan application process is known in advance and needs to be handled uniformly each time an application is received by a bank.

The business process starts when a client applies for a loan. This is modelled as a human task because usually some sort of form has to be filled out. The most important field in this form is the amount of money which the client wants to borrow. After the submission of the application form, the task is completed and further actions are taken. First, the business process engine, which executes the process, checks whether the amount is more than 1000. If it is not, the upper branch of the gateway is activated. In this example, a service task is executed and an automatic email confirmation is sent to the client. Thus, the loan application is automatically approved and the process ends.

In the second case, when the amount is more than 1000, a manual approval by a bank employee is needed. Therefore, another human task is executed, during which the bank employee decides whether the client is eligible for a loan of that amount. If the client is eligible, a confirmation email with further details is sent. Notice that this email confirmation is now represented as a human task because it is written by the employee, since loans of higher amounts are handled individually with each client. Finally, if the client is not eligible, a similar rejection email is sent.
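The two gateway decisions of this example can be sketched in a few lines of plain Java. This is only an illustration of the branching logic described above, not jBPM code: the class name, the enum constants and the method names are hypothetical, and only the limit of 1000 comes from the example.

```java
// Hypothetical sketch of the gateway logic in the loan example; not jBPM API.
final class LoanRoutingSketch {

    enum NextStep {
        AUTOMATIC_CONFIRMATION_EMAIL, // service task
        MANUAL_APPROVAL,              // human task done by a bank employee
        MANUAL_CONFIRMATION_EMAIL,    // human task
        MANUAL_REJECTION_EMAIL        // human task
    }

    // Limit taken from the example: amounts above it need manual approval.
    static final int AUTO_APPROVAL_LIMIT = 1000;

    /** First gateway, evaluated after the application form is submitted. */
    static NextStep afterApplication(int amount) {
        return amount > AUTO_APPROVAL_LIMIT
                ? NextStep.MANUAL_APPROVAL
                : NextStep.AUTOMATIC_CONFIRMATION_EMAIL;
    }

    /** Second gateway, evaluated after the manual approval task is completed. */
    static NextStep afterManualApproval(boolean eligible) {
        return eligible
                ? NextStep.MANUAL_CONFIRMATION_EMAIL
                : NextStep.MANUAL_REJECTION_EMAIL;
    }

    public static void main(String[] args) {
        System.out.println(afterApplication(500));
        System.out.println(afterApplication(5000));
        System.out.println(afterManualApproval(false));
    }
}
```

In a real BPMN model, these conditions would be attached as expressions to the sequence flows leaving each gateway rather than written as methods.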

2.4 Focus on Case Management Task Assignment

Even though both BPM and CM use a concept of tasks and their assignments to appropriate people, only CM task assignment has


Figure 2.11: An Example of BPMN Notation Source: Author

been chosen as the scope of this thesis. There are two reasons for this, both connected with the ways these techniques are used. First, since CM is usually used in situations where hardly any information about the flow is available upfront, automatic task assignment makes the most sense in such ad-hoc environments, where decisions have to be made quickly at runtime because of the unpredictable nature of cases. A business process in BPM, on the other hand, is much more predictable and focused on automation, so the number of human tasks and resources, i.e. people, is almost always known before the process execution. Because this information is available in advance, ad-hoc decisions are minimized and automatic assignment is not needed as much as in CM. Second, many BPM applications nowadays have tools that provide simulations of business processes, since the data needed for the simulations are known at design time. These simulations are then used to optimize the whole process, including the task assignment. However, since CM is missing the flow data at design time, there are no simulations of the cases, thus another tool for optimization is

needed. For these reasons, the thesis focuses on task assignment inside CM.

3 jBPM Engine

3.1 Introduction

While the BPMN and CMMN specifications serve as contracts for BPM and CM respectively, there are many implementations of these specifications nowadays. Each implementation is usually presented as a software suite with many supporting tools and additional services, which try to persuade a potential user to choose it over the others. Further criteria such as performance or integration capabilities are often considered as well.

Probably the most crucial part of every BPM or CM software suite is its engine. It is the core of the suite and is primarily used for executing and monitoring business processes or cases and the activities within them. This is also the point where task assignment takes place. Therefore, this thesis focuses on the core engine and on working with it.

Of the many implementations on the market, jBPM engine [1] has been chosen for this thesis. One of the reasons for this decision was that, together with the optimization tool OptaPlanner [2], which is also used in the thesis, it is a part of KIE Group [18]. This group consists of “open source projects for business systems automation and management” [18] and their development is sponsored by Red Hat, Inc. Because both projects belong to this group, their integration is much easier.

jBPM engine itself is a part of jBPM, which is a BPM software suite. Although it is primarily a BPM suite, it has also offered CM capabilities via the Case Management API since version 6.3. jBPM engine is both lightweight and extensible and is written purely in Java. It executes processes and cases using the BPMN 2.0 specification. Other notable features are support for JPA persistence and JTA transactions, and history logging to monitor the execution. The current version of the whole suite, together with the engine, is 6.5, while the latest unstable version is 7.0.0-SNAPSHOT. jBPM itself is provided for free and without commercial support, but paid support is available from Red Hat as a part of Red Hat JBoss BPM Suite [19].

3.2 Implementation of Case Management in jBPM Engine

Since jBPM engine does not support the CMMN specification, CM has to be implemented using BPMN 2.0, which the engine does support. Because the engine is very flexible, this is not a problem. Two parts of the BPMN specification make CM possible: ad-hoc processes and extension elements. Both are already specified in BPMN, so their usage is straightforward.

Ad-hoc processes are business processes which do not have a fixed sequence of activities. This means that the order of the activities is either unspecified or unknown. Therefore, they are used to mirror cases inside BPMN. An ad-hoc process consists of activities predefined at design time, which can then be executed in any order and any number of times. In addition, the jBPM engine Case Management API is able to create ad-hoc tasks dynamically at runtime, so no definition of the task is needed at design time. This capability makes CM even more flexible. Theoretically, one can start an empty case and create tasks later at runtime, when there is enough information about them. At the same time, since jBPM engine supports BPM too, a part of the case might have a defined flow while the other part contains only task definitions which are executed just in unexpected situations during the execution. Ultimately, one can even call a business process from a case or vice versa. And since everything is modelled using BPMN notation, the whole abstraction is still very clear for the user.

Then there are extension elements, which are a part of the BPMN specification and provide extensibility of its metamodel. A metamodel extended by these elements is still BPMN compliant. In the CM implementation of jBPM engine, these elements are used to define the case roles used at runtime. Apart from the roles themselves, the maximum cardinality of each role might be defined too. It defines the maximum number of users (case workers) who might be assigned to this role, and its function is to keep the number of role assignments within the limit defined by the user.
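The effect of the maximum cardinality can be illustrated with a small sketch. The class below is purely hypothetical and does not correspond to any class of the jBPM Case Management API; it only shows the rule that a role refuses further assignments once its limit is reached.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of a case role with a maximum cardinality; not jBPM API.
final class CaseRoleSketch {
    private final String name;
    private final int maxCardinality;
    private final Set<String> assignees = new LinkedHashSet<>();

    CaseRoleSketch(String name, int maxCardinality) {
        if (maxCardinality < 1) {
            throw new IllegalArgumentException("maxCardinality must be at least 1");
        }
        this.name = name;
        this.maxCardinality = maxCardinality;
    }

    /** Returns true if the user holds the role afterwards, false if the limit blocks it. */
    boolean assign(String user) {
        if (assignees.contains(user)) {
            return true; // already assigned, nothing changes
        }
        if (assignees.size() >= maxCardinality) {
            return false; // the role is full
        }
        assignees.add(user);
        return true;
    }

    Set<String> assignees() {
        return Collections.unmodifiableSet(assignees);
    }

    String name() {
        return name;
    }

    public static void main(String[] args) {
        CaseRoleSketch manager = new CaseRoleSketch("manager", 1);
        System.out.println(manager.assign("alice"));
        System.out.println(manager.assign("bob"));
        System.out.println(manager.name() + " -> " + manager.assignees());
    }
}
```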


To sum up, implementing both BPM and CM techniques in one engine and with one notation provides significant benefits. The first benefit is that a case and a business process can interact with each other. The second benefit is high performance, because everything is executed in one engine. Finally, because both techniques use the same notation, it is very easy to become familiar with the case or the business process which is being executed.

3.3 Task Assignment inside Case Management

Since this thesis focuses on task assignment, an explanation of how task assignment works in jBPM engine is needed. Although the CMMN specification describes the concept of case roles, it does not exactly specify how these case roles should be connected to actual case workers. Therefore, every software solution might implement this connection differently.

Because jBPM engine has been primarily designed to support business processes, task assignment in CM is very similar to assignment in BPM, with small alterations. In BPM, the assignment of a human task is done through the ActorId and GroupId input variables of each task. ActorId may contain multiple users, while GroupId handles multiple groups of users. Groups of users can be used, for example, to group employees based on the department they work for. Assignment is then based on these users and groups.

When the CM feature is used, this concept is slightly changed. Instead of actual users and groups, those two input variables expect case roles, which are defined as a part of a case definition. For example, the ActorId variable may contain the value supplier or manager, while GroupId may have values like managers, HR or IT. The actual assignment of users and groups to those case roles is then provided when the case is started or after it has been started. This is another example of CM flexibility and ad-hoc behaviour: instead of being specified upfront, the assignments are delayed as long as possible. Moreover, a user can specify a maximum cardinality of each case role; this is not supported with business processes.
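The late binding of case roles to users can be sketched as a simple lookup. The class and method names below are hypothetical and are not part of jBPM, and the comma used to separate several role names in one ActorId value is an assumption made only for this illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of resolving case roles to users at runtime; not jBPM API.
final class CaseRoleResolverSketch {
    // Role name -> users bound to it when the case is started or later.
    private final Map<String, Set<String>> roleAssignments = new HashMap<>();

    void bind(String role, String user) {
        Set<String> users = roleAssignments.get(role);
        if (users == null) {
            users = new LinkedHashSet<>();
            roleAssignments.put(role, users);
        }
        users.add(user);
    }

    /**
     * Resolves an ActorId value that holds case role names into potential owners.
     * Assumption: several role names are separated by commas.
     */
    Set<String> potentialOwners(String actorIdValue) {
        Set<String> owners = new LinkedHashSet<>();
        for (String role : actorIdValue.split(",")) {
            Set<String> users = roleAssignments.get(role.trim());
            if (users != null) {
                owners.addAll(users);
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        CaseRoleResolverSketch resolver = new CaseRoleResolverSketch();
        resolver.bind("manager", "alice");
        resolver.bind("supplier", "bob");
        System.out.println(resolver.potentialOwners("manager, supplier"));
    }
}
```

A role that has not been bound yet simply resolves to no users, which mirrors the fact that the binding may be supplied after the case has started.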

3.4 Task Lifecycle

Every task in jBPM engine has its own lifecycle, which consists of the stages the task can be in. Because this lifecycle is rather complex and this thesis uses only a part of it, not all stages are described in this section. For further information, please see the documentation [20].

When a task is created, it is in the Created stage. After that, when the task has only one potential owner (there is only one user assigned to it), it automatically transfers to the Reserved stage. When there are multiple potential owners, it transfers only to the Ready stage. A task in the Ready stage may be claimed by any user who is among the potential owners; after that, it transfers to the Reserved stage. At this point, only the user who claimed the task can start and complete it. This user is assigned to the actualOwner variable of the claimed task for further use. After the task is started, it proceeds to the InProgress stage, and after its completion it transfers to the Completed stage. This is the basic use case, which is used in this thesis.

In addition, jBPM engine supports many more advanced use cases. For example, a task may be delegated or forwarded to another user, revoked from the actualOwner, suspended and resumed, or even stopped. When a task is marked as skippable, it may even be skipped, so it will not be executed.
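The basic path through the lifecycle described above can be expressed as a small state machine. The following class is a hypothetical illustration and does not use the jBPM task service; the stage names follow the text, while the method names are chosen only for this sketch.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical state machine for the basic task lifecycle; not jBPM API.
final class TaskLifecycleSketch {
    enum Stage { CREATED, READY, RESERVED, IN_PROGRESS, COMPLETED }

    private final Set<String> potentialOwners;
    private Stage stage = Stage.CREATED;
    private String actualOwner;

    TaskLifecycleSketch(Collection<String> potentialOwners) {
        this.potentialOwners = new LinkedHashSet<>(potentialOwners);
    }

    /** A single potential owner reserves the task directly, otherwise it becomes Ready. */
    void activate() {
        require(Stage.CREATED);
        if (potentialOwners.size() == 1) {
            actualOwner = potentialOwners.iterator().next();
            stage = Stage.RESERVED;
        } else {
            stage = Stage.READY;
        }
    }

    void claim(String user) {
        require(Stage.READY);
        if (!potentialOwners.contains(user)) {
            throw new IllegalStateException(user + " is not a potential owner");
        }
        actualOwner = user;
        stage = Stage.RESERVED;
    }

    void start(String user) {
        require(Stage.RESERVED);
        requireOwner(user);
        stage = Stage.IN_PROGRESS;
    }

    void complete(String user) {
        require(Stage.IN_PROGRESS);
        requireOwner(user);
        stage = Stage.COMPLETED;
    }

    Stage stage() { return stage; }

    String actualOwner() { return actualOwner; }

    private void require(Stage expected) {
        if (stage != expected) {
            throw new IllegalStateException("expected " + expected + " but was " + stage);
        }
    }

    private void requireOwner(String user) {
        if (!user.equals(actualOwner)) {
            throw new IllegalStateException(user + " is not the actual owner");
        }
    }

    public static void main(String[] args) {
        TaskLifecycleSketch task = new TaskLifecycleSketch(java.util.Arrays.asList("alice", "bob"));
        task.activate();
        task.claim("bob");
        task.start("bob");
        task.complete("bob");
        System.out.println(task.stage() + " by " + task.actualOwner());
    }
}
```

The advanced operations mentioned above (delegate, forward, revoke, suspend, resume, stop, skip) are deliberately left out of the sketch.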

3.5 Alternatives

There are many competitors to jBPM, both on the BPM and on the CM market. While some companies, e.g. IBM, produce separate BPM and CM software suites, others, e.g. Camunda [21], try to group both capabilities into one suite. This section presents some of them and briefly describes their capabilities and, where relevant, their differences from jBPM.

3.5.1 IBM

IBM currently has two separate suites: IBM Business Process Manager [22], which is used for BPM, and IBM Case Manager [23], which is used for CM. Both applications are commercial software, but they offer a free demo. Business Process Manager has multiple editions, each of them offering a different set of features for a different price, so customers can choose


the most suitable edition for them and do not have to pay for extra features which they do not need. These extra features include support for service-oriented architecture (SOA), a built-in enterprise service bus (ESB), various integration adapters, transaction support etc. From the usability point of view [24], one of the key features is the support for mobile scenarios through a mobile application which is available for iOS. For Android, however, there is currently only a 3rd-party application. Even so, this is an advantage over jBPM, which has no official mobile application. On the other hand, one of the main drawbacks of this software might be a quite complex installation procedure, which might require help from an expert.

The second application, Case Manager, supports CM via CMMN notation. Although it is a different notation from BPMN, the application can be integrated very well with Business Process Manager, since it comes from the same company. It supports various data formats, including social media, documents, video, audio and even GPS data. It allows this data to be stored in many types of repositories via the CMIS (Content Management Interoperability Services) standard and its simple, extensible interface design. The biggest drawback [25] may be the fact that it is a separate application which customers have to buy if they want to use CM, while the competition tries to group both BPM and CM into one multi-purpose application and thus make it more cost-efficient.

3.5.2 Oracle

Oracle offers one complete business suite called Oracle Business Process Management Suite [26], which provides all the tools needed for both BPM and CM. The suite, which comes in only one version, is paid, but Oracle offers an evaluation copy of it. Unlike the IBM suite, it lacks a mobile application out of the box, but it offers a framework called Oracle ADF Mobile, which simplifies the development of user interfaces for mobile devices. Another useful feature is a set of predefined process models, metrics and key performance indicators, which together form a customizable pre-built solution that helps customers to start their business faster. Finally, because it is one of the Oracle Fusion Middleware products, it provides quite good integration with other Oracle Middleware solutions. Its weak points might be a very complex configuration [27] and poor documentation when it comes to solving configuration issues.

3.5.3 Camunda

Camunda is a much smaller company than IBM, Oracle or Red Hat. It currently employs just over 40 full-time employees [28], but its software is used by big companies like AT&T Inc., Deutsche Bahn (National German Rail) and T-Mobile Austria. The company offers both a BPM and a CM solution in one package called Camunda BPM. Of all the presented solutions, Camunda BPM resembles jBPM the most. It is open source just like jBPM, while the IBM and Oracle products are strictly closed source. As in the case of jBPM, it is free by default (Community edition) with an option to purchase enterprise support (Enterprise edition) [29], which includes, for example, future patches and integration with IBM and Oracle application servers. It promotes flexibility, ease of use and understandability; like jBPM, it is more focused on being user friendly and easy to configure.

4 OptaPlanner

4.1 Introduction

As said before, OptaPlanner has been chosen for the task assignment optimization because, like jBPM engine, it is a part of KIE Group, which provides benefits in terms of compatibility and interoperability between the projects in this group. It is an open source project sponsored by Red Hat, Inc. Like jBPM engine, it is written in Java, which makes the integration even easier. OptaPlanner is “a lightweight, embeddable engine which optimizes planning problems” [30]. Some of the use cases which it solves include

∙ Employee rostering: timetabling, task assignment, ...

∙ Agenda scheduling: scheduling meetings, appointments, ...

∙ Vehicle routing: planning vehicles (trucks, trains, boats, airplanes, ...) with freight and/or people

∙ Bin packing: filling containers, trucks, ships and storage warehouses, but also cloud computers nodes, ...

and many more. Most of these planning problems are NP-complete or even harder; the bin packing optimization problem, for example, is NP-hard [31]. In short, this means that no polynomial-time algorithm is known which solves these problems. In practice, the optimal solution cannot be determined in a feasible time, e.g. within hours, so organizations cannot plan their business based on its outcome. With real-world data, the calculation could take months or even years, which is unacceptable for this scenario. According to the documentation [30], OptaPlanner finds a good solution to such planning problems in reasonable time by using advanced optimization algorithms. Internally, it combines optimization heuristics and metaheuristics with very efficient score calculation. The most important part of it are metaheuristics, which are a primary subfield of stochastic optimization


[32]. Stochastic optimization is a class of algorithms and techniques which utilize some degree of randomness to find optimal (or as optimal as possible) solutions to hard problems. Metaheuristics are the most general of these kinds of algorithms and they might be applied to a very wide range of problems. In fact, this is the main advantage of metaheuristics: they do not need a lot of tuning for every use case, just a small number of parameters. Many of these algorithms come with a proof that, given enough time, they converge to a global optimum [33]. In real use cases this is not always needed and a near-optimal solution is considered good enough. It is up to a user or an organization to find a good compromise between the time of the calculation and the precision of the result.

4.2 Constraints and Score Calculation

4.2.1 Constraints

Usually, every planning problem has at least two levels of constraints [30]:

∙ A negative hard constraint. This constraint must not be broken. For example: 1 teacher can not teach 2 different lessons at the same time.

∙ A negative soft constraint. This constraint should not be broken if it can be avoided. For example: Teacher A does not like to teach on Friday afternoon.

Sometimes the problem might have a positive constraint too:

∙ A positive soft constraint, sometimes called a reward. This constraint should be fulfilled if possible. For example: Teacher B likes to teach on Monday morning.

These constraints define the score calculation of a planning problem. Each solution of the planning problem can be graded with a score. Based on this score, OptaPlanner compares the solutions found during solving and decides which one is better. The OptaPlanner documentation lists several categories of solutions based on constraint satisfaction:


∙ A possible solution is any solution, whether or not it breaks any number of constraints. Planning problems tend to have a very large number of possible solutions, but many of those solutions are worthless.

∙ A feasible solution is a solution that does not break any negative hard constraints. The number of feasible solutions tends to be relative to the number of possible solutions. Sometimes there are no feasible solutions. Every feasible solution is a possible solution.

∙ An optimal solution is a solution with the highest score. Planning problems tend to have 1 or only a few optimal solutions. There is always at least 1 optimal solution, even in the case that there are no feasible solutions.

∙ The best solution found is a solution with the highest score found by an implementation in a given amount of time. This solution is likely to be feasible and, given enough time, it’s an optimal solution.

During solving, OptaPlanner searches for the best solution using the configured metaheuristic(s). It is impossible to tell in advance which one is the best for each use case. To help a user decide which metaheuristic should be used, OptaPlanner provides benchmarking, which is further covered in chapter 7.

4.2.2 Score Calculation

The score tells OptaPlanner which of two solutions is better. The score and its calculation are the most important part of the solver configuration: when the score calculation is configured incorrectly, the whole solution is useless.

Score Techniques

There are several score techniques which might be used when the score calculation is being implemented:


∙ Score signum (positive or negative) is used to maximize or minimize a constraint type. When a negative constraint is broken, a number representing this constraint should be subtracted from the score. When a positive constraint is fulfilled, a number should be added to the score.

∙ Score weight is used when breaking one constraint is as bad as breaking another constraint X times. The two constraints then have different weights, but they are still on the same score level. Score weighting is only suitable in use cases where everything can be quantified.

∙ Score level (hard, medium, soft, ...) is used to prioritize a group of constraint types. A typical use case for score levels is a situation where one constraint outranks another constraint, no matter how many times the other is broken. For example, the rule that one teacher cannot teach two lectures at the same time (hard level constraint type) outranks the teacher preference score (soft level constraint type). The levels of two different scores are compared lexicographically, i.e. a score that breaks 0 hard and 1000 soft constraints is better than a score which breaks 1 hard and 0 soft constraints.

∙ Pareto scoring. This scoring is rarely used and, therefore, it is not explained in this thesis. Please see the documentation for more information.
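The first three techniques can be combined in a small two-level score. The class below is a hypothetical sketch written for this explanation; it is not the score class shipped with OptaPlanner, and the weights in the example are arbitrary. The hard level is compared first, and the soft level decides only when the hard levels are equal.

```java
// Hypothetical two-level score illustrating signum, weight and level; not OptaPlanner API.
final class HardSoftScoreSketch implements Comparable<HardSoftScoreSketch> {
    static final HardSoftScoreSketch ZERO = new HardSoftScoreSketch(0, 0);

    final int hard;
    final int soft;

    HardSoftScoreSketch(int hard, int soft) {
        this.hard = hard;
        this.soft = soft;
    }

    /** Negative hard constraint broken: subtract its weight on the hard level. */
    HardSoftScoreSketch penalizeHard(int weight) {
        return new HardSoftScoreSketch(hard - weight, soft);
    }

    /** Negative soft constraint broken: subtract its weight on the soft level. */
    HardSoftScoreSketch penalizeSoft(int weight) {
        return new HardSoftScoreSketch(hard, soft - weight);
    }

    /** Positive soft constraint fulfilled: add its weight on the soft level. */
    HardSoftScoreSketch rewardSoft(int weight) {
        return new HardSoftScoreSketch(hard, soft + weight);
    }

    /** In this sketch a solution is feasible when no hard constraint is broken. */
    boolean isFeasible() {
        return hard >= 0;
    }

    /** Lexicographic comparison: hard level first, soft level as the tie-breaker. */
    @Override
    public int compareTo(HardSoftScoreSketch other) {
        if (hard != other.hard) {
            return Integer.compare(hard, other.hard);
        }
        return Integer.compare(soft, other.soft);
    }

    @Override
    public String toString() {
        return hard + "hard/" + soft + "soft";
    }

    public static void main(String[] args) {
        HardSoftScoreSketch manySoft = ZERO.penalizeSoft(1000);
        HardSoftScoreSketch oneHard = ZERO.penalizeHard(1);
        System.out.println(manySoft + " vs " + oneHard + ": " + manySoft.compareTo(oneHard));
    }
}
```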

Score Calculation Types

The calculation itself may be implemented in three different ways:

∙ Easy Java score calculation is the easiest type of calculation. It consists of implementing a single Java method which is run every time the calculation for a solution is needed. Although it is easy to implement, it is slower and less scalable than the other two types because it does not support incremental calculation. This means that whenever the solution is changed, the entire score is recalculated from scratch.


∙ Incremental Java score calculation consists of implementing multiple Java methods. It supports incremental score calculation and, according to the documentation, it is currently the fastest type when implemented correctly. The main disadvantage is that it is very difficult to write, because all optimizations have to be done by the user, and since the code gets very complex, it is difficult to maintain if the constraints change over time.

∙ Drools score calculation consists of implementing one or more score calculation rules which are then run by the Drools rule engine. This engine is a part of KIE Group too and it is seamlessly integrated with OptaPlanner. This calculation type is recommended by the documentation. The main advantage is the support of incremental calculation without any extra code, because this is the responsibility of the Drools engine, which does it automatically. Because the rules are separated from the code, they can be easily added, modified or even removed. The last notable advantage is performance optimization, since the Drools engine tends to become faster with every new version. There is basically only one disadvantage: the rules are written in Drools Rule Language (DRL), which the user has to know. However, it is not necessary to know DRL thoroughly to write score rules for most use cases.
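The difference between the easy and the incremental approach can be shown on the N Queens problem with one queen per column. The sketch below is plain Java written for this explanation and does not implement the OptaPlanner score calculator interfaces; the class and method names are hypothetical. The score is the negated number of attacking queen pairs. The easy variant recounts all pairs on every call, while the incremental variant keeps one counter per row and per diagonal and updates only the counters touched by a move.

```java
// Hypothetical comparison of easy and incremental scoring for N Queens; not OptaPlanner API.
final class NQueensScoreSketch {

    /** Easy variant: recounts all attacking pairs on every call, O(n^2). */
    static int easyScore(int[] rows) {
        int conflicts = 0;
        for (int a = 0; a < rows.length; a++) {
            for (int b = a + 1; b < rows.length; b++) {
                boolean sameRow = rows[a] == rows[b];
                boolean sameDiagonal = Math.abs(rows[a] - rows[b]) == b - a;
                if (sameRow || sameDiagonal) {
                    conflicts++;
                }
            }
        }
        return -conflicts;
    }

    /** Incremental variant: updates per-line counters on each move, O(1) per move. */
    static final class Incremental {
        private final int n;
        private final int[] rows;           // rows[column] = row of the queen in that column
        private final int[] rowCount;       // queens per row
        private final int[] sumDiagonal;    // queens per diagonal with constant column + row
        private final int[] diffDiagonal;   // queens per diagonal with constant column - row
        private int conflicts;

        Incremental(int[] initialRows) {
            n = initialRows.length;
            rows = initialRows.clone();
            rowCount = new int[n];
            sumDiagonal = new int[2 * n - 1];
            diffDiagonal = new int[2 * n - 1];
            for (int column = 0; column < n; column++) {
                insert(column);
            }
        }

        void moveQueen(int column, int newRow) {
            retract(column);
            rows[column] = newRow;
            insert(column);
        }

        int score() {
            return -conflicts;
        }

        private void insert(int column) {
            int row = rows[column];
            int sum = column + row;
            int diff = column - row + n - 1;
            // Every queen already on the same line forms one new attacking pair.
            conflicts += rowCount[row] + sumDiagonal[sum] + diffDiagonal[diff];
            rowCount[row]++;
            sumDiagonal[sum]++;
            diffDiagonal[diff]++;
        }

        private void retract(int column) {
            int row = rows[column];
            int sum = column + row;
            int diff = column - row + n - 1;
            rowCount[row]--;
            sumDiagonal[sum]--;
            diffDiagonal[diff]--;
            // The pairs formed with the queens that remain on the same lines disappear.
            conflicts -= rowCount[row] + sumDiagonal[sum] + diffDiagonal[diff];
        }
    }

    public static void main(String[] args) {
        int[] rows = {0, 1, 2, 3};
        Incremental incremental = new Incremental(rows);
        System.out.println(easyScore(rows) + " " + incremental.score());
    }
}
```

Even in this small case the incremental variant needs noticeably more bookkeeping, which is the maintenance cost mentioned above.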

4.3 Solver Lifecycle

When the solver is started, it runs a predefined sequence of actions, which are described in this section. An overview of these actions is presented in figure 4.1.

4.3.1 Solver Phase

OptaPlanner’s solving consists of one or more phases. The most usual configuration is one phase for a construction heuristic, which initializes a solution, and one phase for local search, which represents the configured implementation of a metaheuristic (see figure 4.2). Local search starts where the construction heuristic has stopped and tries to further improve the score of the solution. The construction heuristic phase terminates when the solution is initialized, although this might produce an infeasible solution. The local search phase may terminate when various criteria are met. Usually it terminates after a defined amount of time has passed or after a certain number of iterations has been made. Finally, the solver might be terminated asynchronously from another thread, e.g. when a user decides that the current best solution is good enough. Further types of termination are described in the documentation.

Figure 4.1: Scope Overview Source: OptaPlanner Documentation [30]

4.3.2 Steps and Moves

Each solver phase consists of multiple steps, and each step consists of multiple moves. A move represents a change or a set of changes between two solutions. An example of such a move is presented in figure 4.3, which displays the N Queens problem¹. In this figure, a move changes the queen C from row 0 to row 2. OptaPlanner generates a lot of moves within one step, and the score is calculated for each of them. Then it picks the best accepted move (the one with the highest score), where acceptance is determined by the configured local search, thus ending the current step. In case of two or more moves with the same score, the winning move is picked randomly. After the step has ended, the next step is started, as shown in figure 4.4. Therefore, a step may be defined as the winning move.

Figure 4.2: An Example of OptaPlanner Phase Sequence Source: OptaPlanner Documentation [30]

1. Place n queens on a chessboard of size n×n so that no 2 queens can attack each other.


Figure 4.3: Move Source: OptaPlanner Documentation [30]

4.4 Construction Heuristic

The purpose of the construction heuristic is to initialize the solution in a finite amount of time with the highest score possible. After the construction heuristic finishes the initialization, local search is started as the second phase of a solver run. OptaPlanner supports many construction heuristics, but only the two most commonly used ones are briefly described here. These two heuristics are also compared in the benchmark chapter 7.

The first one is the First Fit heuristic. When run, it iterates over all planning entities, e.g. tasks that need to be assigned, in the default order and initializes one planning entity at a time with the best available planning value, e.g. a user to whom this task should be assigned. The best available planning value is determined by a score calculation, which also takes the already initialized planning entities into account. After a planning entity has been assigned, it is not changed.

The second heuristic is First Fit Decreasing. It is very similar to First Fit, with one difference: it initializes the most difficult entities first and continues with less and less difficult entities. The idea is that more difficult entities are harder to assign at later stages, so they need to be assigned as soon as possible. In some cases this heuristic provides better results, but not always. In the case of the N Queens problem mentioned above, the most difficult queens to place are in the middle of the chessboard because they may threaten more queens. The further a queen is from the centre of the chessboard, the less difficult it is. The drawback is that the function which compares the difficulty of entities might sometimes be hard to implement.
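Both heuristics can be sketched on a simplified task assignment problem. The code below is a hypothetical illustration and not the OptaPlanner implementation. It rests on two assumptions made only for this example: the score of a partial solution is the negated sum of squared user workloads, and the difficulty of a task is its duration.

```java
import java.util.Arrays;

// Hypothetical First Fit and First Fit Decreasing for task assignment; not OptaPlanner API.
final class FirstFitSketch {

    /** Assigns tasks in the given order, each to the user that yields the best score. */
    static int[] firstFit(int[] taskDurations, int userCount) {
        int[] loads = new int[userCount];
        for (int duration : taskDurations) {
            int bestUser = 0;
            long bestScore = Long.MIN_VALUE;
            for (int user = 0; user < userCount; user++) {
                loads[user] += duration;      // try this planning value
                long score = score(loads);
                loads[user] -= duration;      // undo the trial
                if (score > bestScore) {
                    bestScore = score;
                    bestUser = user;
                }
            }
            loads[bestUser] += duration;      // once assigned, never changed again
        }
        return loads;
    }

    /** Same as First Fit, but the longest (assumed most difficult) tasks come first. */
    static int[] firstFitDecreasing(int[] taskDurations, int userCount) {
        int[] sorted = taskDurations.clone();
        Arrays.sort(sorted);
        for (int i = 0, j = sorted.length - 1; i < j; i++, j--) {
            int swap = sorted[i];
            sorted[i] = sorted[j];
            sorted[j] = swap;
        }
        return firstFit(sorted, userCount);
    }

    /** Assumed score: balanced workloads are better (negated sum of squared loads). */
    static long score(int[] loads) {
        long sumOfSquares = 0;
        for (int load : loads) {
            sumOfSquares += (long) load * load;
        }
        return -sumOfSquares;
    }

    static int maxLoad(int[] loads) {
        int max = 0;
        for (int load : loads) {
            max = Math.max(max, load);
        }
        return max;
    }

    public static void main(String[] args) {
        int[] durations = {1, 1, 2};
        System.out.println(Arrays.toString(firstFit(durations, 2)));
        System.out.println(Arrays.toString(firstFitDecreasing(durations, 2)));
    }
}
```

With the durations 1, 1 and 2 and two users, First Fit ends with an unbalanced workload because the long task arrives last, while First Fit Decreasing places it first and balances the rest around it.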


Figure 4.4: Steps Source: OptaPlanner Documentation [30]

4.5 Local Search

Local search is the metaheuristic phase which tries to move the solution towards the optimum after the construction heuristic has finished its phase. As with the construction heuristics, OptaPlanner defines many types of local search. However, only four of them are briefly described, since they are not the primary focus of the thesis. These local searches are compared in the benchmark chapter 7 as well.


4.5.1 Hill Climbing (Simple Local Search)

Hill Climbing, sometimes called Simple Local Search, is the most basic local search provided by OptaPlanner. It tries all generated moves and then takes the best move, since there is a high probability that it leads to a solution with the highest score. From that new solution, it again tries the generated moves, takes the best one, and so on. If there are multiple moves with the same score, one is picked randomly. The main disadvantage of this approach is that Hill Climbing might get stuck in a local optimum. This usually happens when all possible moves degrade the score. If one of them is picked, the next move might go back to the previous solution, so the computation does not move towards a better solution. An example of Hill Climbing is presented in figure 4.5. The other types of local search address this problem with additional techniques.
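The behaviour of Hill Climbing, including the local optimum problem, can be demonstrated on a one-dimensional toy problem. In the hypothetical sketch below, a solution is an index into an array of scores and a move shifts the index by one position. Two simplifications are assumptions of this example and not OptaPlanner behaviour: the search stops as soon as no move improves the score, and ties are resolved deterministically instead of randomly.

```java
// Hypothetical Hill Climbing on a one-dimensional score landscape; not OptaPlanner API.
final class HillClimbingSketch {

    /** Returns the index at which the search stops when started from the given index. */
    static int climb(int[] landscape, int start) {
        int current = start;
        while (true) {
            int best = current;
            // Generate both moves (one step left, one step right) and keep the best one.
            for (int candidate : new int[] {current - 1, current + 1}) {
                boolean valid = candidate >= 0 && candidate < landscape.length;
                if (valid && landscape[candidate] > landscape[best]) {
                    best = candidate;
                }
            }
            if (best == current) {
                return current; // no improving move: a local (maybe global) optimum
            }
            current = best;     // the winning move ends the step
        }
    }

    public static void main(String[] args) {
        int[] landscape = {1, 3, 5, 4, 2, 6, 9, 7};
        System.out.println("stops at index " + climb(landscape, 0));
    }
}
```

Started from the left edge of this landscape, the search stops on the first peak although a higher peak exists further to the right.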

Figure 4.5: Hill Climbing Source: OptaPlanner Documentation [30]


4.5.2 Tabu Search

Tabu Search is similar to Hill Climbing, but it manages a tabu list to avoid getting stuck in local optima. The tabu list contains recently used objects that are tabu to use in the moves. This means that moves which involve these objects are not accepted. The tabu objects may be anything connected with the move, such as planning entities, planning values, moves or even a solution. Tabu lists of different objects might even be combined. An example of Tabu Search is shown in the figure 4.6.
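A minimal sketch of the idea, again in plain Java with hypothetical names and not taken from OptaPlanner, uses a solution tabu on a one-dimensional landscape: recently visited indices are tabu, the best non-tabu move is taken even if it degrades the score, and the best solution seen so far is remembered separately.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy tabu search on an array of scores (hypothetical names). */
class TabuSearchSketch {

    /** Returns the index of the best solution found within maxSteps steps. */
    static int search(int[] score, int start, int tabuSize, int maxSteps) {
        Deque<Integer> tabu = new ArrayDeque<>();
        int current = start;
        int best = start;
        tabu.addLast(current);
        for (int step = 0; step < maxSteps; step++) {
            int next = -1;
            for (int neighbour : new int[] {current - 1, current + 1}) {
                boolean allowed = neighbour >= 0 && neighbour < score.length
                        && !tabu.contains(neighbour);
                if (allowed && (next == -1 || score[neighbour] > score[next])) {
                    next = neighbour;
                }
            }
            if (next == -1) {
                break; // every move is tabu or out of range
            }
            current = next; // taken even if it degrades the score
            tabu.addLast(current);
            if (tabu.size() > tabuSize) {
                tabu.removeFirst(); // the oldest entry stops being tabu
            }
            if (score[current] > score[best]) {
                best = current;
            }
        }
        return best;
    }
}
```

On the landscape {1, 3, 2, 5, 9} with a tabu list of size 2, a search started at index 0 passes through the local optimum at index 1, accepts the degrading move to index 2 because going back is tabu, and finally reaches index 4.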

Figure 4.6: Tabu Search Source: OptaPlanner Documentation [30]


4.5.3 Simulated Annealing

Simulated Annealing evaluates only a small number of moves per step. In the basic implementation, the first accepted move is the winning step. A move is accepted if it does not decrease the score or, in case it decreases the score, if it passes a random check. The chance that the move passes the random check decreases with the size of the score decrement and with the time the phase has been running, which is simulated as a decreasing temperature; hence the name Simulated Annealing. In the beginning, it may pick non-improving moves too, but as time passes (the temperature decreases), the chance of picking a non-improving move decreases as well. Therefore, it acts as Hill Climbing in the end, accepting only improving moves. See the figure 4.7 for an illustration of the computation.
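The random check can be sketched with the textbook acceptance rule, where a move that changes the score by a negative delta is accepted with probability exp(delta / temperature). This is an assumption for illustration; the exact formula and the cooling schedule used by OptaPlanner may differ, and all names below are hypothetical.

```java
import java.util.Random;

/** Textbook simulated annealing acceptance rule (hypothetical names). */
class SimulatedAnnealingSketch {

    /** Probability of accepting a move that changes the score by scoreDelta. */
    static double acceptanceProbability(int scoreDelta, double temperature) {
        if (scoreDelta >= 0) {
            return 1.0; // non-degrading moves are always accepted
        }
        if (temperature <= 0.0) {
            return 0.0; // fully cooled down: behaves like Hill Climbing
        }
        return Math.exp(scoreDelta / temperature);
    }

    /** Linear cooling: timeGradient runs from 0.0 (phase start) to 1.0 (phase end). */
    static double temperature(double startingTemperature, double timeGradient) {
        return startingTemperature * (1.0 - timeGradient);
    }

    /** The random check itself. */
    static boolean isAccepted(int scoreDelta, double temperature, Random random) {
        return random.nextDouble() < acceptanceProbability(scoreDelta, temperature);
    }
}
```

The probability shrinks both when the score decrement grows and when the temperature drops, which corresponds to the two factors named in the text.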

4.5.4 Late Acceptance

Late Acceptance also evaluates only a small number of moves per step. A move is accepted if it does not deteriorate the score or if it leads to a score that is at least as good as the late score, which is defined as the winning score from a fixed number of steps ago. This metaheuristic is presented in the figure 4.8.
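The acceptance rule can be sketched in plain Java as follows. The class and method names are hypothetical and the code only illustrates the rule described above, not the OptaPlanner implementation: the winning scores of the last few steps are kept in a circular buffer, so the entry about to be overwritten is the late score.

```java
import java.util.Arrays;

/** Toy late acceptance criterion (hypothetical names). */
class LateAcceptanceSketch {

    private final int[] lateScores; // winning scores of the last N steps
    private int currentScore;
    private int step = 0;

    LateAcceptanceSketch(int lateAcceptanceSize, int initialScore) {
        lateScores = new int[lateAcceptanceSize];
        Arrays.fill(lateScores, initialScore);
        currentScore = initialScore;
    }

    /** A move is accepted if it is not worse than the current score or the late score. */
    boolean isAccepted(int moveScore) {
        int lateScore = lateScores[step % lateScores.length];
        return moveScore >= currentScore || moveScore >= lateScore;
    }

    /** Records the winning score of the step that has just been taken. */
    void stepTaken(int winningScore) {
        lateScores[step % lateScores.length] = winningScore;
        currentScore = winningScore;
        step++;
    }
}
```

With a buffer of size 2 and an initial score of -10, a step to -5 followed by a candidate move scoring -8 is still accepted: the move is worse than the current score, but it is better than the late score of -10 from two steps ago.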

4.6 Alternatives

The field of optimization software is very wide nowadays and almost every product takes a completely different approach to optimization. While many of them are proprietary, e.g. LocalSolver [34], [35] and IBM ILOG CPLEX Optimization Studio [36], there exist open source applications too, e.g. Google Optimization Tools [37], Choco [38] or the COIN-OR project [39]. This section presents three of them (LocalSolver, IBM ILOG CPLEX Optimization Studio, Google Optimization Tools) as alternatives to OptaPlanner, mostly listing their differences and the techniques they use for optimization.


Figure 4.7: Simulated Annealing Source: OptaPlanner Documentation [30]

4.6.1 LocalSolver

LocalSolver is a proprietary optimization software which is described as a hybrid mathematical programming solver. It combines different optimization techniques including local search (which OptaPlanner uses too), constraint propagation and inference techniques as well as linear and non-linear programming techniques. This is the only optimization software described in the thesis which resembles OptaPlanner in that it uses local search; the others do not use local search metaheuristics at all. Unlike OptaPlanner, it comes with its own modelling language called LSP (LocalSolver Programming language) which is used for quick prototyping of optimization problems. Although LocalSolver itself is implemented in C++, it provides APIs in several languages (Python, C++, Java, C#) as well, so knowledge of LSP is not needed and its integration into an existing application should be easy. Although it is proprietary software, it provides free trial and academic licenses too.

Figure 4.8: Late Acceptance Source: OptaPlanner Documentation [30]

4.6.2 IBM ILOG CPLEX Optimization Studio

This proprietary software suite, made by IBM, has two parts which differ in the optimization techniques they use. The first part, called IBM ILOG CPLEX Optimizer, uses linear programming, mixed integer programming and non-linear programming techniques. The second part, IBM ILOG CPLEX CP Optimizer, uses techniques from constraint programming and is more focused on combinatorial optimization problems which cannot be solved by traditional mathematical programming methods. This suite does not implement any kind of metaheuristic like local search. It is implemented in the C language, but offers APIs for C++, C#, Java, Python and many more. Moreover, connectors for Microsoft Excel and MATLAB are provided too. Like LocalSolver, it comes with its own modelling language, called OPL (Optimization Programming Language), which may be used within its own integrated development environment. It also provides free trial and academic licenses.

4.6.3 Google Optimization Tools

The last presented optimization software suite, Google Optimization Tools, is a free and open source tool from Google. It is actively used by Google for mission-critical applications, so it is actively developed and updated. Google describes it as a combinatorial optimization solver which focuses on constraint programming techniques, knapsack algorithms, graph algorithms and linear programming. Being open source, it provides quite good documentation and tutorials for free [37], which describe how to use this suite for vehicle routing problems, bin packing, travelling salesman problems and scheduling problems. It is implemented in C++ but provides interfaces in Python, Java and C# as well. Although it does not use metaheuristics, it resembles OptaPlanner the most, since it is open source, its documentation is quite good and there is a community willing to help with issues that might be encountered.

5 Resource Management Module

This chapter focuses on the actual integration of Case Management (CM) in jBPM engine with OptaPlanner. First, an initial idea of the task assignment is presented. This includes how users are inserted into jBPM engine, how tasks are assigned to users and which criteria this task assignment is based on. After that, the actual implementation is described, i.e. which mechanisms of jBPM engine have been used, which attributes the tasks have and how OptaPlanner has been configured to correctly assign tasks to different users based on the mentioned criteria.

5.1 Analysis

The description of this thesis does not define task assignment and its criteria exactly, so they have to be chosen before the actual implementation starts; this section presents such information.

5.1.1 Users

Users represent entities which tasks are assigned to. These users will be loaded from a database whenever jBPM engine requests this information. As presented in the section 3.3, a user may belong to one or more groups, which simplifies the assignment of multiple users to one task. In addition, every user will have skill levels so that the duration of a task can be estimated. Naturally, the more skilled the user is, the less time the task takes. For simplicity, and to ensure that any user or group might be assigned to any case role when starting a case, all users will have all skills, but with different skill levels. This is quite common in the real world too, since every skill might be evaluated for each person. This information will be saved in a database too.

5.1.2 Tasks

Tasks will be assigned to users by defined case role assignments when cases with tasks are started. So if the task A is for the case role manager, only users who are in that case role for the particular case are eligible to claim, start and complete the task A. This is basically the only hard constraint which should hold. Other than that, tasks will have other parameters like priority, needed skill and base duration. In case of priority, tasks with higher priorities should be placed before tasks with lower priorities. Needed skill is simply the skill which is needed while a user is working on the task, and base duration will serve as an unadjusted duration of the task before a user’s skill level is applied. Finally, tasks will have their start and end times. These will be used to compute how much work a particular user has, and OptaPlanner will use this information to distribute tasks evenly while ensuring no hard constraints are broken.

5.2 Configuration of jBPM Engine and its Domain

This section explains what configuration of jBPM engine was needed and what domain classes have been implemented to allow jBPM engine to automatically determine which user or group is a potential owner of a particular task.

5.2.1 Persistence and Persistence entities

Since jBPM engine uses JTA transactions (Java Transaction API, which enables advanced transaction handling including distributed transactions) and it is not run inside a Java EE container, a JTA transaction manager had to be configured. Bitronix Transaction Manager [40] has been chosen since it is commonly used in the community and it is even used in the engine’s documentation. To easily map entity objects to a database and vice versa, the Hibernate framework [41] has been chosen as a JPA provider (Java Persistence API, an object-relational mapping API). Again, the reason is that it is commonly used in the community together with jBPM engine. All persistence entities are shown in the figure 5.1. Names of the entities are suffixed with Entity to easily distinguish them from some OptaPlanner domain objects, e.g. UserEntity vs User. During implementation, the H2 in-memory database has been used; however, the database may easily be replaced by another one since the Hibernate provider is database agnostic.


Figure 5.1: Persistence Entities Source: Author

Entity UserEntity

UserEntity class is the entity which represents a user. Its main attributes are name, userGroups and userSkills. Attribute name is simply the name of a user and it has a unique constraint since two users with the same name should be forbidden. Collection attributes userSkills of type UserSkillEntity and userGroups of type UserGroupEntity represent the user’s skills with skill levels and the groups which the user belongs to.

Entities SkillEntity and UserSkillEntity

SkillEntity is a class that represents a skill which is then assigned to a user via association entity UserSkillEntity. SkillEntity has a simple name attribute and collection attribute userSkills of type UserSkillEntity which is a reference to the mentioned association entity. UserSkillEntity is an association class which “connects” a particular user (UserEntity) with a particular skill (SkillEntity). A skill level of this user for that skill is described by an additional skillLevel attribute. The “connection” itself is provided by a compound primary key UserSkillId which consists of a UserEntity and a SkillEntity. Therefore, there is a many-to-many association between UserEntity and SkillEntity.


Entities GroupEntity and UserGroupEntity

Groups are handled similarly to skills. GroupEntity is an entity which represents a group of users. Its main attributes are name, representing the name of the group, and userGroups, which is a collection of association entities of type UserGroupEntity. UserGroupEntity is an association class which associates a UserEntity with a GroupEntity. Unlike the skill association, it has no attribute other than a compound primary key of type UserGroupId, but this association class is provided in case it needs to be extended in the future. The compound primary key UserGroupId is similar to UserSkillId, thus it consists of a UserEntity and a GroupEntity. Again, it is a many-to-many association.

5.2.2 Inserting and Retrieving Users to/from DB

To easily insert users to a database and manage their groups and skills, a UserJSONParser has been implemented. An example of a JSON file that this parser supports is shown in the figure 5.2.

Figure 5.2: Users JSON Source: Author


For easy retrieval of users from the database, a UserServiceUtil class has been created. It provides methods for retrieval of all users from the database, which are then used as planning values for OptaPlanner. Methods for loading group names and skill names are also there, but they are used only in the prototype application, see chapter 6.

jBPM engine may automatically retrieve users and groups from the database when configured to do so. To enable it, a DBUserGroupCallback must be registered and SQL queries must be provided which are then called when the engine checks if a user or a group exists and which users belong to a particular group. This database callback is selected by providing the parameter org.jbpm.ht.callback=db on the command line when the engine is being started and by passing it as an argument when jBPM engine’s TaskService instance is being created.

5.2.3 Listener for Pushing Tasks from jBPM Engine

Tasks which are created in jBPM engine may be either pulled from a database or pushed by the engine via a listener. For this thesis the push model has been chosen since it saves resources and possibly provides better performance. Whenever a new task is created, it is simply pushed to a working memory, i.e. a collection, for further use, so there is no need for a consumer to periodically retrieve new tasks. In fact, two listeners have been created, one for the prototype application and one for the tests. The only difference is that the application listener also updates a GUI component responsible for solving.

When a task is created and added inside jBPM engine, the listener is triggered and the task is mapped from the jBPM Task type to the TaskPlanningEntity type which is used by OptaPlanner, see section 5.3. After the task is claimed or started, only the task’s status is changed. When the task is completed, it is just removed from the collection. For the prototype application, there is one more method implemented for the case when the task is exited, which means the case this task belongs to has been cancelled. This happens when a user loads another scenario in the application.

The listener is registered when the case definitions are deployed inside the engine. Registration itself is done by MVEL (MVFLEX Expression Language) in a deployment descriptor. However, this brings one issue. The listener is created using its constructor, but the parameters may only be constants and static members since it is not possible to reference a Java object from MVEL inside this deployment descriptor. Therefore, the collection to which this listener pushes tasks has to be a static member, which is not a very clean solution but it is the only one that works. This has been discussed with jBPM developers too and they also recommended this approach. In case of the application listener, there is a wrapper class named PushTaskEventListenerFactory through which this listener may be obtained in running code. This is the approach recommended by the developers as well.
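The push model with a static collection can be sketched as follows. This is a simplified, self-contained illustration and not the listener from the thesis: the class name, the method signatures and the use of a task id with a status string are assumptions, and a real jBPM listener would implement the engine's task lifecycle listener interface instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Simplified push listener keeping tasks in a static collection (hypothetical names). */
class PushTaskListenerSketch {

    /**
     * Static, because the listener is instantiated from the deployment
     * descriptor, where only constants and static members can be referenced.
     * Maps a task id to the task status.
     */
    static final Map<Long, String> TASKS = new ConcurrentHashMap<>();

    /** A new task is pushed into the working memory. */
    void afterTaskAdded(long taskId) {
        TASKS.put(taskId, "Ready");
    }

    /** Claiming or starting a task only changes its status. */
    void afterTaskClaimed(long taskId) {
        TASKS.replace(taskId, "Reserved");
    }

    void afterTaskStarted(long taskId) {
        TASKS.replace(taskId, "InProgress");
    }

    /** A completed task is removed from the collection. */
    void afterTaskCompleted(long taskId) {
        TASKS.remove(taskId);
    }
}
```

A concurrent map is used here because the engine and the consumer of the collection may run on different threads; whether the original implementation does the same is not stated in the text.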

5.3 Configuration of OptaPlanner and its Domain

This part describes OptaPlanner domain entities and classes which are used for solving the task assignment problem. Moreover, configuration of OptaPlanner and its score calculation rules are presented too.

5.3.1 Planning Entities, Planning Values and their Relationships

The main hierarchy of classes used in the planner consists of TaskOrUser, TaskPlanningEntity and User classes. Figure 5.3 shows the relationships between them. It is an example of the Chained Through Time pattern which may be found in the documentation of OptaPlanner and which is usually used for task assignment or vehicle routing problems. OptaPlanner is configured for a chained planning variable, which means:

∙ User is the anchor/head of each chain
∙ TaskPlanningEntity objects are the other elements in the chain
∙ Chain is open at the end, so it cannot contain a cycle
∙ Each TaskPlanningEntity object points to a previous element; this pointer variable is a planning variable (@PlanningVariable)
∙ Each element in the chain points to the next element, in this case the next TaskPlanningEntity or null if the element is last (@InverseRelationShadowVariable)
∙ Each TaskPlanningEntity has an anchor variable so it is not needed to traverse the whole chain to get an anchor (@AnchorShadowVariable)
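The structure described by the list above can be sketched without OptaPlanner as a simple linked chain. The class names below are hypothetical stand-ins for TaskOrUser, User and TaskPlanningEntity, and the append method sets by hand what OptaPlanner would maintain through the annotations named in the comments.

```java
/** Stand-in for TaskOrUser: every chain element knows its next task. */
abstract class ChainElementSketch {
    ChainTaskSketch nextTask; // @InverseRelationShadowVariable in the real model
}

/** Stand-in for User: the anchor (head) of a chain. */
class ChainUserSketch extends ChainElementSketch {
    final String name;

    ChainUserSketch(String name) {
        this.name = name;
    }
}

/** Stand-in for TaskPlanningEntity. */
class ChainTaskSketch extends ChainElementSketch {
    final String name;
    ChainElementSketch previous; // @PlanningVariable in the real model
    ChainUserSketch user;        // @AnchorShadowVariable in the real model

    ChainTaskSketch(String name) {
        this.name = name;
    }
}

class ChainSketch {

    /** Appends a task behind the current last element of a chain. */
    static void append(ChainElementSketch tail, ChainTaskSketch task) {
        task.previous = tail;
        tail.nextTask = task;
        // The anchor is copied along the chain, so it never has to be searched for.
        task.user = (tail instanceof ChainUserSketch)
                ? (ChainUserSketch) tail
                : ((ChainTaskSketch) tail).user;
    }
}
```

After appending two tasks to a user, the second task reaches its user directly through the anchor field and its nextTask is null, which marks the open end of the chain.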

Figure 5.3: OptaPlanner Relationships Source: Author

The figure 5.4 shows which chains are possible and which are not.

Figure 5.4: Examples of Chains Source: OptaPlanner Documentation [30]

TaskOrUser class contains only the nextTask variable, which is an @InverseRelationShadowVariable. It is created only to take advantage of Java polymorphism so that a TaskPlanningEntity may point to either a User or another TaskPlanningEntity.

User is a domain class for a user which is used by OptaPlanner. When a user is obtained from a database as a UserEntity, it is transformed to a User object. So it only has attributes like name, groups and skills.

TaskPlanningEntity is the most important class of this hierarchy since it is a planning entity. It contains three OptaPlanner variables: previousTaskOrUser, user and startTime. previousTaskOrUser is a @PlanningVariable. Its values may be User and TaskPlanningEntity objects. OptaPlanner automatically assigns these values so that no constraint presented for the chained variable is broken. The user variable is set automatically and it acts as an @AnchorShadowVariable. The startTime variable is a @CustomShadowVariable and it represents the starting time of the task. It is recomputed every time a @PlanningVariable changes. The listener which recomputes this variable is called StartTimeUpdatingVariableListener. It simply takes the end time of the previous task as a sum of its start time and real duration and assigns this sum to the startTime variable of the next task.

Real duration is determined by the skill level of the user who is assigned to this task. There are four skill levels, NONE, BEGINNER, ADVANCED and EXPERT, which are represented by the enum class SkillLevel. The NONE skill level is only used in a situation where no user is assigned to the task. Moreover, each skill level has a duration multiplier which determines the real duration of the task by multiplying the baseDuration variable. Duration multipliers are defined as follows: NONE - 4, BEGINNER - 3, ADVANCED - 2 and EXPERT - 1. This means that the baseDuration variable of the TaskPlanningEntity represents the most optimistic duration.
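The duration multipliers and the start time computation can be written down as a short sketch. The multiplier values are taken from the text; the names SkillLevelSketch and StartTimeSketch and the representation of a chain as two parallel arrays are assumptions made for illustration.

```java
/** Skill levels with the duration multipliers listed in the text. */
enum SkillLevelSketch {
    NONE(4), BEGINNER(3), ADVANCED(2), EXPERT(1);

    private final int durationMultiplier;

    SkillLevelSketch(int durationMultiplier) {
        this.durationMultiplier = durationMultiplier;
    }

    /** Real duration in minutes for a task with the given base duration. */
    int realDuration(int baseDuration) {
        return baseDuration * durationMultiplier;
    }
}

class StartTimeSketch {

    /**
     * Computes start times (minutes from the start of the chain) for the
     * tasks of one user. levels[i] is the user's skill level for the skill
     * needed by task i.
     */
    static int[] startTimes(int[] baseDurations, SkillLevelSketch[] levels) {
        int[] startTime = new int[baseDurations.length];
        int previousEnd = 0;
        for (int i = 0; i < baseDurations.length; i++) {
            startTime[i] = previousEnd;
            // End time of this task = start time + real duration.
            previousEnd = startTime[i] + levels[i].realDuration(baseDurations[i]);
        }
        return startTime;
    }
}
```

For base durations of 30, 60 and 45 minutes performed at EXPERT, BEGINNER and ADVANCED level, the real durations are 30, 180 and 90 minutes, so the tasks start at minutes 0, 30 and 210.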


TaskPlanningEntity has other attributes too. These attributes are listed in the table 5.1.

Variable name          Description
actualOwner            populated with the user who claimed the task, otherwise null
potentialUserOwners    collection of users who may claim the task
potentialGroupOwners   collection of groups whose users may claim the task
baseDuration           base duration of the task in minutes
status                 status of the task, possible values are Ready, Reserved, InProgress and Completed
priority               priority of the task, possible values are LOW, MEDIUM, HIGH
skill                  skill which is needed for this task
inputVariables         collection of all input variables of the task, might be useful in case more task parameters are used

Table 5.1: TaskPlanningEntity attributes

5.3.2 TaskAssigningSolution

TaskAssigningSolution class represents a solution which is presented to OptaPlanner for solving. It contains a list of tasks, a list of users and the score which is used during solving. The score is of type BendableScore. This means that the number of score levels is set in the OptaPlanner configuration. In this example, there is one hard level which is used to check if every task is assigned to an eligible user. Then there are four soft levels. The first soft level is for high-priority tasks. The second one is for load balancing of tasks between users. The third soft level is for medium-priority tasks and the fourth, last level is for low-priority tasks. For the calculation of the score, the Drools score calculation has been chosen because of its simplicity and good performance. Further information about levels and rules is provided in section 5.3.3.

5.3.3 Score rules

Hard Constraint Rule

The first rule in the figure 5.5, called Role requirements, evaluates if every task is assigned to the right user, i.e. that no hard constraint is broken. To check that, the TaskPlanningEntity’s method isAssignmentFeasible() is called. This method checks if the assigned user is either among the potential owners of this task or one of the user’s groups is among the task’s potential group owners. In case the task is in the Reserved or InProgress stage, the actualOwner is not null and it must match the assigned user. If the constraint is broken, the method returns false and the hard level part of the score is decreased by one to indicate that one task is not assigned correctly. As said in chapter 4, OptaPlanner maximizes the score; therefore, a negative sign is added.
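The check described above might look roughly like the following sketch. Only the method name isAssignmentFeasible comes from the text; the parameter list, the use of plain strings for users and groups and the class name are assumptions, and the original method is an instance method of TaskPlanningEntity rather than a static helper.

```java
import java.util.Collections;
import java.util.Set;

/** Sketch of the hard constraint check (hypothetical signature). */
class FeasibilitySketch {

    static boolean isAssignmentFeasible(String assignedUser,
                                        Set<String> assignedUserGroups,
                                        Set<String> potentialUserOwners,
                                        Set<String> potentialGroupOwners,
                                        String actualOwner) {
        // A claimed or started task must stay with the user who claimed it.
        if (actualOwner != null) {
            return actualOwner.equals(assignedUser);
        }
        // Otherwise the user must be a potential owner directly ...
        if (potentialUserOwners.contains(assignedUser)) {
            return true;
        }
        // ... or through at least one of his or her groups.
        return !Collections.disjoint(assignedUserGroups, potentialGroupOwners);
    }
}
```

Each task for which this check returns false would then lower the hard score level by one.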

Figure 5.5: Role Requirements Score Rule Source: Author

Soft Constraint Rules

The first soft constraint rule, called High priority, ensures that high-priority tasks are planned first. It iterates over all assigned tasks with high priority and produces a negative sum of their end times. The earlier the high-priority tasks are placed in the chains, the higher this negative sum is (it is closer to 0), thus OptaPlanner prioritizes them before everything else. This rule is shown in the figure 5.6.

Figure 5.6: High Priority Score Rule Source: Author

The second soft constraint rule is based on a recommendation provided in OptaPlanner documentation. It uses a so-called squared workload implementation. This means that it iterates over the last tasks in the chains and for each of them a squared end time is subtracted. According to the documentation, this provides a good compromise between load balancing and minimizing the makespan of the tasks. This means that if there is a number of tasks with the same difficulty (the duration is the same) for each of two users, these tasks will be distributed evenly between them. However, if these tasks take more time for one of the users, then this rule tries to assign more tasks to the user who completes them quicker so the makespan is minimized. Moreover, a user who completes them slower is not so occupied with work he/she does slower and this user’s capacity is free for other tasks which suit him/her more. In general use cases, at least some of the tasks would be assigned to the less experienced worker as well. This rule is shown in the figure 5.7. The other possibility would be to compute variance or standard deviation, but as stated in the documentation, it is more difficult to implement and rarely brings any benefits.
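A small numeric sketch shows why squaring the end times favours an even distribution. The helper below is hypothetical and written in plain Java rather than as a Drools rule; it only reproduces the arithmetic of the squared workload idea.

```java
/** Arithmetic behind the squared workload rule (hypothetical helper). */
class SquaredWorkloadSketch {

    /**
     * Soft score contribution of the rule: for the last task of every
     * chain, the squared end time (in minutes) is subtracted.
     */
    static long softScore(int... chainEndTimes) {
        long score = 0;
        for (int endTime : chainEndTimes) {
            score -= (long) endTime * endTime;
        }
        return score;
    }
}
```

For four tasks that take 60 minutes for either of two users, an even split gives end times of 120 and 120 minutes and a score of -28800, while a 3:1 split gives 180 and 60 minutes and a score of -36000. Since OptaPlanner maximizes the score, the even split wins.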

Figure 5.7: Minimize Makespan Score Rule Source: Author

The last two rules are similar to the High priority rule and they simply compute the score for tasks with medium and low priority. The order of the score calculation is the same as the order of the presented rules. This means that OptaPlanner first ensures that no hard constraints are broken. After that it focuses on high-priority tasks, then on minimizing the makespan and finally on medium-priority and low-priority tasks. This order can be reconfigured quite easily by changing the score level index which the particular rule adds a value to.
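The effect of this ordering can be illustrated by comparing two scores level by level. The sketch below is plain Java with a hypothetical class name and represents a score as an array in the order hard, high priority, makespan, medium priority, low priority; it mimics the level-by-level comparison of a bendable score but is not the OptaPlanner class itself.

```java
/** Level-by-level comparison of two scores (hypothetical helper). */
class BendableScoreSketch {

    /**
     * Compares two scores with the same number of levels. The first level
     * in which they differ decides; later levels only break ties.
     * Returns a positive number if a is better, negative if b is better.
     */
    static int compare(int[] a, int[] b) {
        for (int level = 0; level < a.length; level++) {
            if (a[level] != b[level]) {
                return Integer.compare(a[level], b[level]);
            }
        }
        return 0;
    }
}
```

A score of {0, -500, -9000, 0, 0} is therefore better than {-1, 0, 0, 0, 0}: the second solution breaks one hard constraint, and no value on the soft levels can compensate for that.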

6 Prototype Application

6.1 Purpose and Used Technologies

The purpose of the prototype application is to easily visualize how the planner works and how it handles the different criteria expressed by the score rules. The application is implemented using JavaFX [42]. JavaFX is a Java platform for creating graphical desktop applications and a successor to the well-known Swing [43], which it is intended to replace in the long term. It provides many pre-built components, which make it suitable for quick prototyping, and it has a lot of improvements over the Swing toolkit too.

6.2 Implementation

In this section the implementation is described only briefly since the application itself is not the main part of the thesis. JavaFX uses the MVC (Model-View-Controller) [44] pattern, so it clearly separates the domain (model), the graphical interface (view) and the application logic (controller). In JavaFX, a graphical interface is defined by FXML files, which are XML files that use the FXML language. This view representation is then connected with application logic by specifying a controller, which is a Java class providing application logic such as button events, updating labels or displaying dialog windows in case there is some validation error. The application is run from within the MainApp class and all communication between controllers is provided through it.

6.2.1 RootLayout and SolutionController

The main FXML file is called RootLayout and provides the basic layout for the whole application. The controller for this layout is SolutionController, which handles all planning performed by OptaPlanner and updates the task overview in case a new best solution is found for the task assignment. This task overview is visualized by the class TaskAnchorPane, which is a customized version of the standard AnchorPane provided in JavaFX. Since there is no pre-defined component for task assignment, a custom one had to be implemented as a Java class and thus it does not have an associated FXML file.

6.2.2 CaseOverview and CaseController

CaseOverview FXML is used for loading scenarios and thus creating tasks and inserting users into a database. Moreover, once the scenario is loaded, a new custom task can be created. More information about these scenarios is presented in the section 6.3. CaseController serves as a controller for the overview and it uses the Services class, which encapsulates all services needed to run jBPM engine and to insert users. This includes starting cases, which produces tasks, cancelling cases when a new scenario is being loaded, parsing JSON files with users and inserting them into a database.

6.2.3 Dialogs and Dialog Controllers

Finally, the application uses dialogs and dialog controllers either to just display information or to allow a user to insert some data. The CustomTaskDialog FXML and CustomTaskDialogController pair handles the creation of a new task once a scenario is loaded. TaskDialog FXML together with TaskDialogController visualizes the task’s information once the task is clicked. This dialog also provides a way to claim the task and complete it. Lastly, UserDialog FXML and UserDialogController are used to show the user’s information like skill levels or the groups this user belongs to.

6.3 Scenarios

The application provides three scenarios, Small, Medium and Large, which are used for convenience so a user does not need to manually create a lot of tasks in the beginning. This section describes these scenarios in terms of a number of tasks and users and presents constraints like needed skill, base duration or case roles of the tasks.


6.3.1 Small Scenario

The Small scenario consists of 5 users and 40 tasks. Users, their skill levels and their groups may be found in the file users.json. Tasks are created by starting two case definitions, namely TasksToAssign.bpmn2 and TasksToAssign2.bpmn2. Each of them consists of 2 tasks. The first case definition contains the tasks Accept Order and Deliver Goods while the second case definition has the tasks Review Docs and Prepare Docs. Every case definition is started 10 times, so in the end 2 x 2 x 10 = 40 tasks are created. Table 6.1 shows the task details. Duration is shown in minutes and group case roles are presented in plural form, e.g. supplier/suppliers.

Task name       Skill needed     Case roles            Duration   Priority
Accept Order    management       manager               30         LOW
Deliver Goods   delivering       supplier, suppliers   60         MEDIUM
Review Docs     management       managers              45         LOW
Prepare Docs    administration   officeEmployees       40         HIGH

Table 6.1: Small scenario tasks

6.3.2 Medium Scenario

The Medium scenario has 10 users and 70 tasks. It contains all users and tasks from the Small scenario and adds another 5 users and 30 tasks on top of that. These 5 users are defined in the file users2.json while the tasks are created by starting one additional case definition, TasksToAssign3.bpmn2, 10 times. The case definition includes 3 tasks which are presented in the table 6.2.

Task name          Skill needed     Case roles     Duration   Priority
Consult Contract   management       managers       50         MEDIUM
Calculate Offer    calculation      economists     55         HIGH
Approve Contract   riskManagement   headManager    30         MEDIUM

Table 6.2: Medium scenario tasks

6.3.3 Large Scenario

The Large scenario includes 20 users and 150 tasks. It again includes all users and tasks from the Medium scenario. In addition, it provides 10 other users which are specified in the file users3.json. In terms of tasks, a new case definition, namely TasksToAssign4.bpmn2, with 4 tasks is started 20 times. This adds 80 tasks to the 70 tasks of the Medium scenario, which means that the Large scenario contains 150 tasks. Details of these tasks are displayed in the table 6.3.

Task name                Skill needed      Case roles   Duration   Priority
Improve PR               publicRelations   marketers    60         LOW
Marketing Presentation   presentation      marketers    90         MEDIUM
Design Logo              creativitiy       designers    60         MEDIUM
Organize Conference      organization      organizers   70         LOW

Table 6.3: Large scenario tasks

6.4 User Guide

This section presents a brief user guide for the prototype application. The application is started by completing the following steps:

1. Download the source code from GitHub: https://github.com/MarianMacik/jbpm-optaplanner-taskassignment

2. Using Maven, run the following command from the root directory of the project (the directory with the pom.xml file): mvn clean install -DskipTests

3. Using the standard java command, start the application by running: java -Dorg.jbpm.ht.callback=db -jar ./target/taskassignment-1.0.0-SNAPSHOT.jar

After the application is started, a user can load one of the three scenarios by clicking one of the three buttons on the left side of the application window. When a scenario is loaded, solving may be started by clicking the start button. The progress of solving can be followed either visually, as the users’ chains of tasks are populated, or by watching the improving score at the bottom, which is refreshed when a new best solution is found.

When the solving is stopped, the user is able to look at each task by clicking on it. Various information like the task’s potential owners, groups or base duration is shown in the displayed dialog. It is also possible to claim or even complete the task from within the dialog. When the task is claimed, a lock icon is displayed on the task in the task overview. After the task is completed, it is simply removed from the task list and the user may start solving again to see if there is a new best solution of the task assignment because of that completed task.

Moreover, a new task might be created by clicking on the Create Custom Task button. In the dialog, the user is able to specify the task name, the skill which is needed, the base duration ranging from 30 to 120 minutes, the priority, and the potential users and potential groups of this task. A task name must be entered and at least one potential owner or group must be selected. Finally, the number of tasks to create, ranging from 1 to 10, is specified. When the Create button is clicked, a new case based on the case definition OneTask.bpmn2 is started. This case definition contains only one task and all mentioned parameters are passed to this particular task. In case more than one custom task has to be created, the same case definition is started multiple times. After these tasks are created, they are uninitialized, thus they are shown at the bottom of the task overview.


The last feature of the application is the user info. By clicking on a user when the solving is stopped, information such as the user’s groups or skill levels is presented.

The main window of the application is shown in the figure 6.1. Notice the colours of the tasks: red for high priority, yellow for medium priority and green for low priority. It can be clearly seen that high-priority tasks are first, then tasks with medium and low priorities follow. Moreover, the real duration of a task is represented by its size. Time labels are provided at the top for convenience; one work day lasts 10 hours, from 8:00 to 18:00.

Figure 6.1: Prototype Application Source: Author

7 Benchmarks

7.1 OptaPlanner Benchmark

OptaPlanner provides an automated benchmarking tool, which is implemented in the optaplanner-benchmark module. It is most useful when it is unclear which construction heuristic or metaheuristic should be used, and for advanced performance tweaking once the best metaheuristic has been found.

Its configuration is simple and is done via an XML file, similarly to the planner configuration. In addition to the basic planner configuration, such as which classes form the planner's domain, the user also has to specify the solutions which are to be run and the heuristics they will be run with. There are optional parameters too, for example the duration of the JVM (Java Virtual Machine) runtime warm-up or the metric on which the benchmark should focus, e.g. best score or score calculation speed. After the test is finished, the benchmark evaluates the results and automatically decides which of the tested heuristics is the best according to the chosen metric. It also generates an HTML report with various graphs and tables. The report is written to the benchmarks directory, which is created in the root directory of the project after the test is finished.

The solution to benchmark may be stored in many formats. The most common one is XML (usually read by the XStream [45] or JAXB [46] parsers), but the solution may also be read directly from a database if it is saved there. Furthermore, the SolutionFileIO interface is provided so that the user may implement a custom solution reader. Apart from reading, writing of the solved solutions is supported too, but this is used much less often.

Input solutions in XML format read by the XStream parser were chosen for this thesis. These solutions are identical to the Small, Medium and Large scenarios used in the prototype application. At first, the construction heuristics, namely First Fit and First Fit Decreasing, were tested. After that, the local search metaheuristics presented in this thesis were tested too. The tests were run for 1 minute with 30 seconds of JVM warm-up. Because these scenarios are not big, this time should be enough for the benchmark to show which local search is the best, at least for this data. The specifications of the laptop on which these tests were run are:

∙ CPU: Dual-core Intel Core i7-4600U @ 2.1 GHz, max turbo frequency 3.3 GHz

∙ RAM: 12 GB DDR3 @ 1600 MHz

∙ Storage: SATA III SSD 256 GB

∙ Environment: Fedora 20, Java 8 Update 111, Maven 3.3.9

Tests may be started by running a separate Maven profile:

mvn clean install -PBenchmarkTest

Note: The second level of the soft score (minimize makespan) differed the most between the heuristics and in fact influenced the benchmark results the most.
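The role of a single score level can be illustrated by how multi-level scores are compared. OptaPlanner compares scores level by level, with earlier levels taking precedence and a higher value being better. The sketch below reproduces only this ordering on plain integer arrays; the class name and the sample values are hypothetical and do not come from the benchmark.

```java
/** Hypothetical helper: lexicographic comparison of multi-level scores. */
final class LeveledScore {
    private LeveledScore() {
    }

    /**
     * Compares two scores level by level. Earlier levels dominate later ones
     * and a higher value is better. Returns a negative number if a is worse
     * than b, zero if they are equal and a positive number if a is better.
     */
    static int compare(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("scores must have the same number of levels");
        }
        for (int level = 0; level < a.length; level++) {
            int result = Integer.compare(a[level], b[level]);
            if (result != 0) {
                return result; // the first differing level decides
            }
        }
        return 0;
    }
}
```

For example, {0, -5, -300} is worse than {0, -5, -250}: the first two levels are tied, so the third one decides. When most heuristics reach the same value on the earlier levels, the first level on which they differ determines the ranking, which matches the observation in the note above.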

7.2 Construction Heuristics Benchmark

The results of the construction heuristics benchmark are shown in the table in Figure 7.1 (1st place is marked with 0, 2nd with 1). At least for this test data, the better construction heuristic is the simple First Fit. For First Fit Decreasing, the TaskDifficultyComparator has been implemented. It compares tasks first by their priority, then by their base duration and finally by their id, which is uniquely assigned by the jBPM engine. Although First Fit Decreasing performs worse in this example, it might be useful with different datasets.
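The comparison order described above can be sketched as follows. This is not the thesis's TaskDifficultyComparator but a minimal stand-in, assuming that a numerically higher priority and a longer base duration both make a task more difficult; TaskSketch and its fields are hypothetical.

```java
import java.util.Comparator;

/** Hypothetical, simplified task with only the fields used for sorting. */
final class TaskSketch {
    final long id;
    final int priority;            // assumption: higher number = more important
    final int baseDurationMinutes;

    TaskSketch(long id, int priority, int baseDurationMinutes) {
        this.id = id;
        this.priority = priority;
        this.baseDurationMinutes = baseDurationMinutes;
    }
}

/** Orders tasks by priority, then base duration, then id (a unique tie-breaker). */
final class TaskDifficultyComparatorSketch implements Comparator<TaskSketch> {
    private static final Comparator<TaskSketch> ORDER = Comparator
            .comparingInt((TaskSketch t) -> t.priority)
            .thenComparingInt(t -> t.baseDurationMinutes)
            .thenComparingLong(t -> t.id);

    @Override
    public int compare(TaskSketch a, TaskSketch b) {
        return ORDER.compare(a, b);
    }
}
```

The id comes last so that no two distinct tasks compare as equal, which keeps the ordering deterministic between runs.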

7.3 Metaheuristics Benchmark

Figure 7.1: Results of Construction Heuristic Benchmarks Source: Author

The metaheuristics benchmark results are presented in the table in Figure 7.2 (1st place is marked with 0). The best metaheuristics are clearly Hill Climbing (Simple Local Search) and Tabu Search, followed by Late Acceptance and Simulated Annealing. Simulated Annealing always performed much worse than the other metaheuristics during the tests, no matter how its initial temperature was set. It is the only metaheuristic which has a mandatory parameter. The reason is that it is still considered quite experimental and, according to the OptaPlanner documentation, it is very difficult to set a general value which suits most use cases. The other metaheuristics were run with the default parameter values, which are described in the documentation as well.
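Why the starting temperature is hard to choose can be seen from the textbook acceptance rule of Simulated Annealing. The sketch below shows the generic Metropolis criterion on a single numeric score; it is not OptaPlanner's implementation, and the class and method names are hypothetical.

```java
import java.util.Random;

/** Hypothetical helper: textbook acceptance rule of Simulated Annealing. */
final class AnnealingAcceptanceSketch {
    private AnnealingAcceptanceSketch() {
    }

    /**
     * Probability of accepting a move. scoreDelta is the new score minus the
     * current score, so a negative delta means the move makes things worse.
     */
    static double acceptanceProbability(double scoreDelta, double temperature) {
        if (scoreDelta >= 0) {
            return 1.0; // improving or equal moves are always accepted
        }
        if (temperature <= 0) {
            return 0.0; // no temperature left: behaves like hill climbing
        }
        return Math.exp(scoreDelta / temperature);
    }

    /** Draws a random number to decide whether the move is accepted. */
    static boolean accept(double scoreDelta, double temperature, Random random) {
        return random.nextDouble() < acceptanceProbability(scoreDelta, temperature);
    }
}
```

With this rule, a temperature that is large compared to typical score differences accepts almost every worsening move, while a very small one accepts almost none. A suitable value therefore depends on the magnitude of the score in the particular use case.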

Figure 7.2: Results of Metaheuristics Benchmarks Source: Author


Figure 7.3 shows how the second-level score improved during one minute of solving (score levels are indexed from 0). Hill Climbing and Tabu Search clearly find the best score much more quickly than the Late Acceptance metaheuristic; this score is likely to be optimal, because it does not improve further within the one-minute run. In addition, Simulated Annealing does not seem to be a good heuristic for this use case, since its score does not improve over the whole minute, and it therefore performs much worse than the other three heuristics.

Figure 7.3: Score Progress During Time Source: Author

8 Issues Experienced and Further Improvements

8.1 Issues Experienced during Implementation

This section summarizes the issues which were experienced during the implementation of either the prototype application or the module itself.

Probably the biggest issue was the jBPM engine configuration, which took a lot of time. Most of it was consumed by correctly registering the users and groups callback, which had to be done both via the deployment descriptor of the deployable archive with case definitions and via the created jBPM services, which was somewhat unintuitive. A similar issue was connected with PushTaskEventListener: because it had to be configured with MVEL, it was not possible to pass a particular Java object to its constructor. The solution was to pass a static field, which MVEL supports. Although this works, it is not a clean approach in terms of object-oriented programming.

On the other hand, the OptaPlanner configuration was more straightforward, mostly because of the well-written documentation. The only bigger issue was with the shadow variable listener, but the root cause, a wrong configuration of the shadow variable, was quickly found thanks to the meaningful exception messages of OptaPlanner.

In the case of JavaFX and the prototype application, there were not many issues, but since the author was familiar only with Swing, additional time was needed to learn the new platform. In the end, the author found that JavaFX provides a much more convenient and understandable concept of building desktop applications in Java than the older Swing platform.
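The static-field workaround mentioned for PushTaskEventListener can be sketched in plain Java. The types below are hypothetical stand-ins and not the thesis's classes; the sketch only assumes that the listener is created from an expression which can read static fields but cannot receive an existing object instance directly.

```java
import java.util.Objects;

/** Hypothetical stand-in for the object the listener has to notify. */
interface TaskChangeSink {
    void taskChanged(String taskName);
}

/** Hypothetical holder: the application sets the field before deployment. */
final class SinkHolder {
    static volatile TaskChangeSink sink;

    private SinkHolder() {
    }
}

/** Hypothetical listener which receives its dependency through the constructor. */
final class PushListenerSketch {
    private final TaskChangeSink sink;

    PushListenerSketch(TaskChangeSink sink) {
        this.sink = Objects.requireNonNull(sink, "SinkHolder.sink must be set first");
    }

    /** Forwards a task event to the configured sink. */
    void onTaskAdded(String taskName) {
        sink.taskChanged(taskName);
    }
}
```

An expression of the form new PushListenerSketch(SinkHolder.sink) can then be evaluated without any reference to a local object. The drawback noted above remains: the static field is global mutable state, so it must be set before the expression is evaluated and is shared by every listener created this way.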

8.2 Further Improvements

The thesis provides a good basis for further improvements. The biggest improvement would be real-time planning, which means that OptaPlanner would not need to be interrupted by the user when new tasks are created or their states are changed.


Moreover, the number of task parameters which are considered during planning may be increased. For example, each user's skill level could be raised dynamically after a defined number of tasks has been completed. This would probably reflect reality better, and an administrator would not need to alter skill levels every couple of weeks.

Finally, the OptaPlanner configuration, mostly the local search heuristics and their parameters, might be tuned by running the solver on real-world data. This would probably improve the performance, but such tuning is very time-consuming and was beyond the scope of the thesis.

9 Conclusion

The main aim of this thesis was to implement a resource management module which would use the jBPM engine and OptaPlanner for automated task assignment inside case management.

The first part of the thesis focused on theoretical knowledge regarding business process management and case management. Both the similarities and the differences between them were presented, including the CMMN and BPMN specifications. This was done in order to clearly differentiate these two techniques and notations, since case management is rather new (initial specification from 2014 [3]) and is often considered a synonym for business process management.

The next two parts presented the tools whose integration was the main part of the thesis, i.e. the jBPM engine and OptaPlanner. The jBPM engine part focused on tasks and their lifecycle, while the OptaPlanner part mostly explained how this optimization software works and what techniques it uses. Finally, each of them was compared with its alternatives.

The subsequent part described the analysis, design and implementation of the resource management module and the prototype application. After that, the benchmark results were commented on and visualized by tables and a graph. The very last part of the thesis summarized the most notable issues experienced during the implementation and listed possible future improvements.

The implementation of the module and the prototype application was successful, and all requirements presented at the beginning of the thesis were fulfilled. It may serve as a good basis for further integration of jBPM and OptaPlanner in the community.

A Zip Archive

The attached Zip archive contains the source code of the resource management module and the prototype application, as well as all resources such as the BPMN 2.0 case definitions and the JSON files used to populate a database with users. The sources may also be downloaded from GitHub at this link:

https://github.com/MarianMacik/jbpm-optaplanner-taskassignment

Bibliography

1. jBPM - Open Source Business Process Management - Process engine [online]. USA: KIE Group, 2016 [visited on 2016-11-14]. Available from: http://jbpm.org.
2. OptaPlanner - Constraint satisfaction solver (Java™, Open Source) [online]. USA: KIE Group, 2016 [visited on 2016-11-14]. Available from: http://www.optaplanner.org.
3. Case Management Model And Notation [online]. USA: Object Management Group, 2016 [visited on 2016-12-12]. Available from: http://www.omg.org/spec/CMMN/.
4. Definition of Case Management [online]. USA: Case Management Society of America, 2016 [visited on 2016-08-09]. Available from: http://www.cmsa.org/Home/CMSA/WhatisaCaseManager/tabid/224/Default.aspx.
5. Definition and Philosophy of Case Management [online]. USA: Commission for Case Manager Certification, 2016 [visited on 2016-08-09]. Available from: https://ccmcertification.org/about-us/about-case-management/definition-and-philosophy-case-management.
6. What is the Difference Between Case Management and Traditional BPM? [online]. USA: Keith Swenson, Chairman of the Workflow Management Coalition, 2010 [visited on 2016-09-10]. Available from: https://social-biz.org/2010/06/16/what-is-the-difference-between-case-management-and-bpm/.
7. What is BPM? [online]. USA: Workflow Management Coalition, 2016 [visited on 2016-09-07]. Available from: http://wfmc.org/what-is-bpm.
8. What is BPM? [online]. USA: BPM.com, 2014 [visited on 2016-09-07]. Available from: http://bpm.com/what-is-bpm.
9. SCHEER, August-Wilhelm; SCHEEL, Henrik von; ROSING, Mark von. The Complete Business Process Handbook: Body of Knowledge from Process Modeling to BPM, Volume 1. Waltham, USA: Morgan Kaufmann, 2014. ISBN 9780127999593.


10. About OMG [online]. USA: Object Management Group, 2015 [visited on 2016-09-10]. Available from: http://www.omg.org/gettingstarted/gettingstartedindex.htm.
11. OMG Members [online]. USA: Object Management Group, 2016 [visited on 2016-09-10]. Available from: http://www.omg.org/cgi-bin/apps/membersearch.pl.
12. Business Process Model And Notation [online]. USA: Object Management Group, 2016 [visited on 2016-12-12]. Available from: http://www.omg.org/spec/BPMN/.
13. The Free Medical Dictionary [online]. USA: Farlex, Inc, 2016 [visited on 2016-12-11]. Available from: http://medical-dictionary.thefreedictionary.com/case.
14. The Free Legal Dictionary [online]. USA: Farlex, Inc, 2016 [visited on 2016-12-11]. Available from: http://legal-dictionary.thefreedictionary.com/case.
15. Adaptive Case Management [online]. Germany: Zambrovski Simon, 2014 [visited on 2016-12-18]. Available from: http://de.slideshare.net/zambrovski/acm-camunda-slide-share.
16. DAVENPORT, Thomas. Process Innovation: Reengineering work through information technology. Boston, USA: Harvard Business Review Press, 1992. ISBN 978-0875843667.
17. HAMMER, Michael; CHAMPY, James. Reengineering the Corporation: A Manifesto for Business Revolution (Collins Business Essentials). HarperBusiness, 2006. ISBN 0060559535.
18. KIE [online]. USA: KIE Group, 2016 [visited on 2016-11-14]. Available from: http://www.kiegroup.org.
19. Red Hat JBoss BPM Suite - business process management software [online]. USA: Red Hat, Inc., 2016 [visited on 2016-11-15]. Available from: https://www.redhat.com/en/technologies/jboss-middleware/bpm.
20. jBPM - Open Source Business Process Management - Documentation [online]. USA: KIE Group, 2016 [visited on 2016-11-14]. Available from: http://jbpm.org/learn/documentation.html.


21. Workflow Automation with Java and BPMN 2.0 [online]. USA: Camunda, 2016 [visited on 2016-11-18]. Available from: https://camunda.com.
22. IBM - Software - IBM Business Process Manager [online]. USA: IBM, 2016 [visited on 2016-11-18]. Available from: http://www-03.ibm.com/software/products/en/business-process-manager-family.
23. IBM - Case management software - Case Manager [online]. USA: IBM, 2016 [visited on 2016-11-18]. Available from: http://www-03.ibm.com/software/products/en/casemana.
24. Review: IBM BPM - the best commercial BPM software package [online]. USA: TrustRadius, 2013 [visited on 2016-11-18]. Available from: https://www.trustradius.com/reviews/ibm-business-process-manager-2013-07-15-11-47-59.
25. IBM ECM Review: Everything is a Case. Use IBM Case Manager [online]. USA: TrustRadius, 2014 [visited on 2016-11-18]. Available from: https://www.trustradius.com/reviews/ibm-enterprise-content-management-2014-09-26-07-55-01.
26. Business Process Management Suite - Features & Benefits | Oracle [online]. USA: Oracle, 2016 [visited on 2016-11-19]. Available from: http://www.oracle.com/us/technologies/bpm/suite/features/index.html.
27. Oracle BPM Suite Reviews & Ratings | TrustRadius [online]. USA: TrustRadius, 2016 [visited on 2016-11-19]. Available from: https://www.trustradius.com/products/oracle-bpm-suite/reviews.
28. Workflow Automation with Java and BPMN 2.0 [online]. USA: Camunda, 2016 [visited on 2016-11-18]. Available from: https://camunda.com/about/about-us/.
29. Workflow Automation with Java and BPMN 2.0 [online]. USA: Camunda, 2016 [visited on 2016-11-18]. Available from: https://camunda.com/bpm/enterprise/.
30. OptaPlanner - Documentation [online]. USA: KIE Group, 2016 [visited on 2016-11-14]. Available from: http://www.optaplanner.org/learn/documentation.html.
31. KORTE, Bernhard; VYGEN, Jens. Combinatorial Optimization: Theory and Algorithms. USA: Springer, 2008. ISBN 3642090923.


32. SPALL, James C. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control. Wiley-Interscience, 2003. ISBN 0471330523.
33. BLUM, C.; ROLI, A. Metaheuristics in combinatorial optimization: Overview and conceptual comparison [online] [visited on 2016-10-12]. Available from: http://dl.acm.org/citation.cfm?id=937505.
34. LocalSolver : Home [online]. France: Innovation 24, 2016 [visited on 2016-11-25]. Available from: http://www.localsolver.com/home.html.
35. Gurobi Optimization - The Best Mathematical Programming Solver [online]. USA: Gurobi, 2016 [visited on 2016-11-25]. Available from: http://www.gurobi.com/index.
36. IBM ILOG CPLEX Optimization Studio [online]. USA: IBM, 2016 [visited on 2016-11-25]. Available from: http://www-03.ibm.com/software/products/en/ibmilogcpleoptistud.
37. Google Optimization Tools [online]. USA: Google, 2016 [visited on 2016-11-25]. Available from: https://developers.google.com/optimization/.
38. Choco solver, a CP library [online]. USA: Choco, 2016 [visited on 2016-11-25]. Available from: http://www.choco-solver.org.
39. Computational Infrastructure for Operations Research [online]. USA: COIN-OR, 2016 [visited on 2016-11-25]. Available from: https://www.coin-or.org.
40. bitronix/btm: JTA Transaction Manager [online]. USA: Bitronix, 2016 [visited on 2016-11-27]. Available from: https://github.com/bitronix/btm.
41. Hibernate. Everything data. [online]. USA: Red Hat, Inc., 2016 [visited on 2016-11-27]. Available from: http://hibernate.org.
42. JavaFX Developer Home [online]. USA: Oracle, 2016 [visited on 2016-11-29]. Available from: http://www.oracle.com/technetwork/java/javase/overview/javafx-overview-2158620.html.


43. Lesson: Getting Started with Swing [online]. USA: Oracle, 2015 [visited on 2016-11-29]. Available from: http://docs.oracle.com/javase/tutorial/uiswing/start/.
44. MVC Architecture [online]. USA: Google, 2016 [visited on 2016-11-29]. Available from: https://developer.chrome.com/apps/app_frameworks#mvc.
45. XStream - About XStream [online]. USA: Xstream, 2016 [visited on 2016-12-20]. Available from: http://x-stream.github.io/index.html.
46. Java Architecture for XML Binding (JAXB) [online]. USA: Oracle, 2003 [visited on 2016-12-20]. Available from: http://www.oracle.com/technetwork/articles/javase/index-140168.html.
