The Pennsylvania State University

The Graduate School

College of Information Sciences and Technology

MANAGING AND LEVERAGING ACTION KNOWLEDGE:

THE CASE OF FRONT-LINE OPERATORS

IN THE PETROCHEMICAL INDUSTRY

A Dissertation in

Information Sciences and Technology

by

Jingwen He

 2015 Jingwen He

Submitted in Partial Fulfillment

of the Requirements for the Degree of

Doctor of Philosophy

May 2015

The dissertation of Jingwen He was reviewed and approved* by the following:

Sandeep Purao
Professor of Information Sciences and Technology
Dissertation Advisor
Chair of Committee

Eileen M. Trauth
Professor of Information Sciences and Technology

Heng Xu
Associate Professor of Information Sciences and Technology

Timothy W. Simpson
Professor of Mechanical Professor of

Carleen Maitland
Interim Associate Dean of Undergraduate and Graduate Education

*Signatures are on file in the Graduate School

ABSTRACT

In the petrochemical industry, improvements in the technology and in production processes over the last few decades have led to a reduction in accidents. However, the resulting processes have also created the need for large numbers of highly skilled operators. The anticipated wave of retirement of operators in the next few years means that the industry will experience a significant loss of senior expert operators. Establishing ways to effectively leverage and manage knowledge about refinery operations is, therefore, a critical concern. This research conceptualizes knowledge that operators in the petrochemical industry possess as “action knowledge,” that is, as a basis for action—both as tacitly stored in operators’ heads and explicitly written as codified procedures; and examines the problem as well as proposes solutions following this conceptualization. The research reported in this dissertation is organized in three essays.

Essay 1 develops and describes an innovative approach and a software tool for analyzing operator procedures into chunks of explicit action knowledge. This essay uses design science as the research method. Essay 2 evaluates the feasibility and effectiveness of the proposed approach through an empirical evaluation of the software tool, using authentic procedures obtained from multiple refineries and feedback from expert operators. Essay 3 develops a new framework to understand tacit action knowledge in the context of operators working in the petrochemical industry. Essay 3 uses a modified iterative grounded research methodology to analyze work practice descriptions gathered from operators following the critical incident technique. The dissertation concludes with a discussion of findings across the three essays for managing action knowledge in the petrochemical industry, and of contributions to the stream of research on knowledge, along with directions for future work.


TABLE OF CONTENTS

List of Figures ...... vii

List of Tables ...... x

Acknowledgements ...... xiii

Chapter 1 Introduction and Motivation ...... 1

1.1 Managing Operational Knowledge in Process Industries ...... 1
1.2 Research Setting: Petrochemical Industry ...... 2
1.3 Research Conceptualizations ...... 4
1.4 Research Questions ...... 5
1.5 Structure of this Thesis ...... 6

Chapter 2 Literature Review ...... 8

2.1 Knowledge ...... 8
  2.1.1 Definitions and Classifications of Knowledge ...... 9
  2.1.2 Procedural (Action) and Declarative Knowledge ...... 11
  2.1.3 Explicit Knowledge and Tacit Knowledge ...... 16
  2.1.4 Individual Knowledge and Organizational Knowledge ...... 18
  2.1.5 Connecting Knowledge Classifications to the Petrochemical Industry ...... 21
2.2 Knowledge Management ...... 24
  2.2.1 Definition and Importance of Knowledge Management ...... 25
  2.2.2 Knowledge Management Strategies and Technologies ...... 27
  2.2.3 Managing (Procedural) Action Knowledge ...... 31
  2.2.4 Managing Explicit Knowledge and Tacit Knowledge ...... 34
  2.2.5 Managing Individual Knowledge and Organizational Knowledge ...... 36
2.3 Conclusion ...... 40

Chapter 3 Research Questions ...... 42

3.1 Key Conceptualizations ...... 42
3.2 Research Scope ...... 44
3.3 Research Questions ...... 46

Chapter 4 Research Methodology ...... 49

4.1 Research Methodology for Essay 1 ...... 49
4.2 Research Methodology for Essay 2 ...... 51
4.3 Research Methodology for Essay 3 ...... 53

Chapter 5 Essay 1 – Semantic Procedure Analyzer: Extracting and Chunking Action Knowledge from Operator Procedures ...... 56

Chapter Organization ...... 56
5.0 Précis ...... 57
5.1 Introduction and Motivation ...... 58
5.2 Context and Prior Research ...... 61
  5.2.1 Procedures in the Petrochemical Refining Industry ...... 61
  5.2.2 Knowledge and Knowledge Management ...... 63
  5.2.3 An Action View of Knowledge ...... 65
  5.2.4 Chunking as a Strategy for Managing Action Knowledge ...... 70
5.3 Research Method and ...... 71
  5.3.1 Selection of the Research Method ...... 72
  5.3.2 Iterative Approach ...... 73
  5.3.3 Iteration 1 – Multi-Phase Approach ...... 76
  5.3.4 Iteration 2 – Incorporating Learning ...... 77
  5.3.5 Iteration 3 – Features of the Training Data and the Visual Display of Output ...... 78
5.4 A for Extracting and Chunking Action Knowledge ...... 80
  5.4.1 Purpose and Scope – Extracting and Chunking Action Knowledge ...... 81
  5.4.2 Justificatory Knowledge ...... 82
  5.4.3 Solution Suggestion – A Heuristic Approach to Extraction and Chunking ...... 84
  5.4.4 Approach Mutability – Learning Mechanisms ...... 89
  5.4.5 Expository Instantiation ...... 94
5.5 Conclusions ...... 96

Chapter 6 Essay 2 – A Evaluation of the Semantic Procedure Analyzer ...... 100

Chapter Organization ...... 100
6.0 Précis ...... 100
6.1 Introduction and Motivation ...... 101
6.2 Evaluation Design ...... 105
6.3 Operating Procedures as Inputs to Evaluation ...... 109
6.4 Descriptive ...... 111
6.5 Evaluation by a Panel of Experts ...... 115
  6.5.1 Evaluation Process with the Expert Panel ...... 116
  6.5.2 Results from the Expert Panel Evaluation ...... 116
6.6 Evaluating the Learning Mechanisms in Phase 2 ...... 119
  6.6.1 Evaluation Process ...... 119
  6.6.2 Analysis of Results ...... 123
  6.6.3 Error Analysis ...... 129
6.7 Evaluation of Chunking Outcomes Suggested by the Heuristics in Phase 3 ...... 136
  6.7.1 Evaluation Process ...... 137
  6.7.2 Analysis of Errors ...... 140
  6.7.3 Comparing Naïve-approach Chunks and Heuristic-suggested Chunks ...... 144
6.8 Discussions and Conclusions ...... 149

Chapter 7 Essay 3 – Operator Strategies to Use Action Knowledge in Support of Tasks ...... 151

Chapter Organization ...... 151
7.0 Précis ...... 151
7.1 Introduction and Motivation ...... 152
7.2 Research Setting ...... 154
7.3 Prior Research ...... 157
  7.3.1 Action Knowledge ...... 158
  7.3.2 Operator Behaviors and Human Errors ...... 159
7.4 Research Methodology ...... 163
  7.4.1 Rationale ...... 163
  7.4.2 Approaches to Conduct Grounded Theory Development ...... 163
7.5 Research Process ...... 165
  7.5.1 Research design ...... 165
  7.5.2 Data collection ...... 168
  7.5.3 Data analysis ...... 172
  7.5.5 Iterations to generate theory ...... 181
7.6 Discussion of Findings ...... 195
  7.6.1 Summary of Findings ...... 195
  7.6.3 Reflections on the grounded theory method ...... 199
7.7 Conclusions ...... 201

Chapter 8 Concluding Remarks ...... 203

8.1 Returning to the Research Question(s) ...... 203
8.2 Contributions ...... 205
8.3 Limitations ...... 207
8.4 Implications for Future Research ...... 209
8.5 Implications for Use in Practice ...... 211
  8.5.1 For Petrochemical Industry ...... 211
  8.5.2 For Other Process Industries ...... 214

Reference ...... 215

Appendix A Notations of Procedure Chunking Heuristics ...... 227

Appendix B SPA User Interface ...... 228

Appendix C Recruitment Script ...... 235

Appendix D Interview Protocol ...... 236

Appendix E Example of Interview Transcript Fragment ...... 239

Appendix F Core Categories and Sub-Categories of Strategy to Apply Action Knowledge ...... 242

LIST OF FIGURES

Figure 1-1. U.S. job-related nonfatal injuries and illnesses: 1998–2007 (per 100 full-time workers) (Wolf, 2001)...... 3

Figure 1-2. Immediate causes of accidents in the petrochemical industry for the period 1985–2002 (Nivolianitou et al., 2006)...... 4

Figure 2-1. SECI model of the knowledge conversion (Nonaka, 1994)...... 18

Figure 2-2. Relationship between individual and organizational knowledge (Bhatt, 2002). ... 20

Figure 2-3. Applying knowledge classifications in the petrochemical industry domain...... 23

Figure 2-4. Understanding knowledge management processes from organizational strategy and information technology perspectives...... 28

Figure 3-1. Research scope...... 45

Figure 3-2. Three essays in this research study...... 48

Figure 3-1. Research scope...... 50

Figure 5-1. Research objective for Essay 1...... 58

Figure 5-2. An excerpt from a procedure at a refinery...... 62

Figure 5-3. Precedence matrix...... 75

Figure 5-4. SPA structure in Iteration 1...... 76

Figure 5-5. SPA structure in Iteration 2 (new or changed phases shaded)...... 78

Figure 5-6. SPA for collecting background information and selecting a training plan...... 79

Figure 5-7. Network display of action knowledge chunks with SPA...... 79

Figure 5-8. A heuristic approach to extracting action knowledge from procedures and chunking it...... 85

Figure 5-9. Network display of action knowledge chunks with SPA...... 89

Figure 5-10. One life cycle of the learning process for Phase Two...... 91

Figure 5-11. Learning mechanism for Phase 2...... 92

Figure 5-12. Scope of learning...... 94

Figure 5-13. of SPA...... 95

Figure 5-14. Interface of SPA in the final phrase...... 95

Figure 6-1. The three-phase approach to extracting and chunking action knowledge...... 103

Figure 6-2. The SPA Interface in three phases...... 104

Figure 6-3. Evaluation efforts for the pre-designed action knowledge chunking approach. ... 108

Figure 6-4. Screenshot of an example procedure provided by an industry partner...... 110

Figure 6-5. Evaluation a procedure of SPA with learning mechanisms...... 123

Figure 6-6A. A comparison of the accuracy rate of SPA with learning mechanisms and SPA without learning mechanisms for training data...... 127

Figure 6-6B. A comparison of the accuracy rate of SPA with learning mechanisms and SPA without learning mechanisms for testing data...... 128

Figure 6-7A. A comparison of the parsing error rates of SPA with learning mechanisms and SPA without learning mechanisms for training data...... 132

Figure 6-7B. A comparison of the parsing error rates of SPA with learning mechanisms and SPA without learning mechanisms for testing data...... 133

Figure 6-8A. A comparison of tagging errors of SPA with learning mechanisms and SPA without learning mechanisms for training data...... 135

Figure 6-8B. A comparison of tagging errors of SPA with learning mechanisms and SPA without learning mechanisms for testing data...... 135

Figure 7-1. Research objective for Essay 3...... 152

Figure 7-2. Field and console operators in the refineries...... 155

Figure 7-3. The setting for the study: a petrochemical refinery...... 156

Figure 7-4. Research gaps in prior studies...... 162

Figure 7-5. Data collection and analysis process...... 167

Figure 7-6. Development of core categories of strategy...... 179

Figure 7-7. Analyzing data with Atlas.ti qualitative data analysis software package...... 180

Figure 7-8. Composition of core categories of strategy for the use action knowledge...... 185

Figure 7-9. Individual frequency of procedure-driven strategy and domain-knowledge-driven strategy...... 188

Figure 7-10. Use of different strategies by novice vs. expert operators...... 190

Figure 7-11. Boxplot of Procedure-driven strategy and human-resource-driven strategy between novice and expert operators...... 192

Figure 7-12. Summary of findings to address research gaps...... 197

LIST OF TABLES

Table 2-1. Levels and definitions of data, information, knowledge, and wisdom (Ackoff, 1989)...... 9

Table 2-2. Multiple knowledge classification approaches in the literature...... 9

Table 2-3. Action knowledge: Interpretations and attributes...... 14

Table 2-4. Attributes of action knowledge...... 15

Table 2-5. Definitions of knowledge management...... 27

Table 3-1. Research sub-questions...... 47

Table 4-1. Research methodology for each essay...... 54

Table 5-1. A selective summary of classifications of knowledge proposed in prior research...... 64

Table 5-2. Action knowledge interpretations and attributes...... 68

Table 5-3. A model of action knowledge derived for this research ...... 69

Table 5-4. Precedence table of variables...... 74

Table 5-5. Heuristics for Phase One: Spurious Content Removal...... 86

Table 5-6. Heuristics for Phase Two: Instruction Pre-processing...... 87

Table 5-7. Heuristics for Phase Three: Chunk Extraction...... 88

Table 5-8. Example of a Tagging Analysis of the Term “Shutdown.” ...... 93

Table 5-9. Design principles of the developed design theory...... 96

Table 6-1. Evaluation objectives and approaches...... 107

Table 6-2. Operating procedures obtained from petrochemical plants...... 109

Table 6-3. Lightweight ontology and example elements...... 111

Table 6-4. An illustration showing the application of Phase 1 heuristics...... 112

Table 6-5. An illustration showing the application of Phase 2 heuristics...... 113

Table 6-6. Chunks extracted from a procedure...... 114

Table 6-7. Background of expert operators...... 115

Table 6-8. Results of expert assessment...... 116

Table 6-9. Results of expert assessment...... 118

Table 6-10A. Accuracy obtained with SPA without learning mechanisms (50 procedures)...... 126

Table 6-10B. Accuracy obtained with SPA with learning mechanisms (50 procedures)...... 126

Table 6-10C. Accuracy obtained with SPA without learning mechanisms (85 procedures). .. 126

Table 6-10D. Accuracy obtained with SPA with learning mechanisms (85 procedures)...... 127

Table 6-11A. Parsing errors generated by SPA without learning mechanisms (50 procedures)...... 131

Table 6-11B. Parsing errors generated by SPA with learning mechanisms (50 procedures)...... 131

Table 6-11C. Parsing errors generated by SPA without learning mechanisms (35 procedures)...... 131

Table 6-11D. Parsing errors generated by SPA with learning mechanisms (35 procedures)...... 132

Table 6-12A. Tagging errors generated by SPA without learning mechanisms (50 procedures)...... 133

Table 6-12B. Tagging errors generated by SPA with learning mechanisms (50 procedures)...... 134

Table 6-12C. Tagging errors generated by SPA without learning mechanisms (35 procedures)...... 134

Table 6-12D. Tagging errors generated by SPA with learning mechanisms (35 procedures)...... 134

Table 6-13. Outcomes suggested by SPA with chunking heuristics and by SPA with the naïve approach...... 141

Table 6-14. Acceptance rates of chunks suggested by SPA with chunking heuristics and by SPA using the naïve approach...... 142

Table 6-15. Outcomes suggested by SPA with chunking heuristics...... 143

Table 6-16. Example of chunking procedure by SPA with chunking heuristics and by SPA with the naïve approach...... 145

Table 6-17. Outcome of example procedure...... 147

Table 6-18. Acceptance rates of chunks extracted from the example procedure...... 147

Table 6-19. Acceptance rate of each chunking heuristic for the example procedure...... 148

Table 7-1. Two types of grounded theory...... 164

Table 7-2. List of organizations...... 168

Table 7-3. List of interviewees...... 169

Table 7-4. Data collection...... 171

Table 7-5. Partial list of labels created during open coding...... 174

Table 7-6. Three iterations and theory emergence across study...... 182

Table 7-7. Core categories of strategy coded from interview data...... 186

Table 7-8. Example of core categories of strategy for use of action knowledge...... 187

Table 7-9. T-test results and explanations...... 191

Table 7-10. Impact of operator experience on adoption of strategy...... 195

Table 7-11. Constructs and action knowledge for the theory of action strategy...... 198

ACKNOWLEDGEMENTS

First and foremost, I would like to express my special appreciation and thanks to my advisor, Dr. Sandeep Purao. You have been a tremendous mentor for me. I would like to thank you for encouraging my research and for allowing me to grow as a researcher. Your advice on both my research and my career has been priceless. I would also like to thank Dr. Eileen Trauth, Dr. Heng Xu, and Dr. Timothy Simpson for serving as my committee members and for their comments and suggestions throughout the process.

I would also like to thank my external project guide, Mr. David Strobhar of Beville Engineering, Inc., and all the people who provided me with the facilities and conducive conditions required for my dissertation project. This project would have been impossible without the support of the Center for Operator Performance.

Special thanks go to my friends and family. I would like to thank all of my friends who supported me in writing and encouraged me to strive towards my goal. Words cannot express how grateful I am to my mother-in-law, father-in-law, mother, and father for all of the sacrifices that you have made on my behalf. To my daughter Katherine: you have brought me many enjoyable moments over the past year. Finally, I would like to express my appreciation to my beloved husband, Siyuan. None of this could have happened without your encouragement and support every day over the past six years.

Chapter 1

Introduction and Motivation

1.1 Managing Operational Knowledge in Process Industries

Process industries are characterized by operations that are run as either continuous or batch processes involving material transformations. Examples of process industries include food, beverages, chemicals, petrochemicals, coal, and paper products, among others. Firms in process industries are concerned with using formulas and recipes to process resources in order to produce other products (Makins, 1991). In process industries, the relevant factors are ingredients, not parts; formulas, not bills of materials; and overall operation processes, not individual units.1 An output produced through such a process cannot be returned to its original components.

1 http://en.wikipedia.org/wiki/Process_manufacturing

In process industries, human operators play a critical role in monitoring operations, responding to emergencies and failures, optimizing processes and recovery, and coordinating maintenance and repair tasks (Schragenheim, Cox, & Ronen, 1994). Operators perform these tasks around the clock every day and coordinate their work across multiple shifts (Isermann, 2006). During their tenure in the process industry, they acquire and cultivate expertise that newcomers simply do not bring to the industry (Strahan, 2005). Without this expertise, operations can suffer, and processes may be derailed. The result is accidents that can cause significant damage to assets and, worse, injuries to operators and loss of life. In recent years, a number of technological advances have been made and automation systems introduced that have significantly changed the role of the operator (Ernst & Lundvall, 2004). These high-tech production processes have not replaced the operators. Instead, they have changed the operator roles significantly. They have forced the operators to acquire more sophisticated knowledge and skills, and they have prompted firms in process industries to document and codify practices in an effort to render the operators less error-prone and more effective (Grote, 2008; Rowe, 2009). This “operational” knowledge thus refers to the knowledge and skills that these operators (should) possess and draw on in order to ensure that the plants in a process industry are run effectively and without any serious problems.

1.2 Research Setting: Petrochemical Industry

The petrochemical industry is the focal industry for the present research. In this industry, operations are focused on the continuous processing, including extraction and refinement, of crude oil. The scale of the problem – how to manage operational knowledge – can be appreciated by considering the number of large refineries. Within the continental U.S., there are about 140 refineries, each with about 250 operators.

I selected the petrochemical industry as my research setting for two reasons. First, failures in this industry can have catastrophic consequences, with loss of property and life, which makes the problem important. Over the past few decades, although both technology and government regulations have been developed to improve safety in this industry, major industrial accidents may be as likely today as they were 10 years ago. For example, since 1998, neither the U.S. nor the European industry has shown a reduction in either the fatality rate or the major accident rate (Wolf, 2001) (see Figure 1-1). Beyond the fatality and non-fatality rates and the direct financial loss, the most dangerous potential consequences of petrochemical industry accidents are environmental pollution and harm to members of the general public, such as people living close to a petrochemical facility. As the industry grows, the number and size of the facilities, the toxicity of the materials, and the number of people living or traveling nearby may also increase (Perrow, 1999).

[Figure: chart with series Petroleum Refining, Exploration and Production, Retail, and Wholesale Marketing; horizontal axis 1996–2008; vertical axis 0–6]

Figure 1-1. U.S. job-related nonfatal injuries and illnesses: 1998–2007 (per 100 full-time workers) (Wolf, 2001).

Second, human resource management is a major concern in the petrochemical industry. On the one hand, operator error remains high. Over the last thirty years, operator error has been the second largest cause of loss in the industry (Nivolianitou, Konstandinidou, & Michalis, 2006) (see Figure 1-2). On the other hand, the high rate of operator turnover has become a critical issue for petrochemical refineries. Even worse, the number of operators in the petrochemical industry is predicted to decrease with a significant wave of retirements in the next few years, resulting in the loss of 20% of the operators and effectively clearing the ranks of expert operators.


Figure 1-2. Immediate causes of accidents in the petrochemical industry for the period 1985–2002 (Nivolianitou et al., 2006).

Establishing an effective way to manage operational knowledge in the petrochemical industry is, therefore, an extremely urgent concern for two reasons: (1) the industry’s accident rate is high and accidents have serious consequences, and (2) the number of experienced operators capable of avoiding or reducing the impact of accidents is decreasing. It is for these reasons that I selected the petrochemical industry as the research setting of this study.

1.3 Research Conceptualizations

Given the characteristics of the research setting, I adopted three key conceptualizations to drive my investigation. The first relies on the fundamental conceptualization of knowledge as related to operator actions. In prior work, knowledge is considered to be a justified true belief that answers the question of “what” (Lehrer & Paxson, 1969). Instead of this traditional conceptualization, my conceptualization emphasizes that operational knowledge is important insofar as it supports action, namely, the knowledge to answer the “how” question. I follow the recent work of Bera and Wand to define action knowledge as “knowledge that supports effective action” (Bera & Wand, 2009). The concept of “action knowledge” corresponds to the idea of operational knowledge, and may be a collection of rules, prescriptions, and propositions that give operators the ability to choose and execute a course of action to achieve their goals. Action knowledge therefore constitutes the first research conceptualization of this study.

The second conceptualization differentiates between tacit and explicit action knowledge. This conceptualization concerns how action knowledge is stored and expressed. For example, action knowledge in the petrochemical industry could be stored in the operator’s mind. As both technologies and government regulations have developed, more and more action knowledge has been written down in the form of codified procedures. These two forms are recognized by the second conceptualization and, in turn, map onto the classic distinction between explicit and tacit knowledge proposed by Nonaka and Konno (1999).

The third conceptualization focuses on the possessor of action knowledge. In the petrochemical industry, action knowledge could be possessed by individual operators in the form of their own experience and expertise. It could also be codified and managed by individual refineries or by the industry as a whole. This difference is consistent with the classification of knowledge as either individual or organizational in nature (Barney, 1986; Bhatt, 2002), and provides the possibility of analyzing how such knowledge can be managed at different levels of analysis.

1.4 Research Questions

Research concerns pertaining to action knowledge in process industries, together with the selection of the petrochemical industry as the research setting, suggest the research question of this study: How should action knowledge be leveraged and managed in process industries? In this study, I select the petrochemical industry as the research domain and investigate three related sub-questions in order to answer my overall research question: (1) How can action knowledge, codified in operational procedures, be extracted and chunked so that it may be managed as an organizational asset? (2) Is the chunking process designed for explicit action knowledge feasible and effective in the petrochemical industry? (3) What strategies do operators use to apply action knowledge to perform tasks in the petrochemical industry? Together, these three questions are aimed at addressing the concerns outlined above about managing action knowledge in the process industry and at contributing to the research stream related to knowledge management. Each of these sub-questions is investigated via a research essay.

1.5 Structure of this Thesis

Essay 1 (in Chapter 5) deals with explicit action knowledge, that is, action knowledge codified as standard operating procedures. The essay describes the design and implementation of a tool called the Semantic Procedure Analyzer (SPA), which embodies an approach to converting action knowledge codified in procedures into knowledge chunks that can be made available to operators at the point of action and managed by the organization as a knowledge asset. Essay 2 (in Chapter 6) evaluates the implemented tool with a set of authentic procedures obtained from multiple companies in the petrochemical industry. The evaluation is supplemented with expert feedback and assessment. Evaluation data pertaining to both parsing accuracy and user acceptance are analyzed. Together, these two essays follow the design science research methodology. Essay 3 (in Chapter 7) explores the application of action knowledge, both tacit and explicit. The essay offers a framework for understanding strategies to apply action knowledge, with a particular emphasis on several categories of resources of action knowledge. It also compares the use of strategies by novice and expert operators, in order to understand the relationship between operator experience and the use of strategies to apply action knowledge. Together, the three essays address the overall research question: How should action knowledge be leveraged and managed in the petrochemical industry?

This dissertation comprises eight chapters. Chapter 2 reviews related research, and Chapter 3 defines the research scope of this study. Chapter 4 describes the research methodologies adopted in this study. Chapters 5, 6, and 7 present the three essays, respectively. The final chapter offers a conclusion, describes the contributions of the present study, and suggests directions for future research.

Chapter 2

Literature Review

In this chapter, I present a literature review beginning with definitions and classifications of knowledge, before moving on to defining and investigating types of knowledge management strategies, with a focus on how these strategies might connect to the current research questions.

The literature review is driven by the research concern of understanding how knowledge management relates to operators in the petrochemical industry. The purpose of this review is twofold: first, to understand the current state of the research related to the research question; and second, to identify key concepts that can be adapted as foundations for the current research.

2.1 Knowledge

This literature review begins with the concept of “knowledge,” as it refers to the concepts, principles, and information regarding “learning operations” (books, media, training classes, operator experience, and other sources) in the petrochemical industry. In this section, the difference between knowledge and other related concepts, such as data, information, and wisdom, is clarified, after which several perspectives on knowledge and its classifications are summarized. Among these perspectives, the “action,” “tacit vs. explicit,” and “individual vs. organizational” views of knowledge are appropriate to the research domain of the petrochemical industry and are discussed in detail. The discussion includes a summary of action knowledge attributes with examples; Nonaka’s SECI model of tacit vs. explicit knowledge and its knowledge conversion processes of Socialization, Externalization, Combination, and Internalization (Nonaka, Byosiere, Borucki, & Konno, 1994); and knowledge at both the individual and the organizational levels as discussed by Bhatt (2002).

2.1.1 Definitions and Classifications of Knowledge

An easy starting point is the concept of knowledge and its position in the widely recognized hierarchy of data, information, knowledge, and wisdom (Ackoff, 1989). Commonly accepted definitions are suggested in Table 2-1.

Table 2-1. Levels and definitions of data, information, knowledge, and wisdom (Ackoff, 1989).

Concept        Definition
Data           Unprocessed facts and figures without interpretation or analysis
Information    Data given meaning that benefits the user
Knowledge      Combination of information, experience, and insight that benefits the user
Wisdom         Extrapolative and non-deterministic extension of knowledge

The concept of knowledge is the focus of the present research. A multi-faceted concept with multi-dimensional meanings, knowledge has a history in the field of epistemology that can be traced back to Aristotle and Plato (Jori, 2003). Analytically minded philosophers mostly agreed that knowledge could be understood as “justified true belief” (Chisholm, 1982). This traditional definition of knowledge has been extended by later researchers, who considered the resources needed to generate knowledge, the context in which it is applied, and the purpose for using it (Cook & Brown, 1999; Schultze & Leidner, 2002; Simpson & Weiner, 1989). Depending on context, researchers have classified knowledge in a number of different ways. Table 2-2 shows a summary of these categories.

Table 2-2. Multiple knowledge classification approaches in the literature.

Classification Approach / Definition

Procedural vs. Declarative (Anderson, 1976, 2009)
    Procedural knowledge is “know-how” knowledge expressed in expert systems with rules, or (in organizational life) with procedures.
    Declarative knowledge refers to more descriptive “know-what” knowledge represented by objects or agents in new programming languages.

Tacit vs. Explicit (Hasher & Zacks, 1984)
    Tacit knowledge is the comprehensive cognizance of the human mind and personal judgments rooted in thoughts and actions that are hard to formalize and communicate.
    Explicit knowledge is knowledge that has been or can be externalized by articulating, coding, and communicating using some symbolic system.

Individual vs. Organizational (Barney, 1986; Bhatt, 2002)
    Individual knowledge pertains to things an individual can know, learn, and express.
    Organizational knowledge is linked to individual knowledge integrated and shaped by organizational history and culture and possessed by an organization.

Deep vs. Surface (Webb, 1997)
    Deep knowledge refers to models and causal explanations that go back to natural laws (commonly present in Artificial Intelligence).
    Surface knowledge is represented by practical rules that can be acquired from people.

Internal vs. External (Kucza, 2001)
    Internal knowledge is knowledge adapted to the specific needs of an organization.
    External knowledge functions on a more general level and must be adapted before it can be utilized by an organization.

General vs. Domain (Gao & Sterling, 2001)
    General knowledge is knowledge that is independent of both domain and site.
    Domain knowledge is knowledge that is true in a particular domain but that may not be generalizable to other domains.

Scientific vs. Social (Hoon & Derick, 1994; Osman, 1998)
    Scientific knowledge is knowledge about the hierarchy and unity of the universe.
    Societal knowledge is knowledge that enables efforts to understand and predict general patterns of behavior on the part of others.

Among the various knowledge classification approaches, three are relevant to the present research topic: Procedural vs. Declarative Knowledge, Explicit vs. Tacit Knowledge, and Individual vs. Organizational Knowledge. These three classifications are discussed next, along with their relevance to the present research.

2.1.2 Procedural (Action) and Declarative Knowledge

The standard operating procedures used in most refineries today reflect the results of many years of work in the industry and are meant to provide oversight and control of operator performance (Jamieson, 2002). Despite their varying formats, operating procedures are all fundamentally sets of instructions that operators are supposed to follow (Nivolianitou et al., 2006). They represent knowledge to support operator action. By focusing on procedures as the means to execute operations, I could understand knowledge from the perspective of action.

This “action” view of knowledge has its roots in Artificial Intelligence (AI) research (Newell, 1982; Newell & Simon, 1976), which provides an important precursor to this work (Blosch, 2001), and suggests an emphasis different from the declarative knowledge² of truth-value regarding a particular subject or subjects. “Procedural knowledge” is a collection of rules, prescriptions, and propositions whereby users have the ability to choose and execute a course of action. Procedural knowledge is sometimes referred to as “production knowledge” or “action knowledge” (Georgeff & Lansky, 1986). This classification of knowledge emphasizes the difference between declarative knowledge (information about “why or what”) and procedural knowledge (information about “how and when”) (Clark & Estes, 1996).

Although the terms procedural knowledge and action knowledge are used interchangeably, there are important differences between them. The concept of procedural knowledge considers the expression of “know-how” knowledge by emphasizing its format as procedures or condition-action rules (Anderson, 1976, 2009; Corbett & Anderson, 1994). This definition makes “know-how” a sub-set of explicit knowledge. However, this definition is contradicted by Nickols’s integration of explicit vs. tacit knowledge with declarative vs. procedural knowledge, which considers that all “know-how” is tacit and all “know-what” is explicit (Nickols, 2000). Neither Anderson’s view (of connecting “know-how” knowledge with the classification of tacit and explicit knowledge) nor Nickols’s definition is used for the purposes of this research. “Know-how” knowledge can be represented and codified as condition-action rules. In a significant proportion of AI research, condition-action rules (explicit knowledge) are used to govern the actions of robots (Georgeff & Lansky, 1986). On the other hand, “know-how” knowledge may also be embedded in an individual’s practices as a routine or ability without articulation, such as the ability to ride a bike (tacit knowledge) (Eraut, 2000).

2 Declarative knowledge: “know-what” knowledge represented by objects or agents in new programming languages.

Aside from the formal definition of procedural knowledge, in the domain of Information Systems (IS) research, interpretations of action knowledge emphasize the role of supporting practice without considering the explicit form of expression of that knowledge. Bera and Wand argue that rather than “justified true belief,” effective performance provides the basis for knowledge (Bera & Wand, 2009). Blosch (2001) suggests a complementary view, emphasizing the role of action knowledge in ensuring successful practice. Drawing on the discussion above, in this research, I use the term action knowledge to refer to “know-how” knowledge that enables an individual to select and perform action(s) that change the current state to a desired target state (Bera & Wand, 2009). Using the concept of action knowledge (as opposed to procedural knowledge) helps the reader avoid misreading “know-how” knowledge as exclusively explicit or exclusively tacit.

In order to interpret how to take a particular action, more information (such as the goal of the action, possible outcomes, or underlying rationale) is necessary to provide the related action knowledge. This information is abstracted here as four attributes of action knowledge: action, actor, environment, and goal. Action knowledge has been interpreted in a range of ways in the existing literature in reference to the action view of knowledge. In addition to differing definitions and interpretations of action knowledge, different attributes have also been ascribed in research conducted on the expression of action knowledge. This inconsistency in studies related to action knowledge is the result of two conditions. First, frameworks and models in different studies ascribe different sets of attributes to action knowledge. Second, even when focusing on one attribute, the definition of that particular attribute may vary, leading to difficulty in establishing a consistent understanding of action knowledge across large volumes of literature.

A summary of interpretations of action knowledge and their corresponding attributes from previous literature is presented chronologically in Table 2-3.

Table 2-3. Action knowledge: Interpretations and attributes.

Interpretation of Action Knowledge (AK) | Attributes | Source
AK is logic with the possible situation to take action. | Action, Actor, Environment (Actuality, Precondition), Goal | (Moore, 1977)
AK is a prerequisite for an action that can be analyzed as a matter of knowing what action to take; an executable description. | Action, Actor, Environment, Goal | (Moore, 1985)
AK is knowledge that makes action-at-a-distance possible by reasoning about and taking control over activity located at some other place or time (such as the future). | Action, Actor, Environment (Time, Place, Resource) | (Gasser, 1991)
AK enables an agent to achieve goals even if not all information is known. | Action, Actor, Environment (Time), Trigger | (Lesperance & Levesque, 1995)
AK is normative constraints, criteria by which behavior may be guided and assessed. | Action, Actor, Environment (Location) | (Tsoukas, 1996)
AK is values of the agent’s state that allow executions to be activated, inhibited, or modified. It can be used to plan and monitor an action or to provide a real-time simulative inference. | Action, Environment (Time, Location, Resource), Goal | (Narayanan, 1997)
AK is the agent’s belief at different times for determining the “executability” of a policy. | Action, Environment (Resource, Time) | (Geffner & Wainer, 1998)
AK governs human actions (rules and/or prescriptions for action). | Action, Environment (Post Condition), Goal | (Goldkuhl, 1999)
AK is knowledge that ensures the successful completion of a task. | Action, Actor, Environment (Resource) | (Blosch, 2001)
AK precedes and, therefore, determines action and performance. It enables one to establish a logically consistent pathway between the past and the future. | Action, Actor, Goal | (Chia, 2003)
AK is a model that describes how to impact the knowledge interpreters and change the domain as a facilitator. | Action, Actor, Environment, Goal | (Krogstie, Sindre, & Jorgensen, 2006)
AK is propositions as reasons for acting. | Action, Actor, Environment | (Hawthorne & Stanley, 2008)
AK is the agent’s ability to select an action from those available given the current state of agent and the environment in order to change the current state in line with a given goal. | Action, Actor, Environment (Resource), Goal | (Bera & Wand, 2009)
AK enables the ability to select a course of action that can lead to a change in the state of the environment, given the state of the agent and of the environment. | Action, Actor, Environment (Resource, Time), Goal, Outcome | (Freund & Baltes, 2000)

This summary of the research on action knowledge and its attributes offers a basis for selecting an appropriate model for this research. The multiple definitions of action knowledge accommodate many situations in which following established practices and/or performing tasks is useful. In the present study, a simple framework consisting of the four attributes, Action, Actor, Environment, and Goal, developed in prior research (see Table 2-4) is used. An Action causes changes in state or perception. The Actor is the person who possesses knowledge and performs actions. The Environment describes the situation or circumstance in which the action takes place. The Goal is the target state that motivates the actor’s behavior. Three sub-attributes fall under the Environment heading: Time, Location, and Resources. Time is a temporal description at a specific stage of the action. Location describes the geographic state of an action. Resources refers to physical materials produced or consumed during an action. The definitions given are those most applicable to the scope of this study.

Table 2-4. Attributes of action knowledge.

Attribute | Definition | Source
Action | Causes changes in state or perception | (Moore, 1985)
Actor | A person who possesses knowledge and performs actions | (Hawthorne & Stanley, 2008)
Environment | The domain, or a set of all statements that can be made about the situation under consideration | (Krogstie et al., 2006)
Time | A temporal description of behavior: used to bind action schemas at specific stages to describe behavior | (Narayanan, 1997)
Location | The geographic distribution of actors | (Gasser, 1991; Narayanan, 1997)
Resource | Physical materials produced or consumed during an action that maintain the actor’s participation in a course of action | (Gasser, 1991; Kornfeld & Hewitt, 1981)
Goal | The target state that motivates the actor’s behavior | (Narayanan, 1997)
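
The four attributes in Table 2-4 can be read as a simple record structure. The sketch below is one hypothetical encoding offered for illustration only; it is not the software tool described in this dissertation. The class names, field names, the completeness check, and the example step are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Environment:
    """Situation in which the action takes place (sub-attributes per Table 2-4)."""
    time: Optional[str] = None        # temporal description of the action stage
    location: Optional[str] = None    # where the action takes place
    resources: List[str] = field(default_factory=list)  # materials produced or consumed


@dataclass
class ActionKnowledge:
    """One chunk of action knowledge: Action, Actor, Environment, Goal."""
    action: str
    actor: str
    environment: Environment
    goal: str

    def missing_attributes(self) -> List[str]:
        """Return the names of top-level attributes that were left empty."""
        missing = [name for name in ("action", "actor", "goal")
                   if not getattr(self, name)]
        env = self.environment
        if not (env.time or env.location or env.resources):
            missing.append("environment")
        return missing


# Invented example step; not taken from any refinery procedure.
step = ActionKnowledge(
    action="open feed valve",
    actor="field operator",
    environment=Environment(time="after pump start",
                            location="unit A",
                            resources=["feed stock"]),
    goal="establish feed flow",
)
print(step.missing_attributes())  # []
```

A check such as `missing_attributes` is one way a chunk of action knowledge could be flagged as incomplete, for example when a procedure step names an action but no actor.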

2.1.3 Explicit Knowledge and Tacit Knowledge

Following the action view of knowledge, the second relevant knowledge classification that I investigate in the literature review is Explicit vs. Tacit Knowledge. Knowledge created and used in the petrochemical industry can be either written as procedures or stored in the operator’s mind. These two forms of knowledge map onto Nonaka’s classification of knowledge as either explicit or tacit (Nonaka, 1994). Explicit knowledge is knowledge that has been or can be externalized by articulating, coding, and communicating it using a symbolic system (Hasher & Zacks, 1984). Explicit knowledge can be expressed in formal and systematic language and shared in the form of data, scientific formulas, specifications, or manuals. It plays an increasingly large role in organizations and is considered the most important factor of production in the knowledge economy (Romer, 1995). Information presented in textbooks, manuals, and articles (such as this dissertation) is an example of explicit knowledge.

Generally, people find that though they may be able to perform well, they are unable to articulate exactly what they know or how to put it into practice. Tacit knowledge is the category used to describe this kind of knowledge. Precisely the opposite of explicit knowledge, tacit knowledge is defined as knowledge that is difficult to transfer to another person in written or oral form (Hasher & Zacks, 1984); rather, it is developed from direct experiences, actions, procedures, routines, commitments, ideals, values, and emotions (Schon, 1987). The ability to ride a bicycle is an example often used to emphasize the importance of tacit knowledge (Dreyfus, Dreyfus, & Zadeh, 1987). The explicit knowledge that maintaining balance requires steering to the opposite side if the bicycle begins to fall does not really help when actually riding a bicycle, yet people who know how to ride cannot say much more than this when describing how they keep their balance. Although it is difficult to articulate, tacit knowledge can be shared through highly interactive conversations and story-telling under certain conditions within organizations (Zack, 1999). One example of tacit knowledge in the petrochemical industry is that expert operators can estimate the temperature inside a refinery furnace by smelling the air nearby. This is extremely important knowledge, but it is difficult for operators to describe how they determine it.

The categories of explicit knowledge and tacit knowledge are not sharply divided or fixed. Instead, they are convertible. The most well-known knowledge conversion model is the SECI model (Nonaka, 1994), in which four “modes” of knowledge conversion are postulated (see Figure 2-1). The first mode refers to the conversion of tacit knowledge into tacit knowledge through interactions that take place between individuals, a process called “Socialization.” The second mode involves the use of social processes to combine different bodies of explicit knowledge, a process known as “Combination.” The third mode is the conversion of explicit knowledge into tacit knowledge, which bears some similarity to the traditional notion of “learning,” a process known as “Internalization.” The last mode converts tacit knowledge into explicit knowledge and is called “Externalization.” The four modes of knowledge conversion occur successively during organizational operations, and new knowledge is created during this recursive process. This model demonstrates that explicit knowledge and tacit knowledge are neither static nor unchangeable. The processes whereby these forms of knowledge can be converted one into the other also exist in the domain of the petrochemical industry, and are investigated in this research.

From \ To | Tacit Knowledge | Explicit Knowledge
Tacit Knowledge | Socialization | Externalization
Explicit Knowledge | Internalization | Combination

Figure 2-1. SECI model of the knowledge conversion (Nonaka, 1994).
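
The four conversion modes in Figure 2-1 amount to a lookup from a (source form, target form) pair to a mode name. The following is a minimal sketch of that lookup; the enum, dictionary, and function names are hypothetical and introduced here only for illustration.

```python
from enum import Enum


class Form(Enum):
    TACIT = "tacit"
    EXPLICIT = "explicit"


# (source form, target form) -> conversion mode, following Figure 2-1.
SECI_MODES = {
    (Form.TACIT, Form.TACIT): "Socialization",
    (Form.TACIT, Form.EXPLICIT): "Externalization",
    (Form.EXPLICIT, Form.EXPLICIT): "Combination",
    (Form.EXPLICIT, Form.TACIT): "Internalization",
}


def conversion_mode(source: Form, target: Form) -> str:
    """Return the SECI mode that converts knowledge from source to target form."""
    return SECI_MODES[(source, target)]


# Writing down an operator's unarticulated practice as a procedure step:
print(conversion_mode(Form.TACIT, Form.EXPLICIT))  # Externalization
```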

2.1.4 Individual Knowledge and Organizational Knowledge

The third knowledge classification introduced in detail is Individual vs. Organizational Knowledge. It is appropriate to apply this classification in the petrochemical industry domain, because though knowledge is possessed by individual operators, it is also possessed and managed as an organizational asset. The definition of knowledge as a “combination of information, experience, and insight that benefits the user” (see Table 2-1) ties knowledge closely to the individual. Individual knowledge can be simply defined as knowledge possessed by the individual. It is largely “packaged” in the form of tacit knowledge or as explicit knowledge that has not yet been transferred (van Daal, Haas, & Weggeman, 1998). A person can draw on individual knowledge to perform specific tasks such as revaluing his/her experiences, broadening or deepening a repertoire of skills, adapting an attitude, or modifying an information base (van Daal et al., 1998). According to this view, although an organization’s members possess and utilize individual knowledge in an organizational environment, the most important individual knowledge does not belong to the organization.

Given that an organization is a problem-facing and problem-solving entity, it also learns and acquires knowledge through its routines and repertoires, which are embedded in specific organizational history (Nelson & Winter, 1982). Inside an organization, knowledge of diverse repertoires or routines is integrated, and new knowledge is created and shaped by organizational history and culture (Barney, 1986). Organizational knowledge is the product of those routines (Schulz & Jobe, 2001).

Some researchers have further divided organizational knowledge into multiple categories, including group knowledge and multi-organizational knowledge. Group knowledge is a dynamic whole based on interdependency within groups (Nonaka et al., 1994). It is defined as individual knowledge that multiple individuals rely on as truth and that is shared and understood by numbers of groups (Silva & Agustí-Cullell, 2003). Beyond groups within organizations, communities of practice that extend across organizational boundaries are linked; i.e., knowledge is stored in a community of practice across organizations (Cohen, 2006). This knowledge, transferred across organizational boundaries, is called multi-organizational knowledge (Montoni, Miranda, Rocha, & Travassos, 2004). The one characteristic shared by these three types of knowledge is that they are aggregations of individual knowledge at different organizational levels. Because of this, these three types of aggregated individual knowledge will be referred to as organizational knowledge for the purposes of this research.

Another important issue that should be mentioned is the difference between knowledge at the individual level and knowledge at the organizational level. Although organizational knowledge is an aggregation of individual knowledge, it is not enough to simply consider it as a metaphor to denote the aggregate knowledge of an organization’s members. As organizations grow in size and life span, organizational knowledge goes beyond this aggregation: the knowledge base surpasses the knowledge of current individual members to include past experiences and behavioral routines that develop as a result of knowledge applied to innumerable settings. Such a knowledge base informs the organization’s operating style and the way it responds in a changing environment. In addition to these routines and practices, an organization collects and codifies a wealth of information resources over years, and its members access and learn from these resources in order to produce more knowledge.

The relationship between individual knowledge and organizational knowledge is not static. The interaction between individual knowledge and the various forms of organizational knowledge—and the conversion from one form to the other—is what creates value in an organization. To distinguish between these two kinds of knowledge, Bhatt (2002) proposed a framework wherein the interactions and tasks associated with each are identified as independent linear concepts. The framework considers individual and organizational knowledge according to four categories (see Figure 2-2).

Figure 2-2. Relationship between individual and organizational knowledge (Bhatt, 2002).

The horizontal axis in the figure represents the nature of interactions, ranging from independent to interdependent. The vertical axis represents tasks, ranging from routine to non-routine. Cells 1 and 2 refer to individual knowledge, which includes discretion and expertise. Discretion enables individuals to resolve routine problems on the spot as they occur, whereas expertise refers to the ability to perform non-routine and non-specifiable tasks that fall within the purview of people considered experts in specific areas. Task-specific expertise demands a high level of understanding of the tasks and their effects on the organization. Cells 3 and 4 each refer to organizational knowledge, in tacit or explicit form. In order to handle complex tasks, individuals need to interact and share their experience with others so that they can coordinate their tasks. In Cell 4, organizations follow formal rules and procedures to perform routine tasks. The rules, procedures, and formal organizational structures ensure that an organization can coordinate its work processes and tasks in an efficient and orderly way. Connecting to the definitions of explicit and tacit knowledge in Section 2.1.3, the knowledge referred to in Cells 2 and 4, which is formalized and codified to resolve routine tasks, is in explicit form. In contrast, the knowledge in Cells 1 and 3 focuses on non-routine and non-specific tasks and is in tacit form, given the difficulty of articulating and formalizing it. This model connects the two perspectives of knowledge classification: Explicit vs. Tacit and Individual vs. Organizational. Given that action knowledge in the petrochemical industry can be categorized according to these four types, the model can be applied in the research domain in the context of the action view.

2.1.5 Connecting Knowledge Classifications to the Petrochemical Industry

In the previous three sub-sections, a number of different ways to understand knowledge are discussed, including Procedural vs. Declarative Knowledge, Explicit vs. Tacit Knowledge, and Individual vs. Organizational Knowledge. This sub-section discusses how these three classifications can be applied to the research domain of knowledge management in the petrochemical industry.

In regard to the petrochemical industry, the standard operating procedures used in most refineries today were developed over many years. They are meant to provide oversight and control of operator performance (Jamieson, 2002). Despite their varying formats, at root these operating procedures consist of instructions that operators are supposed to follow (Nivolianitou et al., 2006). They represent the knowledge necessary to support operator action. The procedural focus on performing specific operations means that I can consider standard operating procedures from the perspective of action. On this basis, the knowledge considered in the present research study can be categorized as action knowledge rather than as declarative knowledge.

Over the last few decades, the petrochemical industry has invested significant effort and resources in constructing explicit knowledge for the purpose of storing action knowledge, such as standard operating procedures, which include instructions for performing specific tasks. Operators are required to pay attention to these procedures. On the other hand, operators bring considerable judgment to bear on deciding exactly how to carry out the instructions. For example, it may be possible to follow the instructions by performing steps of a given task sequentially and yet also be in compliance by performing them concurrently. This judgment or work practice is not captured by the codified procedures, but rather reflects tacit knowledge, which is often called experience. Therefore, the action knowledge in the petrochemical industry includes both explicit and tacit forms.

Knowledge captured in procedures designed for operators to follow in performing routine tasks in refineries can be categorized as organizational knowledge, according to the knowledge categories given in Figure 2-2. Organizational knowledge is embedded in these procedures, but so, too, is operator expertise, which is classified as individual knowledge. Operators use their individual knowledge to address non-routine incidents such as accident or emergency situations that occur only rarely. Both types of action knowledge in this domain—individual and organizational—are the focus of this research.

To summarize, in terms of its scope, this research is based on the three classifications of Procedural vs. Declarative Knowledge, Explicit vs. Tacit Knowledge, and Individual vs. Organizational Knowledge. Figure 2-3 shows the three classifications working together to address the research domain. The focus of this research is the action knowledge characteristic of the petrochemical industry. Action knowledge in this domain can be classified into four forms based on the Explicit vs. Tacit and Individual vs. Organizational perspectives. Cell 1 refers to action knowledge in an explicit form as possessed by individuals. Action knowledge of this type includes personal notes written down during operation and oral explanations on production processes. Cell 2 refers to operation expertise possessed by individuals that is not written down or articulated. Cell 3 refers to action knowledge in an explicit form captured and managed by refineries and plants as standard operating procedures and rules. In addition, other aspects of action knowledge, such as collaboration among teams and shared organizational memories regarding how to perform tasks, belong to Cell 4, i.e., the tacit form of action knowledge at the organizational level.

Action Knowledge | Explicit | Tacit
Organizational | Cell 3 (example: procedures) | Cell 4 (example: team work practices)
Individual | Cell 1 (example: personal notes) | Cell 2 (example: operator expertise)

Figure 2-3. Applying knowledge classifications in the petrochemical industry domain.
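
The two-by-two classification in Figure 2-3 can be expressed as a small lookup keyed on who holds the knowledge and in what form. The sketch below is illustrative only; the enum names, the `CELLS` table, and the `classify` helper are assumptions introduced here, while the cell numbers and examples follow Figure 2-3.

```python
from enum import Enum
from typing import Tuple


class Holder(Enum):
    INDIVIDUAL = "individual"
    ORGANIZATIONAL = "organizational"


class Form(Enum):
    EXPLICIT = "explicit"
    TACIT = "tacit"


# (holder, form) -> (cell number, example), following Figure 2-3.
CELLS = {
    (Holder.INDIVIDUAL, Form.EXPLICIT): (1, "personal notes"),
    (Holder.INDIVIDUAL, Form.TACIT): (2, "operator expertise"),
    (Holder.ORGANIZATIONAL, Form.EXPLICIT): (3, "procedures"),
    (Holder.ORGANIZATIONAL, Form.TACIT): (4, "team work practices"),
}


def classify(holder: Holder, form: Form) -> Tuple[int, str]:
    """Return the Figure 2-3 cell number and its example for an item of action knowledge."""
    return CELLS[(holder, form)]


# A standard operating procedure is explicit knowledge held by the organization.
print(classify(Holder.ORGANIZATIONAL, Form.EXPLICIT))  # (3, 'procedures')
```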

2.2 Knowledge Management

Knowledge management (KM) is not an objective, discrete, or independent phenomenon that takes place within an organization. Instead, it is largely influenced by the social context from which it emerges and is subject to various interpretations based on organizational norms and social interactions between individuals. Knowledge management seeks to make the best use of knowledge already available to an organization and to create new knowledge in the process (Singer & Hurley, 2005). Through this process, organizations generate value from their intellectual and knowledge-based assets.

The definition of knowledge management varies depending on the view of knowledge taken by researchers. Alavi and Leidner (2001) summarize these knowledge perspectives and their corresponding implications for knowledge management. Some case studies show how different operator-involved organizations engage in knowledge management initiatives. For instance, one study reports on a survey conducted to investigate the influence of error incident characteristics on knowledge management among machine operators (Homsma, Van Dyck, De Gilder, Koopman, & Elfring, 2009). Henderson and Cool explored how firms might learn to make better-timed capacity expansion decisions based on both their own and their rivals’ knowledge by assessing the databases of 72 companies operating in the petrochemical industry (Henderson & Cool, 2003). In another study, Koepsell applied ontology with applications in the domain of chemistry as knowledge assets to facilitate operator learning and practice (Koepsell, 1999).

These cases show that it is possible to investigate knowledge management at different levels of analysis. These studies also demonstrate that academic research tends to develop relatively clean and abstract approaches. Even so, they also show that in practice knowledge management processes are messy, such that the decisions pertaining to them are difficult to put into crisp categories. Acknowledging this fact, this section reviews recent research related to knowledge management. Furthermore, to achieve the research goal of this thesis (establishing ways to effectively manage and leverage operator-level knowledge), I also aim through this review to meet two goals: (1) to offer a far-reaching perspective on the background research and (2) to re-define the research question from a knowledge management perspective.

The review starts with an exploration of the importance of knowledge management, elaborated from perspectives both internal and external to the organization. Various definitions of knowledge management are then compared, generating the two major concerns of knowledge management, i.e., organizational strategy and information technology, followed by a discussion of how they support the knowledge management process. To connect with the research perspectives on knowledge noted in the prior sections, the management issues related to action knowledge, explicit and tacit knowledge, and individual and organizational knowledge are discussed separately at the end of this section.

2.2.1 Definition and Importance of Knowledge Management

Knowledge management is a critical prerequisite to achieving organizational goals because it represents a key organizational capability that can lead to improved organizational performance. Effective knowledge management gives organizational decision-makers insight into the consequences of their actions and practices and thus provides a basis for refining and adjusting them accordingly. Organizations learn from this knowledge and make changes to their practices, strategies, and structure depending on their goals and performance history. A significant number of previous empirical studies have shown that knowledge management is linked to a number of outcomes, such as better survival (Madsen, 2009), effective acquisitions (Zollo & Singh, 2004) and expansion (Henderson & Cool, 2003), increased customer satisfaction (Lapre & Tsikriktsis, 2006), and the facilitation of innovation in terms of process re-engineering (Robey & Sahay, 1996) and operational rule making (Haunschild & Mooweon, 2004; Lapre & Tsikriktsis, 2006; Sullivan, 2010). Both external and internal factors can motivate knowledge management initiatives:

External: Competitive Advantage. As the market for resources has become subject to the same dynamically competitive conditions that have afflicted product markets, knowledge has emerged as the most strategically significant resource a firm possesses (Grant, 1996). Knowledge management is used to explore a firm’s potential to establish a competitive advantage in a dynamic market setting, including the role of external firm networks under conditions of unstable linkages between knowledge input and product output. As a strategy, knowledge management has become a major driver of organizational change and wealth creation. Apart from being a concept that managers can use to convert theory into practice, it can enable individuals to “know what they know.” Knowledge management aims to empower individuals and organizations to deal with real-life problems and the practical issues they face day-to-day (McLean, 2009).

Internal: Process Improvement. In terms of its impact on internal processes, knowledge management can shorten the proposal time for client engagement, save time, improve , increase staff participation, enhance communication, make the opinions of plant staff more visible, reduce problem-solving time, enable the firm to better serve clients, and foster accountability (Alavi & Leidner, 1999). These process improvements can be thought of as either relating to improvements in communication or as gains in efficiency. Process improvements lead to cost reductions for specific activities, increased , reductions in the workforce, higher profitability, lower levels, consistent proposal terms for clients worldwide, and better marketing results.

Another interesting conceptualization of knowledge management builds on a communication-based view, which uses concepts such as the source, channel, or recipient of knowledge to understand how individuals learn and behave in organizations (Sussman & Siegal, 2003). Knowledge management has also been studied using an interdisciplinary business model (Malhotra, 2000) as well as a range of strategies and practices (systems, tools, and frameworks) (Rubenstein-Montano et al., 2001) that deal with all aspects of knowledge within the context of organizations. These conceptualizations include identifying and leveraging the collective knowledge in an organization to help the organization compete, and to increase innovation and responsiveness. The definitions of knowledge management summarized in Table 2-5 are drawn from several sources and vary according to the conceptualizations described above.

Table 2-5. Definitions of knowledge management.

Definition | Source
The development of organizational capacity and processes to capture, preserve, share, and integrate data, information, and knowledge to support organizational goals, learning, and adaptation. | (Minner, 2001)
The distribution, access, and retrieval of unstructured information about “human experiences” between interdependent individuals or among members of a workgroup. Involves identifying a group of people who have a need to share knowledge, developing the technological support that enables knowledge sharing, and creating a process for transferring and disseminating knowledge. | (Mayer & Sims, 1994)
An organization’s structures, systems, and practices designed to facilitate embedding, creating, organizing, and disseminating knowledge with the goal of enhancing the organization’s competitiveness. | (Gold, Malhotra, & Segars, 2001)
Getting the right information to the right people at the right time and helping people to create and share knowledge and act on information in ways that will measurably improve the performance of the organization. | (Olla & Holm, 2006)

2.2.2 Knowledge Management Strategies and Technologies

In the previous section, I considered the importance of knowledge management and compared definitions of this term. I now move to a discussion of strategies and technologies applicable to managing knowledge. Knowledge Management is regarded as a series of processes involving various activities designed to achieve organizational goals. At a minimum, a process-oriented view considers three basic processes (creating knowledge, storing and retrieving knowledge, and sharing knowledge), which are widely accepted in both academia and industry (Alavi & Leidner, 2001; Kucza, 2001). Broadly speaking, each process includes identifying the current state, determining needs, and proposing improvements to the process in order to address those needs (Alavi & Leidner, 2001; Kucza, 2001).

Although they use different terms, these definitions of knowledge management share key similarities. That is, all the definitions of knowledge management consist of two tracks: the “human track” and the “technology track” (Wilson, 2002). Knowledge management is realized through a series of practices relating to knowledge supported by information technologies; these practices are carried out by people within the context of organizations and are subject to organizational goals. This leads us to suggest two related yet conceptually distinct domains through which to explore knowledge management: the Organizational Strategy View and the Information Technology View. The relationship between organizational strategy requirements, knowledge management processes, and supporting information technologies is shown in Figure 2-4, along with the ways each perspective focuses on and supports these relationships. The knowledge classification outline, the research perspectives of organizational strategy and information technology, and the knowledge management processes presented here are all supported by a considerable body of research.

[Figure 2-4 (diagram, labels from top to bottom): Organizational Strategies; “integrate with…”; KM Processes (Knowledge Creation, Knowledge Storage, Knowledge Transfer); “lend support to…”; Information Technologies.]

Figure 2-4. Understanding knowledge management processes from organizational strategy and information technology perspectives.

The extent to which an organization is vulnerable to extra-organizational influence is determined by the extent to which the organization depends on certain types of resource exchange in order to operate (Pfeffer & Salancik, 2003). The ability to allocate or use a resource is a major source of power for an organization, and it grows in importance when the resource is scarce. One form of control over a resource is possession, which is a form of control that is applicable to knowledge. An individual possesses knowledge in a direct and absolute way, and he or she alone decides whether and with whom this knowledge is shared. The basis for the power of professionals such as doctors, lawyers, and engineers with respect to their clients lies in their access to knowledge and information as much as in their ability to interpret it. Ownership or ownership rights constitute another way to possess a resource and, therefore, control it. The Organizational Strategy View of Knowledge Management considers the synergy between practices relating to knowledge and the human resources necessary to achieve organizational goals. From this perspective, knowledge is treated as a resource within the organization and knowledge management features as part of the organizational strategy through which the organization achieves its goals. Over the past few decades, knowledge management programs have focused on organizing employees into communities of practice and creating repositories for “best,” or proven, practices. Three main aspects must be taken into account in these projects (Alavi & Leidner, 2001; Kucza, 2001): management of the general conditions in an organization; assistance for the direct, inter-human knowledge management process; and management of the generation, distribution, access, and use of knowledge as coded into artifacts.

On the other hand, the Information Technology View treats the needs of knowledge management as dictated by the requirements of organizational strategy and then supported by improved technologies. Although knowledge is a cognitive idea, a physical environment is required to “load” knowledge for management. According to Drucker, knowledge by itself produces nothing; it only benefits organizations when it is integrated into a task (Drucker, 1992). Most knowledge management projects have one of three aims (Davenport & Prusak, 1998): (1) to make knowledge visible and show the role of knowledge in an organization, mainly through maps, yellow pages, and hypertext tools; (2) to develop a knowledge-intensive culture by encouraging and aggregating behaviors such as sharing knowledge and proactively seeking and disseminating it; (3) to build a knowledge infrastructure, understood not simply as a technical system but as a web of connections between people who are given the space, time, tools, and encouragement to interact and collaborate. With these goals in mind, the practice of Knowledge Management occurs in a particular context known as knowledge management systems. Building on this argument, organizations must weigh the balance between automating knowledge management and relying on people to share knowledge through more traditional means (Hansen, Nohria, & Tierney, 1999). Furthermore, codification through information systems opens the possibility of large-scale reuse, with the primary goal of facilitating the exchange of knowledge (Nonaka, 2005).

Several technologies can be used to facilitate knowledge management, and many tools and techniques are available, including those for connectivity, document management, concept management, and project management, as well as employee portals and so-called knowledge management tools (Huang et al., 2006; Trappey, Hsu, Trappey, & Lin, 2006; Wei, Yang, & Lin, 2008). For example, community computing, community networking, collaborative work, digital meeting places, electronic education, social informatics, spatial information processing, and virtual communities are just some of the standard ways in which knowledge is managed in organizations (Ahn, Lee, Cho, & Park, 2005; Christopher & Gaudenzi, 2009; Coakes, Coakes, & Rosenberg, 2008; Henry, McCray, Purvis, & Roberts, 2007; Phillips & Wright, 2009). One point worth mentioning here is that although technology often develops far beyond its original application, this does not necessarily mean that the “classic” knowledge engineering tools and techniques described guarantee good knowledge management practices (Singer & Hurley, 2005). Knowledge management is often facilitated by information technology, but technology itself is not knowledge management. An effort to manage knowledge can only be successful if the ways in which knowledge is generated and used can be reapplied.

Next, ways to manage different types of knowledge (as discussed above) will be introduced, both from an organizational strategy perspective and from an information technology perspective. The following three subsections discuss knowledge management approaches corresponding to the three perspectives of knowledge classification: Action Knowledge, Explicit vs. Tacit Knowledge, and Individual vs. Organizational Knowledge. In the discussion of action knowledge, the concept of “Chunking” is introduced as it relates to research in cognitive psychology. In the discussion of explicit and tacit knowledge (see Section 2.4), the SECI model and two major approaches of knowledge management (interactive and integrative) are discussed. In reference to managing individual and organizational knowledge, I focus on the issue of converting individual knowledge into organizational assets.

2.2.3 Managing (Procedural) Action Knowledge

Corresponding to the action view of knowledge, this sub-section focuses on the management of action knowledge as reported in research in the fields of cognitive psychology and artificial intelligence. A case for the necessity of knowledge chunking is presented followed by a description of the chunking criteria of action knowledge and the rationale for doing so.

Research in the field of cognitive psychology reveals that human action knowledge is hierarchically structured and organized as a cognitive model in an individual’s working memory (Turner & Engle, 1989). As information processors, people make sense of their environment and interact with it using cognitive models (Barr, Stimpert, & Huff, 1992). That is, people rely on models that direct action by selectively limiting information. However, the ability to apply a model is limited by the (physiological) capacity of the working memory (Wouters, Paas, & van Merrienboer, 2008), i.e., the brain system that provides temporary storage and manipulation of the information necessary to perform complex tasks (Baddeley, 1992). Furthermore, the capacity of an individual’s working memory is relatively fixed. The factors that determine the duration and scope of the working memory include individual differences (Waters & Caplan, 1996) and the complexity of the information to be processed (Clark & Estes, 1996; Waters & Caplan, 1996). In recent experiments, researchers have found that in the serial recall of word lists, the working memory was limited not by the length of the task information but by the number of meaningful discrete units (Chen & Cowan, 2009; Clark & Estes, 1996; Logan, 2004). Separate task elements can be aggregated to form one specialized element as a single unit (a group of related tasks) in cognitive structures, and schemas can be drawn from the working memory to perform this process of aggregation (Paas, Renkl, & Sweller, 2003). The evidence from previous research shows that well-structured and well-organized content can reduce “cognitive load” and make more effective use of the limited capacity of working memory (Clark & Estes, 1996; Koch, Philipp, & Gade, 2006; Waters & Caplan, 1996).

Various strategies for task analysis have been adopted to identify the architecture of action knowledge using cognitive models and to systematically organize action knowledge (Schraagen, Chipman, & Shalin, 2000). A set of methods and techniques has also been used to specify the cognitive structures and processes used to “chunk” the action knowledge associated with performing a given task (Cooke, 1992). These strategies can be applied directly when the exact sequence of steps in a user action or user procedure is specified in a stated situation. Through such strategies, tasks are recursively decomposed into sub-tasks according to criteria that preserve the consistency and completeness of the delivered action knowledge and that stay within what people can remember, i.e., do not go beyond the capacity of the working memory (Clark & Estes, 1996). Early task analysis strategies include the GOMS model (Goals, Operators, Methods, and Selections) (Card, Moran, & Newell, 1983) and the PARI method (Precursor or Reason for Action, Action, Result, and Interpretation of Result) (Endsley & Rodgers, 1994; Hall, Gott, & Pokorny, 1995). Both of these strategies emphasize the individual as the instigator of action, and both treat the task performance process as a closed system in which neither the task environment nor the transfer of information is considered. However, research has gone beyond a consideration of the mental demands involved in performing a task to take into account such matters as information input and stimuli from the environment. In addition, researchers have expanded the perceived boundaries of the task performance process by considering the interaction between individuals and the task environment (Caldwell & Garrett, 2007; Holsanova, Holmberg, & Holmqvist, 2009; Wickens & Hollands, 2000; Wouters et al., 2008).
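The recursive decomposition described above can be illustrated with a minimal sketch. It is not one of the cited task analysis methods (GOMS or PARI); it only shows the general idea of splitting an ordered procedure into nested groups so that no level presents more units than an assumed working memory limit. The limit of four units, the `Task` structure, and the function names are assumptions introduced for illustration, and the equal-size grouping stands in for the semantic chunking criteria that a real analysis would use.

```python
from dataclasses import dataclass, field

# Assumed, illustrative limit on units per level; not a value from the text.
MAX_UNITS = 4


@dataclass
class Task:
    """A node in the task hierarchy; a node without subtasks is a single step."""
    name: str
    subtasks: list["Task"] = field(default_factory=list)


def chunk(steps: list[str], name: str = "procedure", max_units: int = MAX_UNITS) -> Task:
    """Recursively group ordered steps so that no node has more than max_units children."""
    if max_units < 2:
        raise ValueError("max_units must be at least 2")
    if len(steps) <= max_units:
        return Task(name, [Task(step) for step in steps])
    # Split into at most max_units contiguous groups, preserving step order.
    size = -(-len(steps) // max_units)  # ceiling division
    groups = [steps[i:i + size] for i in range(0, len(steps), size)]
    return Task(name, [chunk(g, f"{name}.{k + 1}", max_units) for k, g in enumerate(groups)])


def leaves(task: Task) -> list[str]:
    """Return the steps in order, to check that decomposition loses nothing."""
    if not task.subtasks:
        return [task.name]
    return [step for sub in task.subtasks for step in leaves(sub)]


def max_fanout(task: Task) -> int:
    """Return the largest number of units presented at any single level."""
    return max([len(task.subtasks)] + [max_fanout(sub) for sub in task.subtasks])
```

Calling `chunk` on a ten-step procedure yields a two-level hierarchy; `leaves` confirms that the original sequence is preserved (completeness and consistency), and `max_fanout` confirms that no level exceeds the assumed limit.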

As with knowledge management in general (see Figure 2-4), the management of action knowledge is also supported by information technologies. Technology is applied to make generatively modeled action knowledge accessible, and declarative semantics are used to represent action knowledge (Georgeff & Lansky, 1986). JavaScript, an easy-to-use scripting language, has been used to implement such a modeling framework (Schinko, Strobl, Ullrich, & Fellner, 2010). The use of heuristics is another effective approach to accumulating action knowledge (He, Purao, Becker, & Strobhar, 2011; Holcomb, Ireland, Holmes, & Hitt, 2009). Because the formalisms used to represent action knowledge also serve as specification languages that can be executed by machines, formalized action knowledge is suitable for constructing complex systems. The represented action knowledge enables those complex systems to perform specific tasks to achieve given goals. Systems of this nature are used in numerous ways, including to control autonomous robots, to diagnose faults on NASA’s space shuttle (Georgeff & Lansky, 1986), and to enable the first-person perspective computer game Quake (Konik & Laird, 2002).

2.2.4 Managing Explicit Knowledge and Tacit Knowledge

In this sub-section, I discuss strategies and technologies to manage explicit knowledge and tacit knowledge. The discussion starts with a consideration of two approaches to managing tacit knowledge. The purpose of the first approach is to transfer and disseminate tacit knowledge, whereas the purpose of the second approach is to convert tacit knowledge into an explicit form. The discussion finishes with a consideration of ways to manage explicit knowledge.

Two key approaches to effectively managing tacit knowledge are mentioned in the literature: integrative and interactive (Zack, 1999). The interactive approach supports interactions between people who have tacit knowledge and focuses on the interpersonal process whereby tacit knowledge is shared. One case study of the interactive approach focuses on British Petroleum’s (BP) virtual teamwork program (Cohen & Prusak, 1996). In order to create a reliable pool of collective knowledge, BP organized its regional operating centers into small virtual teams. Each team consists of a small number of operators who work together to perform their daily work. Operators need to communicate with each other, thereby sharing the information and knowledge necessary to perform tasks and achieve goals. The case illustrates the effectiveness of an interactive approach to tacit knowledge management by bringing experts and the situations that require their expertise together. Virtual teamwork technology can overcome the constraints of organizational size and geographic dispersion as they relate to tacit knowledge sharing.

However, loss of experts is still a threat to the interactive tacit-to-tacit knowledge conversion process. A partial solution is to transfer as much tacit knowledge as possible to another person through mentoring or apprenticeship, so that important tacit knowledge is not wholly concentrated in one person. To address this problem, researchers have begun to structure patterns of tacit knowledge and to capture them in knowledge repositories (Ward, 2000). The process of converting tacit knowledge to explicit knowledge is called the integrative approach (Zack, 1999).

Tacit knowledge is more easily exchanged between, distributed among, and combined in communities of practice when it is made explicit (Nonaka et al., 1994). However, appropriately “explicating” tacit knowledge (especially from the originating community) for the purpose of efficiently and meaningfully sharing and reapplying it is one of the least understood aspects of Knowledge Management. First, information may be lost in the conversion (Zack, 1999). Second, articulating or codifying particular types of knowledge may not be culturally legitimate outside of the firm, that is, challenging what the firm knows may not be socially or politically correct in the ecology of the firm (Argyris & Schon, 1978), or the organization may be unable to see beyond its customary habits and practices (Gersick, 1990). Further, making private tacit knowledge public and accessible may result in a redistribution of power, something likely to be strongly resisted in at least some organizational cultures. Lastly, tacit knowledge may remain unarticulated because of intellectual constraints in cases where organizations have no formal language or model through which to articulate it. However, the processes of codifying, sharing, and reusing tacit knowledge can be facilitated through strategic support. Cooperation, trust, and contributions occur when people are appropriately recognized and rewarded for sharing their special tacit knowledge with others in the organization. For example, incentives to stimulate knowledge sharing are used in performance reviews at Ernst & Young (Hansen et al., 1999). The invention of multimedia computing and the hypertext capabilities of intranets have created the possibility of effectively capturing at least some meaningful fraction of an expert’s knowledge, thus making the tacit explicit (Schar & Krueger, 2000).

Unlike tacit knowledge, explicit knowledge can be effectively codified and stored in a hierarchy of databases. It can also be accessed with reliable, high-quality, fast information retrieval systems (Smith, 2001). In order to facilitate these processes, many organizations have adopted an information technology called Ontology, which is capable of managing explicit knowledge in virtually all possible environments (Apostolou, Mentzas, Klein, Abecker, & Maass, 2008; Ju, 2006; Zouaq & Nkambou, 2009). More specifically, Ontology can be used to develop a set of classified entities so that all types of knowledge are included in its classifications, as well as the types of relations through which the entities are connected. For instance, enterprise ontology represents major enterprise concepts and the relationships between them. Networked and sophisticated knowledge contained in the ontology can be provided to participants engaged in activities at the unit level instead of at the individual level (Han & Park, 2009). Once codified, explicit knowledge can be reused to solve similar problems and connect people with related reusable knowledge.
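To make the idea of classified entities connected by typed relations concrete, the following minimal sketch represents a small ontology in plain Python. It does not follow any of the cited ontology systems or any ontology standard; the `Ontology` class, its method names, and the example entities and relation names used with it are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class Ontology:
    """A toy ontology: entities assigned to classes, linked by named relations."""

    def __init__(self):
        self.entity_class = {}              # entity name -> class name
        self.relations = defaultdict(set)   # (subject, relation) -> set of objects

    def add_entity(self, name, cls):
        """Register an entity under a class (its classification)."""
        self.entity_class[name] = cls

    def relate(self, subject, relation, obj):
        """Connect two known entities with a named relation."""
        for entity in (subject, obj):
            if entity not in self.entity_class:
                raise KeyError(f"unknown entity: {entity}")
        self.relations[(subject, relation)].add(obj)

    def related(self, subject, relation):
        """Return the entities reachable from subject via relation, sorted by name."""
        return sorted(self.relations.get((subject, relation), set()))

    def entities_of(self, cls):
        """Return all entities classified under cls, sorted by name."""
        return sorted(name for name, c in self.entity_class.items() if c == cls)
```

For example, a hypothetical entity "Unit startup procedure" of class "Procedure" could be linked by an "applies_to" relation to a "Distillation unit" of class "Unit", after which `related("Unit startup procedure", "applies_to")` retrieves the connected entity for reuse.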

2.2.5 Managing Individual Knowledge and Organizational Knowledge

Beyond managing action knowledge and tacit/explicit knowledge, the third aspect of knowledge management relevant to our discussion is the tension between individual knowledge and organizational knowledge. In the context of the organization, knowledge is often considered to be a resource or an asset. Thus, it must be “managed” just like any other resource or asset would be. By managing knowledge in a systematic form, organizations strive to understand what they know and to maximize the use of their members’ knowledge (Montoni et al., 2004). However, it becomes critical for management to find some kind of commonality between individual knowledge and organizational knowledge and to improve the contents of the organizational knowledge base, as only some aspects of knowledge are internalized by the organization, whereas the rest remains with employees (Bhatt, 2002).

Managing individual and organizational knowledge is an important task that brings multiple difficulties with it. The implementation of knowledge management mechanisms to convert individual knowledge into organizational knowledge is important to guaranteeing business success in a dynamic global economy (Montoni et al., 2004). People learn and develop individual knowledge within organizations, and this is valuable intellectual capital. However, it is difficult for people to effectively share their knowledge to establish organizational learning cycles and to turn this intellectual capital into profit for the organization. In addition, although it can be helpful for organizations to provide various resources and forms of knowledge to individuals, this can have the drawback of resulting in information overload³. People need to identify and learn to look for information that is relevant to their work. The search for relevant data can be very time-consuming, such that it can be an expensive process for the organization. Therefore, prior studies have mainly endeavored to answer the following three research questions: (1) What is the gap between individual and organizational knowledge? (2) What strategies can be adopted to manage individual and organizational knowledge? (3) What technologies can be implemented to manage individual and organizational knowledge?

From the individual knowledge perspective, knowledge is a person’s capacity to carry out a particular task. It is composed of codified knowledge and tacit knowledge, which encompasses experience, skills, and attitude. A problem specific to knowledge management occurs when the tacit knowledge “packaged” in individuals has not been transferred. Individual knowledge is considered to be a key resource (as described above), as it is the most important resource associated with an individual that does not actually belong to the organization. To determine the gap between individual knowledge and organizational knowledge, van Daal proposed a tool known as the knowledge matrix (van Daal et al., 1998). The knowledge matrix questionnaire consists of questions designed to help organizations identify, develop, share, apply, and evaluate the steps in the operational process as it pertains to the knowledge value chain. Knowledge workers from different functional areas are required to evaluate each step. The authors used this tool in an organization to demonstrate its effectiveness.

3 Koenig, http://www.systemdynamics.org/conferences/1998/PROCEED/00037.PDF

Individual expertise in an organization is an asset. However, if management does not nurture individual expertise carefully, individual self-expression becomes an organizational liability. Therefore, once the gap between individual knowledge and organizational knowledge has been identified, it is important to share individual knowledge among organization members and to transform it into an organizational asset. Studies show that an organization’s ability to create knowledge depends on two factors: (1) the level of information pooling through which project-related information, explicit knowledge, and proprietary information is made readily available across the organization and (2) the extent of the individual interactions that indicate regular, collocated practices within an organization during joint knowledge-creating efforts (Berente, Baxter, & Lyytinen, 2010; Bhatt, 2002; Nonaka et al., 1994).

Information technologies are adapted to support information pooling and human interaction in an organization. Research has investigated virtual team performance during project design and the impact of virtualized design processes in complex industrial domains (Berente et al., 2010). An intranet-based knowledge management system has also been proposed to support, strategically align, and transfer knowledge resources (Braganza, Hackney, & Tanudjojo, 2009). If these tools were to be developed, individual knowledge could be captured and preserved for the organization. At the end of the knowledge acquisition process, knowledge would be stored in a community of practice repository accessible through the web-based system. Such systems would allow members to access and use knowledge items stored in the community of practice repository.

Based on further evaluation, the stored knowledge could be used to filter the intermediary base on which knowledge is shared and transferred and to identify ways in which the knowledge is relevant to the organization (Montoni et al., 2004). Filtered knowledge can effectively enhance individual learning processes when people are given access to it. Through the reuse of knowledge, the way an individual performs a task or series of tasks can be streamlined such that organizational performance benefits.

So far, I have considered knowledge management in regard to the three knowledge classification approaches: procedural (action) vs. declarative knowledge, explicit vs. tacit knowledge, and individual vs. organizational knowledge. Based on this discussion, the present study pursues a research direction suggested by the literature review. First, people can chunk action knowledge into meaningful units to make it easier to memorize the information and perform tasks related to it. In the petrochemical industry, procedures that rely on action knowledge are created as a combination of several units to facilitate operation⁴. However, possible rationales for procedure chunking remain unclear and differ across refineries and plants. Second, the tacit-to-tacit knowledge conversion process is not the only way to manage tacit knowledge; innovative information technologies make the tacit-to-explicit knowledge conversion process more efficient and effective. These strategies and technologies could be applied in the petrochemical industry to capture the individual’s expertise. Third, capturing individual knowledge for the organization and providing accurate organizational knowledge to individuals are critical for managing individual and organizational knowledge. These two issues are also critical in the petrochemical industry, as operators’ behaviors and experience are hard to capture and articulate, whereas operators also find that it is difficult to quickly locate and target correct procedures in a database that stores information about a large number of procedures. These findings, as reported in the literature, remain accurate in regard to the current condition of the petrochemical industry and are considered in the present study.

4 Examples of chunked procedures can be found at: http://www.des.umd.edu/ls/sop/chemicalSOP.pdf; and http://www.docstoc.com/docs/26590112/STANDARD-OPERATING-PROCEDURE-Wearing-of-Protective-Clothing

2.3 Conclusion

In this literature review, I reviewed research related to the two main concepts of knowledge and knowledge management. The first section focused on the concept of knowledge, specifically the definition of knowledge, the relationships between data, information, knowledge, and wisdom, and various approaches to classifying knowledge. Three knowledge classifications were considered in some depth: procedural (action) vs. declarative knowledge, explicit vs. tacit knowledge, and individual vs. organizational knowledge. I explored the ways in which these three knowledge classifications are important for the specific research question and how they connect to the research domain. Based on that exploration, I refined the research agenda of this study in order to focus on the individual and organizational level of action knowledge in both explicit and tacit forms. These classifications of knowledge are also relevant to the research scope of this study, which will be considered in the next chapter.

The second section focused on knowledge management. I defined the concept of knowledge management before exploring the importance of knowledge management in the organizational context. Then, I discussed two major knowledge management solutions, i.e., organizational strategy and information technology, followed by a consideration of knowledge management corresponding to the three perspectives of knowledge classification as noted in the first section. The first sub-section discussed managing action (procedural) knowledge, suggesting knowledge chunking as a strategy. The second sub-section discussed managing explicit and tacit knowledge, including strategies to facilitate knowledge conversion between explicit and tacit forms. The last sub-section addressed issues pertaining to managing individual and organizational knowledge, including the gap between individual and organizational knowledge, strategies for transforming individual knowledge into an organizational asset, and technologies to support this process.

The literature review provides a basis for understanding the research question from a knowledge management perspective. I refined the research question to read as follows: “How should explicit and tacit action knowledge at the individual and organizational levels be managed and leveraged in the petrochemical industry?” Based on the background provided by the literature review, the overall research question is broken down further according to two directions: (1) managing explicit action knowledge in procedures at the organizational level in the petrochemical industry and (2) managing tacit action knowledge at the individual level that operators draw on in performing their daily tasks in the petrochemical industry. Sub-questions are generated for these research directions and discussed in the next chapter.

Chapter 3

Research Questions

This research investigates how to leverage and manage operational knowledge in the petrochemical industry. I focus on operational knowledge that is codified in procedures as well as stored within operators’ minds. As noted in the previous chapter, I adopt three key conceptualizations in this research: the action view of knowledge (Bera & Wand, 2009), the view that distinguishes between explicit and tacit knowledge (Nonaka & Konno, 1999), and the view that distinguishes between knowledge at the individual level and at the organizational level (Barney, 1986; Bhatt, 2002). Following these three perspectives, I select the research scope, refine the research questions and sub-questions, and investigate operational knowledge in the petrochemical industry in three essays, each of which addresses one of the sub-questions.

3.1 Key Conceptualizations

The three key conceptualizations I use for this research refer to the nature of knowledge: the object of knowledge, the format of knowledge, and the owner of knowledge. Connecting to the prior literature review chapter, these aspects relate to conceptualizations of action knowledge, explicit vs. tacit knowledge and individual vs. organizational knowledge, respectively.

The first conceptualization deals with the object of knowledge. For example, prior work describes knowledge as the theoretical or practical understanding of a phenomenon of interest (Shulman, 1987), which answers the questions of “what” and “how” (Clark & Estes, 1996). Knowledge as justified true belief answers the “what” question. On the other hand, knowledge to support effective action answers the “how” question. The object of knowledge in this case is the action, task, or procedure. Described in more recent research as action knowledge (Bera & Wand, 2009), it refers to knowledge about actions, tasks, and procedures. This conceptualization maps directly onto operational knowledge in the petrochemical industry. It answers the question of “how to perform tasks,” and is considered action knowledge in this research.

The second deals with the format of knowledge, that is, how this action knowledge is stored and expressed. For example, over the last few decades, the petrochemical industry has spent significant effort and resources on establishing standard operating procedures that include instructions for the performance of specific tasks. Operators are required to pay attention to these procedures. On the other hand, operators bring considerable judgment to bear on deciding exactly how to perform the procedures. For example, some of the instructions can be carried out either sequentially or concurrently. This judgment or work practice is not captured by the codified procedures. It reflects the kind of tacit knowledge that is often referred to as experience. The two forms of operational knowledge, one captured in procedures and the other possessed by operators, correspond to Nonaka’s classification of explicit knowledge and tacit knowledge, respectively (Nonaka & Konno, 1999).

The third focuses on the owner of knowledge, that is, the person or entity that possesses this action knowledge. In the petrochemical industry, knowledge is possessed at different levels. Procedures are managed by the organization, i.e., the petrochemical plant or refinery. There are formal processes defined by the organization to create, store, revise, and abolish procedures. At the same time, operators build their own experience and expertise at the individual level. Experience and expertise can be built during the procedure-learning process, the task-performing process, or both. Further, there are no restrictions, other than their own interest or lack thereof, on the operators in regard to updating and sharing their experience during daily operations and training processes. The operators can also determine the specific experience from their own knowledge pool to apply to performing their work. This division of knowledge ownership between petrochemical refineries and operators matches the classification of knowledge as organizational or individual (Barney, 1986; Bhatt, 2002). Action knowledge at these two levels exists concomitantly in the petrochemical industry, and the two levels together support operators in performing their tasks.

3.2 Research Scope

The key conceptualizations of this research are discussed above: action knowledge, explicit and tacit knowledge, and individual and organizational knowledge. In Chapter 2, Figure 2-3 shows the three classifications working together to address the research domain, the petrochemical industry. In terms of the action knowledge in this domain, there are four forms, obtained by crossing explicit vs. tacit with individual vs. organizational. The petrochemical industry offers examples of each form of action knowledge. For instance, standard operating procedures, rules, and training manuals are explicit forms of action knowledge at the organizational level, whereas personal memos written down during operation and articulations made while training newcomers are explicit forms of action knowledge at the individual level. Further, operational expertise possessed by individuals that is not written down or articulated constitutes a tacit form of action knowledge at the individual level, whereas collaboration among teams and shared organizational memories regarding how to perform tasks are tacit forms of action knowledge at the organizational level.

In a typical petrochemical plant or refinery, there are hundreds of procedures, each consisting of multiple instructions. Operators refer to these procedures when performing tasks. On the one hand, operators find it difficult to identify the appropriate procedure to follow from hundreds or even thousands of procedures when performing tasks, and it is equally difficult for them to quickly identify the target instructions within those procedures. Furthermore, the organization cannot manage these procedures as assets. On the other hand, operators take actions to perform their daily work that include the steps specified in the procedures but that go beyond them to include expressions of judgment as well as work practices. Based on their experience, the operators accumulate expertise, which significantly influences their work beyond the steps specified in the standard operating procedures. I recognize two long-standing problems that are germane to this case. First, operators find it hard to articulate and transfer expertise (Witten & MacDonald, 1988). Second, it is difficult to understand the specific components of what constitutes this expertise. Therefore, in this research, I focus on standard operating procedures and the operational expertise possessed by operators to investigate action knowledge management.

Figure 3-1 captures the research scope for this study using the three key conceptualizations outlined above. Considering the two major examples of the focal action knowledge, i.e., standard operating procedures and operator expertise, the knowledge referred to in the shaded cells is the research object of this study.

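[Figure 3-1: a two-by-two matrix of action knowledge. Rows: Organizational, Individual. Columns: Explicit, Tacit. Organizational/Explicit example: procedures. Organizational/Tacit example: team work practices. Individual/Explicit example: personal memos. Individual/Tacit example: operator expertise.]

Figure 3-1. Research scope.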

In addition to the key conceptualizations, the range of the research domain and the period during which the research is conducted are also considered when determining the scope of a research study. Managing procedures and operational expertise is a critical issue not only for any given petrochemical plant or refinery, but also for all the plants and refineries throughout the industry. Therefore, procedures and operators from multiple plants and refineries are considered in the present study. For each petrochemical plant or refinery, action knowledge in different formats is continually transferred from one to another, and information technologies and management strategies have been developed to support these transfers. However, this research is not a longitudinal study. Instead, the focus is the condition of the petrochemical industry at the present point in time, chosen because this study draws on all past efforts in regard to action knowledge management in the petrochemical industry and seeks new, improved approaches in this area.

3.3 Research Questions

The research question of this study is: How should action knowledge be leveraged and managed in process industries? I chose the petrochemical industry as the research domain because it is representative of process industries. The research scope suggests three sub-questions derived from the original research question (see Table 3-1). The dissertation is structured as three essays to address these three sub-questions.

Table 3-1. Research sub-questions.

Overall Research Question: How should explicit and tacit action knowledge (at both the individual and the organizational levels) be managed and leveraged in process industries?
Setting: Petrochemical industry
Level of Analysis: Individuals and organizations
Sub-Questions:
1. How can explicit action knowledge embedded in procedures be extracted and chunked such that knowledge of this nature can be managed as an organizational asset?
2. Is the chunking process designed for explicit action knowledge effective?
3. What strategies do operators use to apply action knowledge (both tacit and explicit) as they engage in the performance of tasks?

Three essays are constructed to investigate the three sub-questions. Essay 1 deals with explicit action knowledge at the organizational level, which is codified and expressed as procedures. To address the question “How can explicit action knowledge embedded in procedures be chunked in order to manage this knowledge as an organizational asset?”, I follow a design science strategy. I propose a heuristic-based approach that extracts and chunks action knowledge from procedures. In this essay, I explain why procedure chunking is a suitable approach to managing and leveraging explicit action knowledge, develop heuristics with learning mechanisms to extract and chunk action knowledge, and implement them as a software tool for evaluation. Together, the heuristics and the learning mechanisms provide a semi-automated approach to procedure chunking.

Essay 2 evaluates the approach proposed in Essay 1 via the tool that implements the approach. The input to the evaluation is a set of authentic procedures contributed by multiple refineries. The evaluation allows characterizing the contribution from each phase of the methodology. I rely on assessment of expert operators for the accuracy of the final extracted action knowledge and feasibility of the chunks produced by the final procedure.

Essay 3 deals with both explicit and tacit action knowledge at the individual level. I adopt a modified version of the grounded theory approach for the research reported in this essay. I use the critical incident technique (Flanagan, 1954) to elicit the strategies adopted by operators to use action knowledge, and how these strategies have been influenced by operator experience. Based on these data, a framework of strategies to use action knowledge is generated. This essay also allows an exploration of the two forms of action knowledge (tacit and explicit) in the context of the petrochemical industry. Figure 3-2 outlines how these essays are related. Details of the three essays appear in the following chapters.

[Figure 3-2: diagram relating the three essays to the overall research question.
Essay 1: Design an approach to extract and chunk action knowledge from operator procedures. Goal: Make explicit knowledge more accessible.
Essay 2: Evaluate the accuracy and feasibility of the designed approach. Goal: Make sure the tool can achieve the design purpose.
Essay 3: Determine what strategies are used by operators to apply action knowledge when performing tasks. Goal: Understand strategies to apply action knowledge.
Research Question: How should operational knowledge be leveraged and managed in the petrochemical industry?]

Figure 3-2. Three essays in this research study.

Chapter 4

Research Methodology

I have outlined my research questions and research objectives in Chapter 3. The purpose of this chapter is to briefly introduce the research methodologies adopted for these research questions. Research methodology is concerned with the way in which research is conducted to answer the research questions. It covers multiple issues, including philosophical perspectives (important assumptions about the way to view the world), research methods (principles, procedures, and steps), and research techniques (computational and analytical). For each research essay (which focuses on one research question), I present my rationale and final choices on those issues as follows. This chapter is meant to outline only the broad contours of the research methodology for each essay, providing a rationale for the choice and positioning the work in the context of a methodological paradigm. Additional details of the methodology, along with specific techniques, are elaborated within each essay that follows, because describing those details without the context of the work undertaken in each part is difficult for both the author and the audience.

4.1 Research Methodology for Essay 1

The research question for Essay 1 is “How can explicit action knowledge embedded in procedures be extracted and chunked such that knowledge of this nature can be managed as an organizational asset?” The research perspective adopted for this essay is artificialism (Freund & Baltes, 2000). A researcher following the artificialism perspective believes that events are caused by human activity or that they are caused for human purposes (Wenger, 2001). A foundational view for this perspective is the work titled ‘Sciences of the Artificial’ by Simon (Simon, 1996). Based on this perspective, making the artificial means building something on the basis of a refined model of the exemplar and of its essential performances (Negrotti, 2001). To address the research question in Essay 1, I generate a new procedure chunking method and a tool as the artifacts to manage the explicit action knowledge. The new procedure chunking method is created and refined from existing design science techniques, and it expresses explicit action knowledge in a different way from the original procedures.


The research method for Essay 1 is, therefore, design science (Hevner et al., 2004). Design refers to a goal-oriented process whose goal is solving problems, meeting needs, improving situations, or creating something new or useful (Friedman, 2003). Design science research involves both the design of novel artifacts and the analysis of the performance of such artifacts for the purpose of improvement (Vaishnavi & Kuechler, 2004). The outcomes of this essay represent a “method” and an “instantiation,” following March and Smith (1995), that clearly include a technological solution. The method outcome refers to a heuristic approach to performing the knowledge-chunking task, and the instantiation outcome refers to a tool that operationalizes this approach.

Techniques of design science research may come from different domains, including natural sciences, social and behavioral sciences, creative and applied arts, technology and engineering, and so on (Friedman, 2003). For my research, I develop a heuristics-based approach to analyzing procedures by examining their components from structural, syntactic, and semantic views (Newman, Pancheva, Ozawa, Neville, & Ullman, 2001), including a part-of-speech tagging process (Charniak, 1997) to identify the elements of each instruction. A supervised learning model (Nielsen, 1998) is used to improve the performance of the heuristics based on user feedback in the dynamic chunking process.
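
The element-identification step can be illustrated with a minimal Python sketch. It is not the implementation used in this research: the tiny lexicon stands in for a trained part-of-speech tagger, and the names LEXICON, tag, and extract_elements, as well as the sample instruction, are hypothetical. The sketch assumes imperative instructions, treats the first verb as the action, the nouns before the first preposition as the object, and the nouns after it as a qualifier.

```python
import re

# Hypothetical mini-lexicon standing in for a trained part-of-speech tagger.
# Words not listed here are assumed to be nouns (e.g., equipment names and tags).
LEXICON = {
    "open": "VERB", "close": "VERB", "verify": "VERB", "start": "VERB",
    "the": "DET", "a": "DET",
    "slowly": "ADV",
    "on": "PREP", "at": "PREP", "in": "PREP",
}


def tag(instruction):
    """Split an instruction into tokens and attach a coarse part-of-speech tag."""
    tokens = re.findall(r"[A-Za-z0-9-]+", instruction)
    return [(token, LEXICON.get(token.lower(), "NOUN")) for token in tokens]


def extract_elements(instruction):
    """Heuristic: first verb = action, nouns before a preposition = object,
    nouns after a preposition = qualifier (for example, the equipment or place)."""
    action = None
    slots = {"object": [], "qualifier": []}
    current_slot = None
    for token, pos in tag(instruction):
        if pos == "VERB" and action is None:
            action = token.lower()
            current_slot = "object"
        elif pos == "PREP" and action is not None:
            current_slot = "qualifier"
        elif pos == "NOUN" and current_slot is not None:
            slots[current_slot].append(token)
    return {
        "action": action,
        "object": " ".join(slots["object"]),
        "qualifier": " ".join(slots["qualifier"]),
    }


if __name__ == "__main__":
    print(extract_elements("Open the discharge valve on pump P-101 slowly."))
```

In a feedback-driven setting, corrections supplied by operators could be used to revise the tags or the slot-assignment rules; that learning step is deliberately left out of this sketch.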

4.2 Research Methodology for Essay 2

The research question for Essay 2 is “Is the chunking process designed for explicit action knowledge effective?” The research perspective adopted for this essay is positivism. Positivists generally assume that reality is objectively given and can be described by measurable properties which are independent of the observer (researcher) and his or her instruments (Myers, 1997). Information Systems research is positivist if there is evidence of formal propositions, quantifiable measures of variables, testing of hypotheses, and the drawing of inferences about a phenomenon from the sample to a stated population (Orlikowski & Baroudi, 1991). In this essay, I make a proposition that action knowledge chunks are clustered from procedure tagging results, and that operators’ feasibility assessments of these chunks reflect their effectiveness. Then I evaluate the knowledge-chunking tool by collecting quantitative data on the correctness of procedure tagging results, as well as operator feedback on the final knowledge chunks generated from these tagging results.

The research methods used in Essay 2 include multiple evaluation techniques from design science (Gediga, Hamborg, & Düntsch, 2002). During the iterative development process, formative evaluation is conducted to assess the utility and efficacy of the action knowledge chunking approach as well as its artifact, the Semantic Procedure Analyzer (SPA). The benefit of formative evaluation is that it offers constructive information for changing the system design in a direct manner. Then, at the end of development, I use a summative evaluation to compare SPA to other alternatives and to evaluate the formalized knowledge (heuristics) in SPA. This summative evaluation is conducted to demonstrate that the implemented SPA achieves the pre-defined design objectives.

The research method of formative and summative evaluation in design science research discussed above guides the choice of techniques used in Essay 2. The techniques adopted in this essay include descriptive illustration, expert-involved assessment, and comparison with other alternatives. First, I describe an example of extracting action knowledge chunks from procedures with SPA to demonstrate its use, following the technique of descriptive illustration (Cleven, Gubler, & Kai, 2009; Hevner, March, Park, & Ram, 2004). Next, I draw upon a panel of experts (Nielsen, 1994; Scholtz, 2004), along with ongoing construction, to conduct a small-scale preliminary study in order to evaluate feasibility, time, cost, adverse events, and effect size in an attempt to predict an appropriate sample size and improve upon the evaluation design. Last, I operationalize the summative evaluation efforts as a comparison of the core constructs of SPA against those in other approaches to extracting and chunking action knowledge (Manning, Raghavan, & Schütze, 2008).

4.3 Research Methodology for Essay 3

The research question for Essay 3 is “What strategies do operators use to apply action knowledge (both tacit and explicit) as they engage in the performance of tasks?” The research perspective adopted for this essay is interpretivism. Interpretivists contend that only through inter-subjective interpretation can reality be fully understood. The interpretivist perspective in IS aims at “producing an understanding of the context of the information system, and the process whereby the information system influences and is influenced by the context” (Walsham, 1993). Interpretive research does not predefine dependent and independent variables, but focuses on the full complexity of human sense making as the situation emerges (Kaplan & Maxwell, 1994). In this essay, I focus on the complex situation of operators using action knowledge in their working environment. The explicit part of action knowledge is contained in the existing procedure management system, while the tacit part is possessed by individual operators. To understand what strategies operators use to apply action knowledge, interpretivism is adopted as the research perspective of this essay.

In this essay, I attempt to extract action knowledge from operators’ work practice. Much prior work in this domain is aimed at surrounding concerns such as finding human errors (Tasca, 1989) or understanding tacit knowledge (Madsen & Mikkelsen, 2012). Because there are no existing studies to build upon directly, the research method used in Essay 3 is grounded theory development. Grounded theory development is a research method that seeks to develop theory grounded in data that are systematically gathered and analyzed (Eisenhardt, 1989; Yin, 2009). The benefit of the grounded theory approach is that the resulting theory is intimately tied to the evidence (Eisenhardt, 1989). The grounded theory approach is commonly used in IS research because the method is extremely useful in developing context-based, process-oriented descriptions and explanations of the phenomenon (Myers, 1997). To achieve the research objectives of this essay, I collect and analyze data on operators’ real task-performing cases, and create an action model to describe operators’ task-performing behaviors and their behavior trends in using action knowledge.

I use the face-to-face interview as the data collection technique for Essay 3. I use the critical incident technique during the interviews so that the collected observations are as useful as possible for solving practical problems (Flanagan, 1954). To uncover the cognitive activities through which operators select action knowledge for performing complex tasks in the work domain, I give the background of a critical incident scenario at the beginning of the interview, and ask operators to narrate their decision-making and action-performing processes in response to this critical incident.

To summarize this chapter, all research perspectives, methods, and techniques for each essay are organized in Table 4-1. The details of research implementation for each research question are introduced in the essays that follow.

Table 4-1. Research methodology for each essay.

                   Essay 1                     Essay 2                        Essay 3
Research Question  How to extract and chunk    Is the chunking approach       What strategies are used by
                   explicit action knowledge   effective                      operators to apply action
                                                                              knowledge
Perspective        Artificialism               Positivism                     Interpretivism
Method             Design science              Formative and summative        Grounded theory
                                               evaluation in design science
Technique          Heuristics, part-of-speech  Descriptive illustration for   Interview, critical incident
                   tagging, learning model     demonstration, expert panel    technique
                                               assessment, comparison with
                                               other alternatives

There is one important issue that I have to address at the end of this chapter to avoid any potential confusion or misunderstanding. Researchers seldom adopt multiple research perspectives in one study. However, there are three sub-questions in my study, which require exploration of my overall research interests from three aspects. Investigations of these three sub-questions are distinct (and can be considered independent) pieces of work. Therefore, I use the research methodology that is most appropriate for each piece of work. The specific details of each research methodology (each column in Table 4-1) are described in the essays that follow.

Chapter 5

Essay 1 – Semantic Procedure Analyzer: Extracting and Chunking Action Knowledge from Operator Procedures

Chapter Organization

In this essay, I describe a novel approach to extracting action knowledge from operator procedures and chunking it. An early version of this work was published in Conference of Design Science Research in Information Systems and Technologies (DESRIST) 2011 and nominated for the Best Student Paper award. In that early version, I proposed the heuristics for and outlined an approach to extracting action knowledge and chunking it in the context of operator procedures. However, since that time, the work has been extended, such that it is now possible to perform these heuristics using a software tool (an instantiation produced via a design science approach) (Hevner et al., 2004). The implemented tool has been presented to industry partners, including a number of refineries, involved in the current research project. Therefore, in this research essay, I present a completed design science research study, which contains both the heuristic approach and the developed tool. This essay includes a description of the research scope, the underlying justificatory knowledge, the heuristic approach itself (which includes multiple phases), and the physical instantiation of this approach, a tool I refer to as the Semantic Procedure Analyzer (SPA). For completeness and clarity, the essay is written in such a way that familiarity with the earlier version of my work is not necessary. Selected content, including the introduction and the literature review presented herein, draws on the earlier research. However, the content of these sections is tailored to the current context.

5.0 Précis

Operating procedures are widely used in process-oriented industries. I conceptualize these as containers of knowledge that guide operator action, i.e., action knowledge (see Figure 5-1). This action knowledge represents a key organizational asset that can be used to encourage adherence to best practices, to deliver real-time and relevant instructions to operators, and to provide a basis for designing training programs. Effective use, reuse, and management of the thousands of procedures that support operations, however, present significant challenges because of problems such as differences in semantics, granularity, and scale of contents in procedures. Following the design science paradigm, I develop an approach to the semantic analysis of procedures in order to identify action knowledge chunks and then extract them from the procedures. This approach employs heuristics to exploit the structure of instructions by parsing each into parts-of-speech. Intra-procedure semantics based on place, role, and temporal breaks, as well as inter-procedure commonalities based on mapping across parts-of-speech, are then used to identify and derive action knowledge chunks from the procedures. I implement the approach using a tool I refer to as the Semantic Procedure Analyzer (SPA), which includes a learning component. Using authentic procedures provided by industry partners, I describe multiple cycles of iterations to develop the approach. The essay concludes by (1) highlighting the research contributions as principles for designing systems for the semantic analysis of organizational procedures and (2) pointing to the implications of my work for practice in such areas as the management of knowledge contained in procedures within an organization or across an industry, the delivery of real-time instructions to operators, and the design of training programs.
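
As an illustration of the intra-procedure idea only, the following Python sketch starts a new chunk whenever the role or place changes between consecutive instructions, or when the preceding instruction is marked as ending with a temporal break. It is a simplified stand-in and not the SPA heuristics themselves; the field names role, place, and wait_after, the function name chunk_instructions, and the sample steps are all assumptions made for the example.

```python
def chunk_instructions(instructions):
    """Group consecutive instructions into chunks.

    A new chunk starts when the role or the place differs from the previous
    instruction, or when the previous instruction is flagged with a temporal
    break (hypothetical 'wait_after' field).
    """
    chunks = []
    current = []
    for step in instructions:
        if current:
            previous = current[-1]
            is_break = (
                step["role"] != previous["role"]
                or step["place"] != previous["place"]
                or previous.get("wait_after", False)
            )
            if is_break:
                chunks.append(current)
                current = []
        current.append(step)
    if current:
        chunks.append(current)
    return chunks


# Hypothetical instructions, already annotated with role and place.
SAMPLE_STEPS = [
    {"role": "field", "place": "unit A", "text": "Open the bypass valve"},
    {"role": "field", "place": "unit A", "text": "Check the gauge", "wait_after": True},
    {"role": "field", "place": "unit A", "text": "Close the bypass valve"},
    {"role": "console", "place": "control room", "text": "Confirm the flow rate"},
]

if __name__ == "__main__":
    for number, chunk in enumerate(chunk_instructions(SAMPLE_STEPS), start=1):
        print(number, [step["text"] for step in chunk])
```

Under these assumptions the four sample steps fall into three chunks: the first two field steps, the third field step after the temporal break, and the console step.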

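[Figure 5-1: a two-by-two matrix of action knowledge. Rows: Organizational, Individual. Columns: Explicit, Tacit. Organizational/Explicit example: procedures. Organizational/Tacit example: team work practices. Individual/Explicit example: personal memos. Individual/Tacit example: operator expertise.]

Figure 5-1. Research objective for Essay 1.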

5.1 Introduction and Motivation

Work procedures are pervasive in organizations (Stern & Barley, 1996). Scholars have conceptualized procedures as codified instructions (Woodward, 1965), as practical actions that require skills and tacit know-how improvised by employees (Suchman, 1987), and more recently, as collective know-how created and shared through informal communities of practice (Wenger, 1998). Sufficient and appropriate procedures are also considered to be very important in terms of preventing accidents (Haavisto & Remes, 2010; Nivolianitou et al., 2006). In fact, in process industries (such as the petrochemical industry), the use of procedures is directly linked to the prevention of human error, which is implicated in 70 to 80% of operating incidents (Attwood & Fennell, 2001). Through these conceptualizations, the idea of procedures as containers of knowledge persists. In particular, procedures contain action knowledge (Bera & Wand, 2009) designed to guide operator action.

Against this backdrop, a number of interesting research questions can be identified. Significant among these are questions related to the prevention of human error, to organizational routines and procedures, and to knowledge management. Of these, the first area is addressed by a rich stream of research that draws on the cognitive perspective in an effort to investigate several aspects of safety and human error (Clarke & Short, 1993; Kletz, 2001; Tasca, 1989). The second is tackled by a separate stream of research by organizational scholars who investigate organizational routines and procedures (Brodbeck, 2002). The third, with a focus on knowledge management, considers how to bridge tacit knowledge and explicit knowledge in organizational contexts. To the best of my knowledge, scholars have not addressed the focal question of the present research study, which pertains to managing knowledge codified in procedures as an organizational asset. I am primarily interested in developing an approach to the semantic analysis of procedures with the related goals of generating a structured repository of action knowledge and creating reusable chunks of action knowledge. The specific research question I address, therefore, is the following: How can explicit action knowledge embedded in procedures be extracted and chunked such that knowledge of this nature can be managed as an organizational asset?

The foundational perspectives that guide the present research include a knowledge management perspective (Alavi & Leidner, 2001; Singer & Hurley, 2005) and an action view of knowledge (Bera & Wand, 2009; Blosch, 2001). More specifically, I employ a variation of the design science approach (Hevner et al., 2004; March & Smith, 1995) in conjunction with elements of action design research (Sein, 2011) shaped by my ongoing relationship with several industry partners. Thus, the present study developed through a series of cycles in which the evolving outcomes were subjected to formative evaluation (Scriven, 1991), which in turn led to the establishment of design principles. The outcomes of research conducted in this way are described as (a) an approach (similar to the idea of a method) accompanied by (b) an instantiation (in this case, a software tool) (March & Smith, 1995). A set of heuristics contributes to the approach. These are implemented with the help of a parts-of-speech tagger (Charniak, 1997). A supervised learning component (Nielsen, 1998) ensures that the naïve operationalization of the heuristics improves. The software implementation produces well-formed XML files that capture the semantic structure of the action knowledge and the action knowledge chunks. The key contribution of this research study is, therefore, a semi-automated approach to extracting action knowledge and representing it as chunks. I suggest that these outcomes provide opportunities to more effectively manage the action knowledge embedded in procedures as an organizational asset as well as opportunities to reuse such knowledge in order to effect improvements in terms of real-time delivery to operators and in terms of training program design.
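
To make the idea of a well-formed XML representation concrete, the following Python sketch serializes one chunk with the standard library. The element and attribute names (chunk, instruction, action, object, qualifier, id, n) and the function name chunk_to_xml are illustrative assumptions; the source does not specify the schema that SPA actually produces.

```python
import xml.etree.ElementTree as ET


def chunk_to_xml(chunk_id, steps):
    """Serialize a list of instruction dictionaries as one <chunk> element.

    The schema used here is hypothetical and chosen only for illustration.
    """
    chunk = ET.Element("chunk", attrib={"id": chunk_id})
    for number, step in enumerate(steps, start=1):
        node = ET.SubElement(chunk, "instruction", attrib={"n": str(number)})
        for field in ("action", "object", "qualifier"):
            ET.SubElement(node, field).text = step.get(field, "")
    return ET.tostring(chunk, encoding="unicode")


if __name__ == "__main__":
    example = [
        {"action": "open", "object": "discharge valve", "qualifier": "pump P-101"},
        {"action": "verify", "object": "pressure"},
    ]
    print(chunk_to_xml("chunk-1", example))
```

Because the output is produced by an XML library rather than by string concatenation, it can be parsed back with any standard XML parser, which is the practical meaning of well-formedness in this sketch.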

The remainder of this essay is organized as follows. In Section 2, I briefly review the prior literature on knowledge management and, following recent work by Bera and Wand (2009), I argue for and develop a conceptualization of the knowledge in the procedures as action knowledge. Section 3 introduces the design science research method and the variations of this method that I relied on to guide my approach through numerous iterations of this research study. I present an overview of the approach to extracting and chunking action knowledge in Section 4, including the heuristics used for this purpose. I also outline the learning mechanisms aimed at improving knowledge extraction and chunking and the implementation efforts. Section 5 concludes with an outline of the present study’s contributions to the field, implications for practice and the potential applications of the approach proposed herein, and suggested directions for future work together with a discussion of limitations and caveats.

5.2 Context and Prior Research

5.2.1 Procedures in the Petrochemical Refining Industry

The operating procedures used in most process industries are the result of many years of work. This is particularly true in the petrochemical refining industry, which provides the context and domain for the present study. The petrochemical refining industry in the contiguous United States is made up of approximately 150 refineries, each with about 250 operators (see footnote 5). These operators monitor and control the work of the refinery. The image of the early years of oil exploration in Texas (often thought of as a Wild West approach) has given way by now to more systematic approaches. Instead of a seat-of-the-pants approach, there is a recognition of the need for greater consistency and safety (Savage, 2009), which in recent years has been translated into an urgent demand to codify the knowledge contained in procedures because of an impending wave of retirement in the industry (Strahan, 2005). As a result, operating procedures are now increasingly recognized as the backbone of much of the work that takes place in the petrochemical refining industry. Such procedures, which cover the oversight and the control of operator performance, capture the work required of a team of field and console operators (Jamieson, 2002). For example, these procedures describe regular operations as well as events such as starting up and shutting down that may not occur very frequently.

Each procedure, thus, represents years of wisdom accumulated by operators codified in a document as a set of instructions with extra domain information (Helfat & Raubitschek, 2000). In fact, often written by expert operators, procedures in a modern refinery are maintained and modified as necessary following a management-of-change process designed to ensure that any changes are vetted by people in multiple roles before any procedures are officially modified (Walter, Foottit, & Nelson, 2009). Overall, the procedures describe the work that must be performed in a range of situations throughout the refinery (see Figure 5-2).

Footnote 5. Number and Capacity of Petroleum Refineries: http://www.eia.doe.gov/dnav/pet/pet_pnp_cap1_a_(na)_8OO_Count_a.htm

Figure 5-2. An excerpt from a procedure at a refinery.

A refinery generally manages anywhere between one hundred and several hundred procedures, each relating to a specific piece of large equipment such as a hydrocracker. In many refineries, the procedures are available in binders that store word-processed documents, each of which details the instructions that a team of operators is expected to follow. In a few cases, the refineries maintain the procedures as documents (PDF or Word) available over an intranet, which allows a drilldown into the set of procedures based on location and equipment. The procedures range from a few pages to about 30 pages in length and contain information such as the procedure’s name, the date it was created, and its change history, together with a detailed set of instructions for operators to follow (Nivolianitou et al., 2006). Some procedures contain additional information such as goals that must be achieved and conditions that must be maintained. In some refineries, operators carry copies of some of the procedures, which include places (such as a box to tick) where operators are to indicate that they have completed certain tasks. This serves the important purpose of communicating across operators because carrying out numerous procedures requires actions from numerous operators and because some procedures, such as shutdown operations, are carried out over several days (Sahoo, 2013). Based on my observations, procedures vary significantly in terms of format and language across organizations, across locations, across refineries, and even across equipment in the same refinery. As a result, the knowledge contained in the procedures is hard to capture, extract, reuse, or examine beyond its immediate use by operators following procedures in their day-to-day work. Even intranet sites that capture all or most of a refinery’s procedures rarely venture beyond the metaphor of a collection of files.

5.2.2 Knowledge and Knowledge Management

I conceptualize procedures as containers of explicit codified knowledge (Nonaka et al., 1994). “Knowledge” is a multi-faceted concept that can be traced to the philosophical work of Aristotle and Plato (Jori, 2003), who argued that knowledge represents “justified true belief” (Chisholm, 1982). Contemporary definitions proposed by scholars have expanded this definition to include technologies for storing knowledge, resources for generating knowledge, and the contexts in which knowledge is applied (Cook & Brown, 1999; Schultze & Leidner, 2002; Simpson & Weiner, 1989). Table 5-1 shows a summary of classifications proposed by scholars.

Table 5-1. A selective summary of classifications of knowledge proposed in prior research.

| Dimension | Definition |
|---|---|
| Procedural vs. Declarative (Anderson, 1976, 2009) | Procedural knowledge is “know-how” knowledge expressed in expert systems with rules, or (in organizational life) with procedures. Declarative knowledge is more descriptive “know-what” knowledge represented by objects or agents in new programming languages. |
| Tacit vs. Explicit (Hasher & Zacks, 1984) | Tacit knowledge is the comprehensive cognizance of the human mind and personal judgments rooted in thoughts and actions that is hard to formalize and communicate. Explicit knowledge is knowledge that has been or can be externalized by articulating, coding, and communicating using some symbolic system. |
| Individual vs. Organizational (Barney, 1986; Bhatt, 2002) | Individual knowledge is things an individual can know, learn, and express. Organizational knowledge is linked individual knowledge integrated and shaped by organizational history and culture and possessed by an organization. |
| Deep vs. Surface (Webb, 1997) | Deep knowledge refers to models and causal explanations that go back to natural laws (commonly admitted in AI). Surface knowledge is represented by practical rules that can be acquired from people. |
| Internal vs. External (Kucza, 2001) | Internal knowledge is adapted to the specific needs of an organization. External knowledge functions on a more general level and must be adapted before it can be utilized inside an organization. |
| General vs. Domain (Gao & Sterling, 2001) | General knowledge is knowledge that is both domain- and site-independent. Domain knowledge is knowledge that is true in a particular domain but that may not be generalizable to other domains. |
| Scientific vs. Social (Hoon & Derick, 1994; Osman, 1998) | Scientific knowledge is knowledge about the hierarchy and unity of the universe. Societal knowledge is knowledge that enables people to understand and predict general patterns of behavior on the part of others. |

Of these classifications, the most relevant to the present research study are the distinction between explicit knowledge and tacit knowledge (discussed in the present sub-section) and the distinction between declarative knowledge and procedural knowledge (discussed in the next sub-section).

Knowledge created and used in the petrochemical industry can either be recorded as procedures or stored in the operator’s mind. These two forms of knowledge map onto Nonaka’s classification of knowledge as either explicit or tacit (Nonaka, 1994). The distinction between these two forms provides a foundation for understanding how to manage knowledge. Tacit knowledge refers to knowledge that is difficult to transfer to another person in written or oral form (Hasher & Zacks, 1984). Knowledge of this nature is acquired via experiences, routines, values, or emotions (Schon, 1987). An often-used example is the acquisition of the ability to ride a bicycle (Dreyfus et al., 1987). Although tacit knowledge is difficult to articulate, scholars have found that it can be shared through such means as conversation and storytelling under some conditions (Zack, 1999). On the other hand, explicit knowledge is knowledge that has been externalized through articulation and codification (Hasher & Zacks, 1984) in the form of data, scientific formulas, specifications, or manuals. Such knowledge is playing an increasingly large role in organizations and is considered the most important factor of production in the knowledge economy (Romer, 1995). The information presented in textbooks, manuals, and articles (such as the present study) is an example of explicit knowledge. In this study, procedures are used to capture and deliver explicit knowledge for the purpose of providing instructions to operators in the petrochemical industry. The question of how best to manage such explicit knowledge embedded in operating procedures is the focus of this study.

5.2.3 An Action View of Knowledge

There are several kinds of explicit knowledge. For example, knowledge of this nature can be expressed through declarative statements, e.g., assertions of the truth value of something, or through a description of a set of steps to complete in order to fulfill a goal. In the context of my work, the explicit knowledge expressed through declarative statements can be captured in manuals and books describing the structure of the plant or the chemical reactions underlying the material transformations in a refinery, whereas the explicit knowledge expressed as a series of steps consists of standard operating procedures meant to provide oversight and control of operator performance (Jamieson, 2002). Despite being presented in a variety of formats, these operating procedures fundamentally comprise sets of instructions for operators to follow (Nivolianitou et al., 2006). That is, they impart the knowledge necessary to support operator action. Because procedures are designed to guide operations, their content can be treated from an action view. Following this argument, I conceptualize knowledge contained in procedures as “action knowledge.” This view of knowledge has roots in artificial intelligence (AI) (Newell, 1982; Newell & Simon, 1976). Action knowledge differs in emphasis from declarative knowledge, which focuses on the notion of truth value. Instead, action knowledge represents, as its name suggests, a guide for action. Such knowledge can be formalized and represented as a collection of rules, as prescriptive statements, or as guides or norms that enable users to choose and execute a course of action. Declarative knowledge emphasizes “what” or “why,” whereas action knowledge specifies “how” and “when” (Clark & Estes, 1996). In the domain of information systems, Bera and Wand (2009) argue that the basis for knowledge is provided not by “justified true belief” but rather by effective performance. The action view of knowledge builds on this argument. Blosch (2001) suggests a complementary view wherein the role of action knowledge in ensuring successful practice is emphasized. Drawing on this discussion, I adopt the term action knowledge to refer to the know-how in operating procedures that enables individuals to select and perform action(s) in order to change the current state to a desired target state (Bera & Wand, 2009).

The definition of action knowledge as knowledge that enables operators to select and perform action(s) designed to change the current state to a goal state requires identifying an underlying model that is conceptually complex. For example, to interpret how to select and perform any given action, elements such as a goal, the possible outcomes, and an underlying rationale must be articulated. Numerous interpretations of action knowledge identifying different attributes are set out in the literature. That is, the various frameworks and models differ in regard to the sets of attributes specified as relating to action knowledge. Sometimes, even when the same attribute is identified, its definitions vary, which both reflects and reinforces an inconsistent understanding across the literature. A summary of interpretations of action knowledge and its attributes from the literature is presented chronologically in Table 5-2. A discussion of the four main attributes, i.e., action, actor, environment, and goal, follows the table.

Table 5-2. Action knowledge interpretations and attributes.

| Interpretation of Action Knowledge (AK) | Attributes | Source |
|---|---|---|
| AK is logic with the possible situation to take action. | Action, Actor, Environment (Actuality, Precondition), Goal | (Moore, 1977) |
| AK is a prerequisite for an action that can be analyzed as a matter of knowing what action to take; an executable description. | Action, Actor, Environment, Goal | (Moore, 1985) |
| AK is knowledge that makes action-at-a-distance possible by reasoning about and taking control over activity located at some other place in space or time (such as the future). | Action, Actor, Environment (Time, Place, Resource) | (Gasser, 1991) |
| AK enables an agent to achieve goals even if not all information is known. | Action, Actor, Environment (Time), Trigger | (Lesperance & Levesque, 1995) |
| AK is normative constraints, criteria by which behavior can be guided and assessed. | Action, Actor, Environment (Location) | (Tsoukas, 1996) |
| AK is values of the agent’s state that allow executions to be activated, inhibited, or modified. It can be used to plan and monitor an action or to provide real-time simulative inference: Action, Actor, Environment (Actuality, Precondition), and Goal. | Action, Environment (Time, Location, Resource), Goal | (Narayanan, 1997) |
| AK is the agent’s belief at different times pertaining to determining the “executability” of a policy. | Action, Environment (Resource, Time) | (Geffner & Wainer, 1998) |
| AK governs human actions (rules and/or prescriptions for action). | Action, Environment (Post Condition), Goal | (Goldkuhl, 1999) |
| AK is knowledge that ensures the successful completion of a task. | Action, Actor, Environment (Resource) | (Blosch, 2001) |
| AK precedes and, therefore, determines action and performance; it enables one to establish a logically consistent pathway between the past and the future. | Action, Actor, Goal | (Chia, 2003) |
| AK is a model that describes how to impact its interpreters and change the domain as a facilitator. | Action, Actor, Environment, Goal | (Krogstie et al., 2006) |
| AK is propositions as reasons for acting. | Action, Actor, Environment | (Hawthorne & Stanley, 2008) |
| AK is the ability of the agent to select an action from those available to the agent when given the current state and the environment in order to change the current state to a goal state. | Action, Actor, Environment (Resource), Goal | (Bera & Wand, 2009) |
| AK enables the ability to select a course of action, which can lead to a change in the state of the environment, given the state of the agent and the environment. | Action, Actor, Environment (Resource, Time), Goal, Outcome | (Freund & Baltes, 2000) |


This summary of the literature on action knowledge and its attributes provided a basis for selecting an appropriate model for the present study. The multiple definitions listed accommodate many situations in which practicing or performing a task is useful. For the purpose of this study, I have devised a simple framework that identifies four attributes (action, actor, environment, and goal), as summarized in Table 5-3. An action causes changes in state or perception. The actor is the person who possesses the requisite knowledge to execute an action and does so based on that knowledge. The environment describes the situation or circumstance in which the action takes place. The goal is the target state that motivates the actor’s behavior. Three important properties of the environment are time, location, and resource: time is a temporal description at specific stages of the action, location describes the geographic place of the action, and resource refers to physical materials produced or consumed during the action.

Table 5-3. A model of action knowledge derived for this research.

| Attribute | Definition | Source |
|---|---|---|
| Action | Causes changes in state or perception | (Freund & Baltes, 2000) |
| Actor | A person who possesses knowledge and executes actions | (Hawthorne & Stanley, 2008) |
| Environment | The domain, or a set of all statements that refer to the situation under consideration | (Krogstie et al., 2006) |
| Environment: Time | A temporal description of behavior used to bind action schemas at specific stages to describe behavior | (Narayanan, 1997) |
| Environment: Location | The geographic distribution of actors | (Gasser, 1991; Narayanan, 1997) |
| Environment: Resource | Physical materials produced or consumed during an action that maintain the actor’s participation in a course of action | (Gasser, 1991; Kornfeld & Hewitt, 1981) |
| Goal | The target state that motivates the actor’s behavior | (Narayanan, 1997) |
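As an illustration only, the four attributes in Table 5-3 could be written down as a small data structure. The following Python sketch is not part of SPA; the class names, field names, and the example instruction are all assumptions introduced here to make the model concrete.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Environment:
    """Situation in which the action takes place (properties from Table 5-3)."""
    time: Optional[str] = None        # temporal description of the action stage
    location: Optional[str] = None    # geographic place of the action
    resources: List[str] = field(default_factory=list)  # materials produced or consumed


@dataclass
class ActionKnowledge:
    """One instruction expressed with the four attributes of the model."""
    action: str               # what causes the change in state
    actor: str                # who executes the action
    environment: Environment  # situation surrounding the action
    goal: str                 # target state that motivates the actor


# Hypothetical instruction, invented purely for illustration.
step = ActionKnowledge(
    action="open valve V-101",
    actor="field operator",
    environment=Environment(
        time="after purge",
        location="hydrotreater unit",
        resources=["hydrogen"],
    ),
    goal="unit pressurized",
)
```

Keeping time, location, and resource inside a separate environment record mirrors the way the text treats them as properties of the environment rather than as attributes in their own right.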

5.2.4 Chunking as a Strategy for Managing Action Knowledge

The action knowledge model provides a basis for conceptualizing knowledge captured in operating procedures. This model does not, however, allow for the possibility that the operating procedure is likely to contain instructions aimed at (a) multiple goals, (b) multiple actions, (c) multiple actors, and even (d) multiple locations. Another way of saying this is that a single operating procedure is likely to consist of many action knowledge chunks and that these action knowledge chunks may be carried out by different operators (sometimes at different locations and at different times). Further, it is possible that some of these chunks may even be duplicated, in more than one context and across operating procedures. These repetitions may be necessary if several units must undertake similar sets of actions to fulfill stated goals. For example, the hydrotreater is a unit that uses hydrogen to desulfurize intermediate products after atmospheric distillation. There are at least two types of hydrotreater units in a typical refinery, one for naphtha desulfurization and another for diesel oil desulfurization. These units are similar in terms of structure and function, such that they have some operations in common.

Several problems arise when there are repetitions across procedures. First, repetitions make capturing and documenting action knowledge more difficult, such that organizations incur unnecessary cost, time, and effort in this process. Second, organizations expend additional effort in order to maintain repeated knowledge; that is, an update to one operation practice is likely to necessitate changes to other procedures. Consistency and coordinated updates can be difficult to handle when all or some parts of a procedure are replicated in an unstructured way. Inconsistencies can lead to confusion and mistakes on the part of operators. An innovative approach to culling, cleaning and polishing, structuring, formatting, and indexing procedures against a classification scheme is, therefore, necessary to package action knowledge for later distribution and reuse. I call this a chunking strategy: procedures are cleaned to remove non-action knowledge such that only action knowledge remains, the action knowledge is structured into small chunks, and the chunks are then managed for use and reuse.

I define a chunk of action knowledge as a self-contained subset of instructions extracted from a procedure (Jansen, 2010; Matos, 2008), i.e., a directly addressable and editable chunk of action knowledge. However, identifying action knowledge chunks can be a difficult task because conventional approaches such as task analysis strategies (Schraagen et al., 2000) are difficult to follow. Although such approaches can be used to understand the underlying structure of action knowledge, associated techniques must be developed to generate chunks of action knowledge that direct task performance (Cooke, 1992). Task-analysis strategies can also be applied directly when the exact sequence of actions or procedures is specified, such as those designated in a procedure. Task-analysis strategies include the Goals, Operators, Methods, and Selections (GOMS) model (Card et al., 1983) and the Precursor or Reason for Action, Action, Result, Interpretation of Result (PARI) model (Hall et al., 1995), both of which emphasize the demands made on individuals, although PARI was later expanded to include interaction between individuals and the task environment. These models provided a useful starting point for the research approach developed herein and the outcomes reported in the present study.

5.3 Research Method and Research Design

The research method of this essay follows a design science approach, which emphasizes the problem-solving and performance-improving nature of the research process (Vaishnavi & Kuechler, 2004). The goal of design research is to systematically create new knowledge by inventing innovative artifacts capable of rendering improvements to real phenomena (Hevner, 2010; Sein, 2011).

5.3.1 Selection of the Research Method

There are several design science research methodologies (Harnesk & Thapa, 2013), including a linear process for design science research (March & Smith, 1995), the patterns-based design science method (Vaishnavi, 2008), the notion of design-evaluate cycles (Hevner, 2007), and the incorporation of potential users’ participation in the form of action design research (Cole, Purao, Rossi, & Sein, 2005). I adopted and combined the iterative design-evaluate cycle method with the potential users’ participation method as the design research approach in this study for the following reasons.

The first research question defined for this study addresses concerns about both technology and organization. In order to understand these concerns, I engaged with representatives of multiple refineries in the petrochemical industry in an ongoing way through regular meetings and discussions. Through this access to potential users and organizations, I broadened and deepened my knowledge of the petrochemical refinery industry based on which I elaborated on and refined my research strategy in part by considering and rejecting a series of other strategies that were ostensibly viable. That is, I considered and discarded a number of ways to address the research question before arriving at the research strategy described next. These early efforts map well onto the idea of an iterative search for solutions, which is an important aspect of the design science research method.

After these early iterations, through which I determined my research focus (the use of heuristics for extracting action knowledge from operator procedures and chunking it), a second set of iterations followed. This second set of iterations was important because the specific design goals were still difficult to identify and the final requirements were not well-structured. The research process, therefore, progressed via simultaneous processes of engaging in problem-solving in order to articulate a design theory, and implementing a software artifact in order to instantiate this emerging theory. A more specific account of these iterations is provided in the next sub-section.

5.3.2 Iterative Approach

With the aim of designing a heuristic approach to identifying action knowledge chunks from operator procedures, it was relatively straightforward to define the inputs as the operating procedures themselves and to define the intended, final outputs as action knowledge chunks. However, the design space from the inputs to the outputs is characterized by a number of interdependent variables and interactions among them. Addressing the choices in this design space requires that the parameters of interest be identified and that different sequences be explored in order to address these. In effect, this “identification of parameters” and of “ways to sequence their treatment” makes it possible to systematize the iterations of theory development as suggested by Hevner (2007). To operationalize this approach, I followed the design structure method (Steward, 1981), first, by identifying the variables and the relationships between them that define the design theory (see Table 5-4). For each variable, I list the other variables (referred to as predecessors) that must be known or assumed in order to define the former. For example, according to the action knowledge chunking strategy reviewed in Section 2, the three variables of distance, waiting time, and actor (variables 2, 3, and 4) determine dependent variable 1 (the final output of the chunks). Furthermore, in order to determine variable 2 (the distance between two sequential operation behaviors), the two pieces of operating equipment and their locations, namely variables 5 and 6, are required, and thus must first be known or estimated. I introduce these variables and their predecessors in more detail in the following section, Design Theory.

Table 5-4. Precedence table of variables.

| Variable | Description | Predecessors |
|---|---|---|
| 1 Chunk | Chunks | 2, 3, 4 |
| 2 Distance | Distance between two sequential operation behaviors | 5, 6 |
| 3 Waiting Time | Length of waiting time between two sequential operation behaviors | 7 |
| 4 Actor | Actor of operation behavior | 8 |
| 5 Equipment | Operating equipment | 9 |
| 6 Location | Equipment location | None |
| 7 Condition | Condition to execute behavior | 10, 11 |
| 8 Object | Object of instruction | 12 |
| 9 Subject | Subject of instruction | 12 |
| 10 Conjunction | Conjunction of instruction | 12 |
| 11 Waiting List | Waiting list | None |
| 12 Dictionary | Dictionary of keywords and their possibilities | 4, 5, 7 |
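The first row of Table 5-4 states that a chunk depends on distance, waiting time, and actor. A minimal sketch of how such a dependency might be turned into a boundary rule is given below. It is not the SPA algorithm: the `Step` fields, the use of a location change as a stand-in for distance, and the 60-minute waiting threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    text: str
    actor: str                  # variable 4 in Table 5-4
    location: str               # stand-in for distance (variable 2)
    wait_minutes: float = 0.0   # waiting time before this step (variable 3)


def chunk_steps(steps: List[Step], max_wait: float = 60.0) -> List[List[Step]]:
    """Start a new chunk when the actor or location changes, or the wait is long."""
    chunks: List[List[Step]] = []
    for step in steps:
        if chunks:
            prev = chunks[-1][-1]
            boundary = (
                step.actor != prev.actor
                or step.location != prev.location
                or step.wait_minutes > max_wait   # assumed threshold
            )
        else:
            boundary = True  # the first step always opens a chunk
        if boundary:
            chunks.append([])
        chunks[-1].append(step)
    return chunks


# Hypothetical instructions, invented for illustration.
steps = [
    Step("Close valve A", "field operator", "unit 1"),
    Step("Check gauge B", "field operator", "unit 1"),
    Step("Confirm pressure on console", "console operator", "control room"),
    Step("Open valve C", "console operator", "control room", wait_minutes=120),
]
chunks = chunk_steps(steps)
```

With these example steps, the first two instructions stay together, the switch to the console operator opens a second chunk, and the two-hour wait opens a third.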

The information in Table 5-4 can also be represented as a precedence matrix (Steward, 1981), as in Figure 5-3. The precedence matrix can then represent the structure of the emerging design theory, by specifying which variables are affected by which other variables. The diagonal is marked with circled x’s. A mark in row i column j means i has the predecessor j. For example, the mark in row 1 column 2 means the determination of variable 1 (chunks) requires variable 2 (distance). A scan across the rows shows which variables precede which variables, whereas a scan down the columns shows the order in which the variables follow. In column 4 row 12, there is a mark indicating that variable 4 is followed by variable 12. In column 12 row 8, there is a mark indicating that variable 12 is followed by variable 8. By this means, it is possible to trace the circuit from variable 4 to variable 12 to variable 8 to variable 4, as shown in Figure 5-3. Two more circuits can be found in the same way: variable 5 to variable 12 to variable 9 to variable 5, and variable 7 to variable 12 to variable 10 to variable 7.

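|    | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|----|---|---|---|---|---|---|---|---|---|----|----|----|
| 1  | ⊗ | x | x | x | . | . | . | . | . | . | . | . |
| 2  | . | ⊗ | . | . | x | x | . | . | . | . | . | . |
| 3  | . | . | ⊗ | . | . | . | x | . | . | . | . | . |
| 4  | . | . | . | ⊗ | . | . | . | x | . | . | . | . |
| 5  | . | . | . | . | ⊗ | . | . | . | x | . | . | . |
| 6  | . | . | . | . | . | ⊗ | . | . | . | . | . | . |
| 7  | . | . | . | . | . | . | ⊗ | . | . | x | x | . |
| 8  | . | . | . | . | . | . | . | ⊗ | . | . | . | x |
| 9  | . | . | . | . | . | . | . | . | ⊗ | . | . | x |
| 10 | . | . | . | . | . | . | . | . | . | ⊗ | . | x |
| 11 | . | . | . | . | . | . | . | . | . | . | ⊗ | . |
| 12 | . | . | . | x | x | . | x | . | . | . | . | ⊗ |

Figure 5-3. Precedence matrix.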
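The circuits described above can be checked mechanically. The sketch below is an illustration rather than part of the original method: it encodes the predecessors from Table 5-4 as a dictionary, rebuilds the precedence matrix, and enumerates the circuits by a simple depth-first search. The function names are assumptions. Each circuit is reported in the "requires" direction (4 requires 8, 8 requires 12, 12 requires 4), which is the same loop as the one traced in the text in the "followed by" direction.

```python
# Predecessors copied from Table 5-4: variable -> variables it requires.
PREDECESSORS = {
    1: [2, 3, 4], 2: [5, 6], 3: [7], 4: [8], 5: [9], 6: [],
    7: [10, 11], 8: [12], 9: [12], 10: [12], 11: [], 12: [4, 5, 7],
}


def precedence_matrix(pred):
    """matrix[i-1][j-1] is True when variable i has predecessor j."""
    n = len(pred)
    return [[j in pred[i] for j in range(1, n + 1)] for i in range(1, n + 1)]


def find_circuits(pred):
    """Return each simple circuit once, starting at its smallest variable."""
    circuits = set()

    def walk(start, node, path):
        for nxt in pred[node]:
            if nxt == start:
                circuits.add(tuple(path))          # closed the loop
            elif nxt > start and nxt not in path:  # avoid duplicates and revisits
                walk(start, nxt, path + [nxt])

    for variable in pred:
        walk(variable, variable, [variable])
    return sorted(circuits)


matrix = precedence_matrix(PREDECESSORS)
circuits = find_circuits(PREDECESSORS)
```

All three circuits found this way pass through variable 12, which is the observation that motivates estimating that variable first.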

A mark in row i column j means that in order to determine variable i, it is necessary to have a value for variable j. Therefore, I cannot directly determine any variables within the circuits. In this case, the variables can be estimated. There are two criteria for selecting estimated variables. First, I intend to estimate as few variables as possible, such that all other dependent variables can be determined. In Figure 5-3, the intersection point of the three circuits is variable 12 (dictionary of keywords and their possibilities). Therefore, I decided to estimate this variable. Second, the variables are selected based on which can be estimated most accurately, e.g., the variables that can be referenced from existing archive data. For variable 12 (dictionary of keywords and their possibilities), I chose the Penn Treebank database, which includes syntactic and semantic information (Church & Mercer, 1993). The Penn Treebank is widely used in natural language processing systems to annotate sentence structure. I used it as the initial estimation based on which procedure instructions could be decomposed. Once this estimation was made, the other variables in the circuits could be determined: object (Variable 8), subject (Variable 9), and conjunctions (Variable 10). Then, I performed iterations in order to evaluate and improve the initial estimation. I describe each iteration and the product it generated in the following sub-sections.
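The effect of estimating variable 12 can be illustrated with a topological sort. In the sketch below, which is an illustration under stated assumptions and not the procedure used in the study, an estimated variable is treated as having no predecessors. The helper names `evaluation_order` and `has_circuit` are hypothetical; `graphlib` is part of the Python standard library from version 3.9.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# Predecessors copied from Table 5-4: variable -> variables it requires.
PREDECESSORS = {
    1: [2, 3, 4], 2: [5, 6], 3: [7], 4: [8], 5: [9], 6: [],
    7: [10, 11], 8: [12], 9: [12], 10: [12], 11: [], 12: [4, 5, 7],
}


def evaluation_order(pred, estimated=()):
    """Order in which variables can be determined once `estimated` ones are assumed known."""
    torn = {v: ([] if v in estimated else list(ps)) for v, ps in pred.items()}
    return list(TopologicalSorter(torn).static_order())


def has_circuit(pred, estimated=()):
    """True when no valid evaluation order exists for the given estimates."""
    try:
        evaluation_order(pred, estimated)
    except CycleError:
        return True
    return False


blocked = has_circuit(PREDECESSORS)            # nothing estimated
order = evaluation_order(PREDECESSORS, {12})   # dictionary estimated first
```

Estimating only variable 12 is enough to remove every circuit, after which the dictionary precedes object, actor, and finally the chunks in the resulting order.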

5.3.3 Iteration 1 – Multi-Phase Approach

The first iteration consisted of a multi-phase approach to extracting action knowledge from procedures. The structure of the design outcome, i.e., the Semantic Procedure Analyzer (SPA), is outlined in Figure 5-4. The design consists of a first phase of instruction pre-processing, which uses a part-of-speech tagging (POST) technique (Charniak, 1997) enhanced with a dictionary (Variable 12 in Figure 5-3) that contains definitions and contexts of words, followed by a second phase of chunk extraction. In this iteration, I used the Penn Treebank as the initial estimation of the dictionary, which consisted of Subject, Object, and Conjunction terms identified from multiple statements in the procedures.
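To make the idea of dictionary-enhanced tagging concrete, the following toy sketch looks up each word of an instruction in a small dictionary of Penn Treebank-style tags. It is an assumption-laden illustration, not the SPA pre-processor: the dictionary entries, the default tag, and the function names are invented here, and a real tagger would also use context rather than a plain lookup.

```python
import re

# Toy dictionary: word -> Penn Treebank-style tag. Entries are illustrative assumptions.
DICTIONARY = {
    "open": "VB", "close": "VB", "verify": "VB",
    "valve": "NN", "pump": "NN", "pressure": "NN",
    "the": "DT", "and": "CC", "then": "RB", "until": "IN",
}


def tag_instruction(sentence, dictionary=DICTIONARY, default="NN"):
    """Tag each token by dictionary lookup; unknown words get the default tag."""
    tokens = re.findall(r"[A-Za-z0-9\-]+", sentence.lower())
    return [(token, dictionary.get(token, default)) for token in tokens]


def conjunctions(tagged):
    """Tokens tagged as coordinating (CC) or subordinating (IN) connectives."""
    return [token for token, tag in tagged if tag in {"CC", "IN"}]


# Hypothetical instruction, invented for illustration.
tagged = tag_instruction("Open the valve and verify pressure.")
links = conjunctions(tagged)
```

Connectives such as "and" or "until" are the kind of terms from which conditions between instructions could later be derived.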

[Figure 5-4 diagram. Labels: Procedures; Procedure Pre-processing (Phase 1); Chunk Extraction (Phase 2); Identify Triggers; Cluster Instructions; Action Knowledge Chunks.]

Figure 5-4. SPA structure in Iteration 1.

Discussions with potential users and managers at the refineries provided a basis for an ongoing formative evaluation of the software artifact that was being designed and implemented. In addition, the refineries provided some actual operating procedures, which I used to assess the feasibility of the software artifact. During these rounds of formative evaluation, the SPA was used to analyze procedures and suggest action knowledge chunks. The feedback from the potential users and managers indicated that they considered the idea of reading knowledge chunks as likely to be more effective than reading long pages of procedures. The formative evaluation also suggested that performance could be improved by removing spurious content (such as procedure title, author, effective date, etc.) from the operating procedures before pre-processing and by allowing the users to refine the dictionary.

5.3.4 Iteration 2 – Incorporating Learning

In the second iteration, a new phase was added before the instruction pre-processing. This phase was designed to remove spurious content from the operating procedures such that only the instructions for the next phase of pre-processing were retained. In this second iteration, users were also able to modify the pre-processing results. Another important enhancement during this phase was the addition of supervised learning, which is a form of machine learning aimed at improving performance from labeled training data (Erickson, 2013). In supervised learning, the training data consist of a set of training examples. Each example is a pair consisting of an input object and a desired output value. A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used to map new examples. In this study, the procedures and modification information from users provided the training data.
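The pairing of inputs with desired outputs can be illustrated with a very small learner. The text does not specify which learning algorithm SPA uses, so the sketch below substitutes a simple frequency count: each training example is a (word, label) pair, assumed here to come from user corrections, and the inferred function maps a word to its most frequent label. All names and example data are hypothetical.

```python
from collections import Counter, defaultdict


def train(examples):
    """Learn word -> (most frequent label, relative frequency) from (word, label) pairs."""
    counts = defaultdict(Counter)
    for word, label in examples:
        counts[word.lower()][label] += 1
    model = {}
    for word, label_counts in counts.items():
        label, n = label_counts.most_common(1)[0]
        model[word] = (label, n / sum(label_counts.values()))
    return model


def predict(model, word, default=("unknown", 0.0)):
    """Apply the inferred function to a new word."""
    return model.get(word.lower(), default)


# Hypothetical user corrections, invented for illustration.
examples = [
    ("valve", "object"), ("valve", "object"), ("valve", "subject"),
    ("operator", "subject"),
]
model = train(examples)
valve_label, valve_score = predict(model, "valve")
```

The relative frequency stored with each label is one simple way to represent how confident the dictionary is about a word, and it changes as more corrections arrive.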

[Figure 5-5 diagram. Labels: Procedures; Spurious Content Removing (Phase 1); Instruction Pre-processing (augmented with learning) (Phase 2); Chunk Extraction (Phase 3); Identify Triggers; Cluster Instructions; Action Knowledge Chunks.]

Figure 5-5. SPA structure in Iteration 2 (new or changed phases shaded).

A second round of formative evaluation followed this iteration. During this cycle, SPA was trained with 30 procedures. The Penn Treebank dictionary was revised based on the edits, i.e., by adding a new word/phrase, adding new attributes for a word/phrase, or adjusting the probability that a word/phrase would fit the description of a certain attribute in a given environment. After the training, potential users again participated in the evaluation. This round of formative evaluation also showed that the accuracy of instruction pre-processing had improved after (a) the spurious content had been removed and (b) the learning mechanisms and training had been added. Suggestions for improvement based on this cycle of formative evaluation included varying the scope of learning (e.g., on the basis of equipment, location, and/or organization) and adding a visualization component to deliver the outcomes to the users in an intuitive manner.

5.3.5 Iteration 3 – Features of the Training Data and the Visual Display of Output

In the third iteration, the structure of SPA was maintained. However, new features were added to allow multiple (instead of just one) procedures to be processed and to allow the selection of a training database (e.g., all the procedures from the refinery company or only those from a specific piece of equipment such as the hydrocracker). The SPA interfaces for collecting background information and selecting the training plan are shown in Figure 5-6.

Figure 5-6. SPA for collecting background information and selecting a training plan.

Other significant improvements related to visualization, e.g., the use of a consistent user interface that would allow users to suggest adjustments to the software (e.g., right-click edits and deletes) and edits to the demarcation lines between the action knowledge chunks. Another improvement in regard to visualization included a network display based on the common relationships across action knowledge chunks. Figure 5-7 shows an example of the network display of action knowledge chunks.

Figure 5-7. Network display of action knowledge chunks with SPA.

A final formative evaluation cycle was performed after this iteration. This cycle consisted of working with five operators in a petrochemical refinery. I found that all the operators were able to use SPA to follow the entire process: removing non-action knowledge from the procedures, pre-processing the operation instructions, creating action knowledge chunks, and generating graphs of chunk networks. Further suggestions from the operators included accounting for individual differences, accounting for context, and addressing levels of experience by chunking action knowledge in various ways. These were left for the next generation of SPA.

The next section outlines the design theory instantiated in SPA and describes the underlying components—developed via the multiple iterations described previously.

5.4 A Design Theory for Extracting and Chunking Action Knowledge

In this section, I describe the outcomes of the present study as a tentative design theory. Although it is recognized that design science is informed by existing theories (Baskerville, 2008; Kuechler & Vaishnavi, 2008), knowledge about the design of IT artifacts also exhibits the characteristics of theories (Gregor, 2006). This design process involves explicating structural components as a way to specify and communicate the design theory (Gregor & Jones, 2007; Takeda, Veerkamp, Tomiyama, & Yoshikawa, 1990). This specification can include (1) the purpose and scope (which specify the type of artifact to which the theory applies as well as its boundaries), (2) the underlying justificatory knowledge that led to the design, (3) the solution suggestion, which includes principles of form and function that constitute an abstract blueprint of the IT artifact, (4) solution mutability, namely, the anticipated changes to the artifact encompassed by the theory, and (5) a physical instantiation (the software artifact itself), although this can be optional in the specification of an IS design theory (Gregor, 2006). This is the structure I follow in describing the components of the design theory exemplified by SPA, which realizes the approach for extracting knowledge from operator procedures and chunking it.

5.4.1 Purpose and Scope – Extracting and Chunking Action Knowledge

The purpose of the proposed design theory is to support the management of action knowledge in petrochemical organizations. As argued earlier, action knowledge differs from declarative knowledge. Identifying appropriate chunks of action knowledge from operator procedures in order to manage that knowledge as an organizational asset constitutes a critical problem. I define a chunk of action knowledge as a self-contained subset of instructions extracted from a procedure (e.g., Jansen, 2010; Matos, 2008), that is, a directly addressable and editable unit of action knowledge.

IT systems are capable of supporting the management of such knowledge (Setia & Patel, 2013). However, to the best of my knowledge, no study has formalized a design theory for systems that support the management of action knowledge, including the extraction and chunking of action knowledge such as that found in operating procedures from the petrochemical industry. This is the scope of the proposed design theorizing effort. I expect that artifacts instantiated by the design theory I propose will be appropriate for managing action knowledge in procedures as an organizational asset. I anticipate that this research will give rise to a number of other applications, such as customizing the presentation of action knowledge according to the ability, experience, and preferences of individual operators. The immediate benefit that I aim to realize, however, is that of chunking action knowledge in order to facilitate its management as an organizational asset.

5.4.2 Justificatory Knowledge

Justificatory knowledge refers to propositions and beliefs that provide a basis for the development of the design theory.6 The source and rationale of the justificatory knowledge underlying the design theory developed herein are derived from the structure of the action knowledge encoded in the procedures. First, I acknowledge an event as a trigger of action knowledge chunking because it separates clusters of operator performance based on an external occurrence. Second, the temporal sequence of tasks and/or their geographical separation can also require separation across task clusters; e.g., when a shutdown task is separated across multiple days and locations, it may be necessary to assess partial results. Another elaboration of this principle is suggested by Endsley and Rodgers (1994), who point out that task performance relies on awareness of the environment on the part of actors who must maintain current assessments of task progress. Finally, a switch in the actor, e.g., from a field operator to a console operator, suggests the need for a corresponding change in the action knowledge chunks presented. A chunking strategy operationalizes the simple principle that if a task is too large or complex to be performed by a single individual, chunking is necessary to support coordination among two or more individuals. Based on these arguments and rationale, I describe the justificatory knowledge expressed by the proposed design theory.

The first impetus for chunking action knowledge comes in the form of an event that triggers this process. Event detection is, therefore, an important component of knowledge processing and decision making (Wickens & Hollands, 2000). Events that occur in an environment provide an important input into and correspond with an individual’s awareness of, response to, and utilization of resources for managing task performance. Signal detection and vigilance are, therefore, core elements of actor awareness. As a result, they provide an important path to clustering instructions, i.e., action knowledge chunking, with the expectation that determining appropriate responses to a changing situation assessment or task demands will require the operator to access the relevant action knowledge chunk. My strategy for chunking action knowledge around the event includes actions such as detecting the event, responding to the event, accurately forecasting future event states, and clustering the same action knowledge chunks. This chunking strategy increases response success by providing as much time as possible from the start of an event to the point at which a response action must be taken (Caldwell & Garrett, 2007).

6 http://en.wikipedia.org/wiki/Theory_of_justification

The second impetus for action knowledge chunking stems from the multiple clusters in time and location that define the contours of an operating procedure. Thus, an operating procedure can be viewed as made up of several action knowledge chunks. Separating these requires recognizing subsets of instructions in each operating procedure that are tied to a contiguous period of time (highly temporal in nature) or that must be performed at a particular location. Therefore, it is suggested that action knowledge chunks can be divided by focusing on different elements in the task environment or by changing the environment boundaries, namely, the continuation of time and location (Endsley & Rodgers, 1994; Wouters et al., 2008). The contiguity principle states that actions contiguous in time or space among a sequence of actions should be clustered together.

The third impetus builds on the simple idea that the tasks carried out by operators in a refinery are often too complex to be performed by a single individual. Instead, they require effort from and cooperation among multiple actors (Caldwell & Garrett, 2007). Members of such a team may sometimes work independently, provided each has a clear idea of the work required, with explicit coordination points to bring together the completed work. Individual operators can, therefore, have different responsibilities that occur at different times and in different locations. Examples of clustering and chunking suggested by this impetus include an assembly-line view of the overall task (Zinn, Bowers, McPhillips, & Ludascher, 2009) or a Gantt-style approach to managing project dependencies, which requires information and outcomes from one task to be completed before the next task is allowed to start (Norese, 2010). Actor switching within a sequence of actions, therefore, provides a direct clue to how action knowledge should be chunked.

Based on this discussion, described next is the design of a heuristic approach to generating action knowledge chunks.

5.4.3 Solution Suggestion – A Heuristic Approach to Extraction and Chunking

The heuristic method proposed is driven by (a) the availability of attributes of action knowledge contained in the procedures and (b) observations related to justificatory knowledge that suggests principles for chunking. The attributes contained in operator procedures include the specification of operators, locations, and time-spans. Each provides possible leads for identifying boundaries around chunks of action knowledge based on the strategies suggested above. I argue for a heuristic (Pearl, 1984), instead of algorithmic, approach because the operator procedures are written in natural language, without a mandate for an industry-wide standard or sometimes even without an organization-wide standard.

Here, I first describe the overall approach as consisting of three phases, followed by the detailed heuristics. The first phase identifies action knowledge components in the procedures; the second phase parses each instruction into the action knowledge attributes; the third phase identifies subsets of instructions in order to define boundaries around action knowledge chunks.

The approach is outlined in Figure 5-8 (which is a repeat of Figure 5-5).

[Figure 5-8 diagram. Recoverable labels: Procedures; Spurious Content Removing (Phase 1); Instruction Pre-processing, augmented with learning (Phase 2); Chunk Extraction (Phase 3), with Identify Triggers and Cluster Instructions; Action Knowledge Chunks.]

Figure 5-8. A heuristic approach to extracting action knowledge from procedures and chunking it.

The approach conceptualizes each procedure as a set of action knowledge statements (after removing the preamble based on pre-processing). The heuristics analyze the procedures by leveraging their components in terms of structure, syntax, and semantics (Newman et al., 2001). The heuristics first use part-of-speech tagging (Charniak, 1997) to map elements of each instruction to corresponding attributes, such as actions, actors, and resources. These attributes are then subjected to heuristics to identify and extract chunks of action knowledge. Tables 5-5 to 5-7 outline the heuristics (He et al., 2011).

In the first phase, Heuristics 1–3, a parse procedure separates procedure content that contains action knowledge from content that does not. The first heuristic scans the procedure to locate its title. A taxonomy (e.g., names of units in the refinery) aids this heuristic. Heuristics 2 and 3 then locate the meta-information and the main content of the procedure (containing action knowledge) relative to the position of the title.

Table 5-5. Heuristics for Phase One: Spurious Content Removal.

Heuristic 1 (Title Identification)
Assumption: The Title contains a Term that describes the process affected.
Formalization: ∃ ki ∈ Sn | ki ∈ TitleIndicator → Sn.title = 1

Heuristic 2 (Meta-Info Identification)
Assumption: Meta-information is relative to the position of the Title (see Heuristic 1).
Formalization: Sn | (Si.title = 1) ∧ (Sn ∈ MetaInfoLocate(Si)) → Sn.meta-info = 1

Heuristic 3 (Body of Procedure Identification)
Assumption: The location of the main body of the procedure is relative to the position of the Title (see Heuristic 1).
Formalization: Sn | (Si.title = 1) ∧ (Sn ∈ MainTextLocate(Si)) → Sn.maintext = 1
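The Phase One heuristics can be illustrated with a minimal Python sketch. It is not the SPA implementation: the `TITLE_INDICATORS` set is a hypothetical stand-in for the taxonomy of unit names, and the rule that meta-information precedes the title while the main text follows it is an assumption made only for illustration, standing in for MetaInfoLocate and MainTextLocate.

```python
# Hypothetical stand-in for the taxonomy of title terms (e.g., refinery unit names).
TITLE_INDICATORS = {"heater", "compressor", "crude unit"}


def split_procedure(lines):
    """Label procedure lines as title, meta-info, or main text (Heuristics 1-3).

    Illustrative assumption: meta-information precedes the title and the
    main text follows it.
    """
    # Heuristic 1: the first line containing a title indicator is the title.
    title_idx = next(
        (i for i, line in enumerate(lines)
         if any(term in line.lower() for term in TITLE_INDICATORS)),
        None,
    )
    if title_idx is None:
        # No title found: treat every line as main text.
        return {"title": None, "meta_info": [], "main_text": list(lines)}
    return {
        "title": lines[title_idx],
        "meta_info": lines[:title_idx],       # Heuristic 2
        "main_text": lines[title_idx + 1:],   # Heuristic 3
    }


# Invented example procedure, for illustration only.
procedure = [
    "Document No. 123",
    "Revision date: 2010-01-01",
    "Heater 25 Shutdown Procedure",
    "Close fuel gas valve.",
    "Check burner flame.",
]
parts = split_procedure(procedure)
```

Only `parts["main_text"]` would be passed on to Phase Two in this sketch; the title and meta-information are set aside.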

In the second phase, Heuristics 4–11, the main content of the procedure is pre-processed. This part of the procedure is conceptualized as a set of instructions, i.e., action knowledge. The pre-processing is accomplished by extending part-of-speech-tagging (POST) (Charniak, 1997) with terms from what is referred to as the lightweight ontology, which reflects domain knowledge along with a dictionary that differentiates actions performed by operators. The heuristics in this phase tag and re-structure each instruction into action components made up of a responsible actor, the resource of the action (objects), a predicate that indicates the action, a goal of the action, and conjunctions followed by phrases or clauses referring to conditions.

Table 5-6. Heuristics for Phase Two: Instruction Pre-processing.

Heuristic 4 (Predicate Tagging)
Assumption 1: Each instruction contains at most one predicate.
Assumption 2: The predicate denotes the action of this instruction.
Formalization: (ki ∈ Sn | POST(ki) ∈ V) → Sn.predicate = ki

Heuristic 5 (Predicate Modifier Tagging)
Assumption 1: The predicate modifier is identified from the taxonomy of the adverb or prepositional phrase.
Assumption 2: Predicate modifiers appear before or after the predicate.
Formalization: ki ∈ Sn | POST(ki) ∈ MV ∧ Modify(ki) = Sn.Predicate → Sn.PredicateModifier = ki

Heuristic 6 (Subject Tagging)
Assumption 1: The subject is identified from the taxonomy of the noun or the subject is derived from phrases and abbreviations.
Assumption 2: The instructions are positive and the subject is identified from the taxonomy of the noun, phrases, and abbreviations.
Formalization: ki, kj ∈ Sn | POST(ki) ∈ N ∧ POST(kj) ∈ V ∧ i < j → Sn.Subject = ki

Heuristic 7 (Subject Modifier Tagging)
Assumption 1: The subject modifier is identified from the taxonomy of the adjective or prepositional phrase.
Assumption 2: Subject modifiers appear before or after the subject.
Formalization: ki ∈ Sn | POST(ki) ∈ MN ∧ Modify(ki) = Sn.Subject → Sn.SubjectModifier = ki

Heuristic 8 (Object Tagging)
Assumption 1: The object is identified from the taxonomy of the noun, phrases, and abbreviations.
Assumption 2: The instructions are positive and the object follows the predicate.
Formalization: ki, kj ∈ Sn | POST(ki) ∈ N ∧ POST(kj) ∈ V ∧ i > j → Sn.Object = ki

Heuristic 9 (Object Modifier Tagging)
Assumption 1: The object modifier is identified from the taxonomy of the adjective or prepositional phrase.
Assumption 2: Object modifiers appear before or after the Object.
Formalization: ki ∈ Sn | POST(ki) ∈ MN ∧ Modify(ki) = Sn.Object → Sn.ObjectModifier = ki

Heuristic 10 (Purpose Tagging)
Assumption: The purpose of the action is identified from the taxonomy of the preposition “to” that follows the clauses.
Formalization: ki ∈ Sn | POST(ki) ∈ P → Sn.Purpose = ki

Heuristic 11 (Condition Tagging)
Assumption: Multiple instructions are connected via conjunctions.
Formalization: ki ∈ Sn | POST(ki) ∈ Cond → Sn.Condition = ki
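A subset of the Phase Two heuristics (4, 6, 8, and 11) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not SPA's tagger: the `LEXICON` dictionary is a hypothetical miniature replacement for part-of-speech tagging extended with the lightweight ontology, and modifiers and purposes (Heuristics 5, 7, 9, and 10) are left out.

```python
# Hypothetical miniature dictionary; SPA's lightweight ontology is far richer.
LEXICON = {
    "close": "V", "open": "V", "check": "V",
    "operator": "N", "valve": "N", "pump": "N",
    "if": "Cond", "when": "Cond", "until": "Cond",
}


def tag_instruction(instruction):
    """Split one instruction into action components (Heuristics 4, 6, 8, 11)."""
    words = instruction.lower().rstrip(".").split()
    tags = [LEXICON.get(w) for w in words]
    result = {"subject": None, "predicate": None, "object": None, "condition": None}

    # Heuristic 11: a conditional conjunction opens the condition clause.
    for i, tag in enumerate(tags):
        if tag == "Cond":
            result["condition"] = " ".join(words[i:])
            words, tags = words[:i], tags[:i]
            break

    # Heuristic 4: at most one predicate is assumed; take the first verb.
    if "V" not in tags:
        return result
    j = tags.index("V")
    result["predicate"] = words[j]

    # Heuristic 6: a noun before the predicate is the subject.
    for i in range(j):
        if tags[i] == "N":
            result["subject"] = words[i]
            break

    # Heuristic 8: a noun after the predicate is the object.
    for i in range(j + 1, len(words)):
        if tags[i] == "N":
            result["object"] = words[i]
            break
    return result


# Invented example instruction, for illustration only.
tagged = tag_instruction("Operator close valve until pressure drops.")
```

Words missing from the dictionary are simply left untagged here; in the approach described above, such gaps are what the learning mechanism and user feedback are meant to reduce.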

In the third phase, Heuristics 12–15, action knowledge is extracted and chunked. The inputs to these heuristics are the pre-processed instructions that result from Phase Two. These heuristics extract action knowledge attributes from action components annotated and tagged in Phase Two and set up boundaries of action knowledge chunks according to the task analysis strategies reviewed in Section 5.4.2. The outputs of these heuristics are chunks extracted from the procedure.

Table 5-7. Heuristics for Phase Three: Chunk Extraction.

Heuristic 12 (Event Trigger Identification)
Assumption: The event trigger is an instruction with no explicit action for the operator.
Formalization: Sn | Sn.Subject ∉ OperatorList → Sn.EventTrigger = 1

Heuristic 13 (Time-based Chunking)
Assumption 1: A break in activity, e.g., checking/waiting, suggests a boundary.
Assumption 2: Action before a waiting interruption is the end of the last chunk. Action after a waiting interruption is the initiation of a new chunk.
Formalization: Sn | Sn.Condition ∈ WaitingList → Sn.BreakByTiming = 1

Heuristic 14 (Location-based Chunking)
Assumption 1: Objects of units in proximity belong to the same chunk.
Assumption 2: Objects of units in a new location list suggest the initiation of a new chunk.
Formalization: Sn | Sn.Object ∈ LocationList(i) ∧ Sn-1.Object ∈ LocationList(j) ∧ i != j → Sn.BreakByLocation = 1

Heuristic 15 (Actor-based Chunking)
Assumption 1: The default operator is defined as a field operator if no specific actor is mentioned in the instruction.
Assumption 2: A switch of actor suggests a break.
Formalization: Sn | Sn.Subject = null → Sn.Subject = Field Operator
Formalization: Sn | Sn.Subject, Sn-1.Subject ∈ OperatorList ∧ Sn.Subject != Sn-1.Subject → Sn.BreakByActor = 1

Appendix A explains the notations used in the heuristics above.
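The four Phase Three heuristics can be sketched as boundary tests over a sequence of tagged instructions. The Python below is an illustrative reading of Table 5-7, not SPA's code: `OPERATORS`, `WAITING`, and `LOCATION_OF` are hypothetical stand-ins for OperatorList, WaitingList, and LocationList, and each instruction is reduced to a dictionary with `subject`, `object`, and `condition` keys.

```python
OPERATORS = {"field operator", "console operator"}   # hypothetical OperatorList
WAITING = {"wait", "until", "after"}                  # hypothetical WaitingList
LOCATION_OF = {                                       # hypothetical LocationList
    "valve a": "unit 1", "pump b": "unit 1", "heater c": "unit 2",
}


def starts_new_chunk(prev, cur):
    """Return True if `cur` should open a new chunk (Heuristics 12-15)."""
    if cur["subject"] not in OPERATORS:
        return True   # Heuristic 12: an event trigger opens a chunk
    if prev is None:
        return False
    if prev.get("condition") in WAITING:
        return True   # Heuristic 13: action after a waiting interruption
    loc_prev = LOCATION_OF.get(prev.get("object"))
    loc_cur = LOCATION_OF.get(cur.get("object"))
    if loc_prev and loc_cur and loc_prev != loc_cur:
        return True   # Heuristic 14: change of location
    if prev["subject"] in OPERATORS and cur["subject"] != prev["subject"]:
        return True   # Heuristic 15: switch of actor
    return False


def extract_chunks(instructions):
    """Group tagged instructions into chunks of action knowledge."""
    chunks, prev = [], None
    for ins in instructions:
        # Heuristic 15, assumption 1: the default actor is the field operator.
        cur = {**ins, "subject": ins.get("subject") or "field operator"}
        if not chunks or starts_new_chunk(prev, cur):
            chunks.append([])
        chunks[-1].append(cur)
        prev = cur
    return chunks


# Invented example sequence, for illustration only.
steps = [
    {"subject": "alarm", "object": None, "condition": None},
    {"subject": None, "object": "valve a", "condition": None},
    {"subject": None, "object": "pump b", "condition": "wait"},
    {"subject": None, "object": "pump b", "condition": None},
    {"subject": "console operator", "object": "pump b", "condition": None},
    {"subject": "console operator", "object": "heater c", "condition": None},
]
chunks = extract_chunks(steps)
```

In this example the alarm and the field operator's response up to the waiting step form the first chunk; the remaining three instructions each open a new chunk because of timing, an actor switch, and a location change, respectively.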

Extracted action knowledge is stored in a database in the form of chunks. As the number of action knowledge chunks processed and stored increases, it is observed that chunks with similar action knowledge appear many times across different procedures. A new function named “find commonality” is implemented to identify the commonalities across action knowledge chunks. I define a common action in multiple action knowledge chunks as an instruction with the same actor role, the same action, and the same operating equipment. Multiple common actions could exist across chunks if the actions are common and have the same sequence. Any chunks that share one or multiple common actions are defined as similar chunks. This new function allows a user to select extracted action knowledge chunks from a database in order to compare them. The selected chunks are compared instruction by instruction to identify common actions. To show the similarity among multiple action knowledge chunks, a network display function is implemented to demonstrate the commonality relationship across action knowledge chunks. In the network, every node is an action knowledge chunk. Nodes are connected if they are similar chunks, i.e., if they share one or multiple common actions. The function is shown in Figure 5-9 (reproduced from Figure 5-7).

Figure 5-9. Network display of action knowledge chunks with SPA.
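The “find commonality” idea and the resulting network can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the SPA function: each instruction is represented as a hypothetical (actor, action, equipment) triple, chunk names such as `C1` are invented, and the sketch ignores the sequence condition mentioned above for multiple common actions.

```python
from itertools import combinations


def common_actions(chunk_a, chunk_b):
    """Return the (actor, action, equipment) triples present in both chunks."""
    return set(chunk_a) & set(chunk_b)


def similarity_network(chunks):
    """Map each pair of similar chunks to the common actions they share.

    Each key is an edge between two nodes (chunks) of the network.
    """
    edges = {}
    for (name_a, a), (name_b, b) in combinations(chunks.items(), 2):
        shared = common_actions(a, b)
        if shared:
            edges[(name_a, name_b)] = shared
    return edges


# Invented example chunks, for illustration only.
chunks = {
    "C1": [("field operator", "close", "valve a"),
           ("field operator", "check", "pump b")],
    "C2": [("field operator", "close", "valve a"),
           ("console operator", "open", "valve d")],
    "C3": [("console operator", "start", "heater c")],
}
network = similarity_network(chunks)
```

Here `C1` and `C2` are connected because they share one common action, while `C3` remains an isolated node.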

5.4.4 Approach Mutability – Learning Mechanisms

Whereas the proposed heuristic approach provides a general blueprint for a system design, the appearance of a corresponding concrete IT artifact depends on the context in which it is instantiated. In this artifact implementation, I adopt learning mechanisms designed to improve heuristic performance in the second phase of SPA.

Learning is the ability of a system to perform a given task more efficiently the next time (Simon, 1981). A learning model generally comprises four components: learning mechanisms, a performance element, a critic, and a problem generator (Nielsen, 1998). The learning mechanisms improve the performance element based on prior experiences. The mechanisms predict future rewards by considering a sequence of earlier actions. A critic defines whether to accept the improvements suggested by the learning mechanisms. The prior experiences act as the problem generator (Purao, Storey, & Han, 2003). However, the learning mechanisms cannot be decided on in a straightforward way. Learning occurs by specifying rewards that are tied to an outcome whereby good decisions are reinforced and bad ones are penalized (Kaelbling, Littman, & Moore, 1996). A learning strategy must rely on feedback from the critic at each step and must also maximize rewards. A supervised learning strategy (Briscoe & Caelli, 1996) can be adapted to add a human element to the process.

In the present research study, learning is implemented to allow SPA to learn tagging probabilities in Phase Two by using untagged instructions as its training data and producing a tag set by induction. To effect learning in this process, the existing pre-processing process constitutes the performance element. The users of the tool act as the critic, and previous applications of extracting and chunking action knowledge from procedures serve as the problem generator. Learning mechanisms are developed to improve procedure parsing and chunking performance based on previous applications and feedback from the users. Finally, the heuristics of each phase are not completely static; that is, with each procedure they encounter, they update an internal knowledge base by acknowledging the positive and negative feedback provided by users. To incorporate the supervised learning strategy, a subclass of this technique, reinforcement learning, is used as the learning mechanism (Muggleton & de Raedt, 1994). One life cycle of this learning process in Phase Two is shown in Figure 5-10.

Figure 5-10. One life cycle of the learning process for Phase Two.

Figure 5-11 illustrates how the learning mechanism works to improve the outputs of Phase 2 when SPA extracts action knowledge from a procedure and chunks it. First, cleaned but untagged instructions are passed through as outputs of Phase 1 after a target procedure has been processed. Then the Phase 2 heuristics parse and tag these instructions with the tags most likely to be accurate as indicated in the dictionary (problem generator). Once the instructions have been pre-processed, the tool compares these outputs to the users’ suggestions (critics). Tags reviewed and approved by users for the same set of procedure instructions are used as the reference. Any errors made by the tool (modifications to the tool outputs made by users) are learned by updating the affected words/phrases and their probabilities in the taxonomy. The updated dictionary will be applied to the future pre-processing of instructions in order to align the outputs more closely with reality. Whether this learning mechanism is effective will be assessed in the evaluation presented in the next chapter.

Figure 5-11. Learning mechanism for Phase 2.
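The feedback loop described above, in which user-approved tags update the probabilities stored in the dictionary, can be sketched as follows. This is a deliberately simple frequency-counting illustration, not the reinforcement learning mechanism used in SPA; the `TagDictionary` class and its method names are hypothetical.

```python
from collections import Counter, defaultdict


class TagDictionary:
    """Per-term tag counts that grow with user-approved tags."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def learn(self, term, approved_tag):
        # The user (critic) approves or corrects a tag; record the outcome.
        self.counts[term][approved_tag] += 1

    def probability(self, term, tag):
        # Relative frequency of `tag` among all approved tags for `term`.
        total = sum(self.counts[term].values())
        return self.counts[term][tag] / total if total else 0.0

    def best_tag(self, term):
        # Tag with the highest observed frequency, or None for unseen terms.
        if not self.counts[term]:
            return None
        return self.counts[term].most_common(1)[0][0]


# Invented feedback history, for illustration only.
dictionary = TagDictionary()
for approved in ["object", "object", "action", "object"]:
    dictionary.learn("shutdown", approved)
```

After these four rounds of feedback, the sketch would tag a new occurrence of “shutdown” as an object, because that is the tag users approved most often.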

Based on multiple training sessions on the learning mechanism of SPA, an important consideration was identified: the learning scope. During the supervised learning process, I found that the same term could serve as different action components in certain environments. For example, the term “shutdown” could be used as either a noun or a verb in the syntax of natural language. The term is frequently used in operating procedures to denote an action or the object of an action. In an instruction such as “check compressor shutdown,” users view the term “shutdown” as an object modifier. However, users view the term as an action in another instruction, i.e., “shutdown Heater 25” (although in Standard English, “shutdown” written as one word is a noun, not a verb). One major influence on the use of action components is the preference of procedure creators. For example, in procedures from two refineries, the term “shutdown” could serve as an action component although the probability of this function varies (see Table 5-8). In Refinery 1, “shutdown” is mainly used as the object of action, whereas in Refinery 2, it is mainly used to describe the operator’s action. Other factors that affect which action component a special term serves as include the operating equipment targeted by the procedures and the locations where the procedures are to be carried out.

Table 5-8. Example of a Tagging Analysis of the Term “Shutdown.”

Refinery     Procedure Number   Appearance Number   Serves as Action   Serves as Object of Action
Refinery 1   85                 78                  3                  75
Refinery 2   16                 6                   5                  1

All these factors, including which terms are used in procedures and how they are used, together affect the learning mechanism. Therefore, different “scopes” emerge for selecting training data for the learning mechanism in Phase 2. To create an effective learning mechanism and achieve accurate pre-processing results, a function is implemented to allow users to select the scope of the learning mechanism before Phase 2 processing (see Figure 5-12). The default scope of the learning mechanism is the global one, which is trained by the previous supervised dataset (the pre-processed procedure instructions with user suggestions). Users also have the option of selecting the scope of the learning mechanism in which the dictionary is trained by a supervised dataset of procedures from the same company (refinery), the same operating location, the same equipment unit, or their combinations. I would suggest that the ability to select the scope of the learning mechanism could improve the accuracy of tagging results.


Figure 5-12. Scope of learning.
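The effect of the learning scope can be illustrated with the counts reported in Table 5-8 for the term “shutdown.” The Python sketch below is an assumption-laden simplification: the `most_likely_tag` function is hypothetical, it considers only the refinery dimension of scope, and it picks the most frequent tag rather than reproducing SPA's learning mechanism.

```python
from collections import Counter

# Counts for the term "shutdown" taken from Table 5-8.
OBSERVATIONS = {
    "Refinery 1": Counter({"action": 3, "object": 75}),
    "Refinery 2": Counter({"action": 5, "object": 1}),
}


def most_likely_tag(scope=None):
    """Return the most frequent tag for "shutdown" within the chosen scope.

    scope=None is the global scope, in which all refineries are pooled.
    """
    if scope is None:
        pooled = sum(OBSERVATIONS.values(), Counter())
    else:
        pooled = OBSERVATIONS[scope]
    return pooled.most_common(1)[0][0]


global_tag = most_likely_tag()
refinery_2_tag = most_likely_tag("Refinery 2")
```

Under the global scope the pooled counts favor the object reading, which would mislabel most occurrences in Refinery 2; restricting the scope to Refinery 2 favors the action reading instead.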

5.4.5 Expository Instantiation

In Section 3 (research method and research design), I described the iterative process and the software artifacts that resulted from each iteration. In the present section, I describe the current (for the purpose of this research, final) form of the prototypical instantiation for the proposed design theory “for the purpose of theory representation and exposition” (Gregor & Jones, 2007). The SPA software tool implements both the heuristic approach and the learning mechanism. SPA provides a visual representation of the intermediate and final outputs connected to the three phases outlined earlier. Figure 5-13 shows the overall architecture of the tool.

[Figure 5-13 diagram. Recoverable labels: SPA Server; POST; Meta-info & Ontology; Procedures; Domain Ontology; SPA Database System.]

Figure 5-13. Architecture of SPA.

Figure 5-14 shows the interface for the final outputs (action knowledge chunks) generated by SPA. An example of chunking one procedure and the screenshots of the interface for each processing step (including inputs and outputs) are listed in Appendix B.

Figure 5-14. Interface of SPA in the final phase.

5.5 Conclusions

In this essay, I have proposed a design theory developed via a heuristic approach for extracting and chunking action knowledge from procedures in the petrochemical industry. The approach draws on work related to an action view of knowledge (Bera & Wand, 2009; Newell, 1982; Newell & Simon, 1976) and to learning mechanisms (Muggleton & de Raedt, 1994; Nielsen, 1998). A tool referred to as SPA was developed to implement the heuristic approach. The design theory components are listed in Table 5-9.

Table 5-9. Design principles of the developed design theory.

Design Theory Component: Operationalized as

(1) Purpose and scope: The aim of the proposed design theory is to provide prescriptions for an approach to extract and chunk action knowledge from procedures. The design is tailored to support both action knowledge chunking and display.

(2) Justificatory knowledge: Design requirements are derived from different aspects, including the requirements of the industry domain and kernel theories from the field of action knowledge, as well as related literature on human cognition. Based on these requirements, triggers that delimit performance clusters (event triggering, time spanning, location changing, and actor switching) are defined.

(3) Solution suggestion: A heuristic approach as used in natural language processing is adopted. A network display is used to show the relationships that multiple action knowledge chunks have in common.

(4) Approach mutability: The dictionary used for instruction pre-processing is domain-dependent and must be tailored to the specific industry context. Learning mechanisms are designed to contribute to the knowledge base as new procedures are encountered and user inputs are recorded.

(5) Expository instantiation: A prototypical instantiation is presented to elucidate the constructs and functions of the design theory with an example case.

SharePoint, PolicyStat, and OnPolicy are among several procedure management systems that are currently in use in the petrochemical industry. Generally speaking, these procedure management systems treat each procedure as a separate unit to be managed (e.g., as a text file) but without further analysis. For example, SharePoint7 is a web application framework and platform developed by Microsoft (Oleson, 2007) that integrates intranet, content management, and document management (Gilbert, Shegda, Phifer, & Mann, 2009). PolicyStat8 goes one step further to manage the procedure lifecycle, keeping track of procedure creation and procedure modifications, and distributing procedures to users (Hall, 2014). Yet another tool, OnPolicy, does allow some extraction of meta-information from procedures to manage it separately (Anderson, 2013). In doing so, it follows a process similar to Phase 1 that I have designed for SPA: removing spurious content and identifying information such as procedure name, date of creation, etc. These elements are similar to document meta-data identified elsewhere, and closest in spirit to RDF (Klyne & Carroll, 2006). None of these existing procedure management systems attempts to provide comprehensive analytical capabilities such as the ones I have designed, including heuristics for analyses of procedures (such as Phase 2 and 3 processing) and for extracting action knowledge.

I claim several potential benefits that follow from this design theory with a heuristic base. First, the approach exposes the fundamental properties of each procedure instruction, such as timing, location, subject (actor), and condition, in a way that renders each instruction amenable to manipulation. Second, the chunks extracted using this approach are easier to manage and reuse across multiple procedures than those produced by an algorithmic approach. Third, the availability of extracted action knowledge chunks provides more freedom to tailor the procedure presentation to operators who differ in regard to the level of their respective expertise. Fourth, the extracted action knowledge can be used to drive training efforts for novice operators as well as experts who may make delegation decisions.

7 http://www.convergepoint.com/
8 http://www.policystat.com/

Although there are plenty of benefits, I acknowledge that there are some limitations to this heuristic-based approach. First, the work described in this essay relies on heuristics. It is difficult to claim that the chunking rationales captured in the current heuristics are comprehensive. Second, the approach requires extra effort to create a domain-specific lightweight ontology from subject-matter experts to support the heuristics. The effectiveness of extracted chunks depends on the quality and veracity of the information in this lightweight ontology. Third, the emphasis on action knowledge structured as predicates, subjects, objects, and conditions may lead operators to ignore descriptive or peripheral knowledge that may be implicitly embedded in instructions. Fourth, the action knowledge chunks may emphasize the “how” instead of the “why”, and thus tend to slow the development of experience-based expertise. One possible solution could be to supplement the action knowledge chunks with a conceptual map of the refinery and with a rationale of descriptive non-action knowledge in order to enhance the operators’ understanding of the procedures. Finally, the approach of extracting and chunking action knowledge is based on existing procedures. It does not immediately lead to the identification of new procedures.

In spite of the limitations, the proposed approach has considerable potential for advancing the state of the practice. Possible applications can be found in other industries with a large number of procedures, instructions, and working guidelines. Examples include monitoring nuclear plants (Vaishnavi & Kuechler, 2004) and healthcare procedures (Bera & Wand, 2009). Although prior work has suggested possibilities for automating procedures in process industries, the human-in-the-loop phenomenon is increasingly recognized as critical (Cranor, 2008; Lee & Hsu, 2003). Improving this aspect of process industry operations requires actively extracting, representing, and managing the knowledge embedded in procedures. This research is aimed at achieving the broad goal of managing action knowledge embedded in the procedures. The action knowledge extraction and chunking approach developed in this essay is one element of this overall goal. My hope is that the approach outlined will serve as the basis for further discussion and enhancement.


Chapter 6

Essay 2 – A Design Science Evaluation of the Semantic Procedure Analyzer

Chapter Organization

In this essay, I present a comprehensive evaluation of the Semantic Procedure Analyzer (SPA), the approach to extracting and chunking action knowledge developed in Chapter 5. The essay describes the rationale, the choices made for the evaluation effort, the evaluation procedures, and the findings. Like Essay 1, this essay does not require knowledge of my previous research or of other chapters in the overall study and can, therefore, be read as an independent piece. The chapter, therefore, includes sufficient information about SPA, the approach used herein to extract and chunk action knowledge, to guide the reader. The next section, Précis, serves as an abstract for the essay.

6.0 Précis

As a structural component in research aimed at generating and testing a design theory to address a problem (Gregor & Jones, 2007; Takeda et al., 1990), evaluation has an important role in design science. In this essay, I present my evaluation of SPA. Here, SPA represents a software instantiation that embodies a design theory focused on the management of action knowledge. More specifically, SPA is implemented as an approach to analyzing and chunking codified action knowledge in operator procedures following a heuristic approach augmented by learning mechanisms. The evaluation effort consists of multiple strategies. First, I demonstrate SPA using a descriptive illustration. Then, I report expert feedback collected for formative and summative evaluation. Finally, I describe the results of this evaluation effort in order to assess the contribution of learning mechanisms to specific SPA phases. The results show that chunking heuristics contribute to the effectiveness of action knowledge chunks and that the learning mechanism contributes to the pre-processing of instructions. I conclude with a discussion of future work.

6.1 Introduction and Motivation

Action knowledge is knowledge that enables individuals to select an action or actions in order to change a current state to a goal state (Bera & Wand, 2009; Blosch, 2001). As such, action knowledge can either be possessed by individuals in a tacit form (Suchman, 1987) or codified as procedures in an explicit form (Woodward, 1965). Procedures represent groups of codified instructions together with the background information needed to perform a specific task (Woodward, 1965). Procedures are widely used in process industries (such as the petrochemical industry) to shape operating practices and to prevent human error (Attwood & Fennell, 2001; Wenger, 1998). However, it is difficult to manage such action knowledge (in procedures) because (1) most procedures combine action knowledge, which provides instructions to operators, with declarative knowledge such as background information (e.g., a description of the physical layout of a refinery, an explanation for an action, or a chemical composition); and (2) as action knowledge carriers, procedures are created to describe all the operations that must be undertaken in order to complete a task, which may lead to duplicated or superfluous information (such information could, in turn, cause problems because the action knowledge is, therefore, not tailored to specific operators and their experience). For example, many duplicate procedure segments can exist across multiple procedures, which may require significant additional effort when changes or updates are called for. In addition, there may be explanations and descriptions that experienced operators can reasonably skip over.

The extraction and chunking of action knowledge (Turner & Engle, 1989) is one possible solution to this problem. This approach may require cleaning the procedures so that no non-action knowledge content remains, generating chunks in a way that avoids duplication, and using these to support the operators in a tailored way. To achieve the goal of extracting and chunking action knowledge, I followed a design science method to develop the proposed approach. Because the codified procedures are written in free text, often by different authors, and without a clear standard across locations, refineries, or organizations, I argue for a heuristics-based approach (Pearl, 1984) to extracting action knowledge chunks from procedures (He et al., 2011). The foundational perspectives guiding this design theory include component tagging (Nielsen, 1998), chunking action knowledge (Barr et al., 1992), and learning mechanism approaches (Erickson, 2013). By building on these perspectives, I developed a three-phase approach comprising the following phases: Phase 1: spurious content removal (i.e., removal of non-action knowledge); Phase 2: instruction pre-processing (e.g., the extraction of action components from instructions); and Phase 3: chunk extraction (i.e., the use of heuristics to identify triggers and cluster instructions). Further, I developed a software tool named the Semantic Procedure Analyzer (SPA) to implement this approach. Figure 6-1 shows the three-phase approach.


[Figure 6-1 content: Procedures → Spurious Content Removal (Phase 1) → Instruction Pre-processing, augmented with learning (Phase 2) → Chunk Extraction (Phase 3: Identify Triggers, Cluster Instructions) → Action Knowledge Chunks]

Figure 6-1. The three-phase approach to extracting and chunking action knowledge.

Figure 6-2 shows screen snapshots from the implementation, corresponding to the three phases. The SPA implementation, thus, realizes a “design theory” (Gregor & Jones, 2007) focused on extracting and chunking action knowledge from operator procedures.


[Figure 6-2 panels: Phase 1: Spurious Content Removal; Phase 2: Instruction Pre-processing; Phase 3: Chunk Extraction]

Figure 6-2. The SPA Interface in three phases.

Evaluation is an essential element of any design science approach and an essential structural component of any design theory (Gregor & Jones, 2007; Takeda et al., 1990). In this essay, I evaluate the reported design theory for extracting and chunking action knowledge. In particular, I evaluate the learning mechanisms aimed at improving instruction pre-processing by assessing the performance of component tagging with and without the learning mechanism. I also measure the effectiveness of chunk extraction (based on the heuristics implemented in SPA) by comparing the resulting chunks against the chunks generated by a naïve approach, which simply generates chunks of a fixed size. The objective of the research presented in this essay, therefore, is to investigate the effectiveness of outcomes generated by SPA, a tool that facilitates action knowledge extraction and chunking.
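The naïve baseline mentioned above can be illustrated with a minimal sketch. The function name and the chunk size of 7 are assumptions made only for illustration; the fixed size actually used in the evaluation is not stated here.

```python
def fixed_size_chunks(instructions, size):
    """Split a list of instructions into consecutive chunks of a fixed size.

    This is a hypothetical baseline: it ignores triggers, locations,
    and operator roles, and cuts purely by position.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return [instructions[i:i + size] for i in range(0, len(instructions), size)]


# Example: 27 instruction lines numbered 7 through 33, cut every 7 lines.
lines = list(range(7, 34))
chunks = fixed_size_chunks(lines, 7)
print([len(chunk) for chunk in chunks])
```

Because such a baseline places breaks without regard to content, it offers a simple reference point against which heuristic chunk breaks can be compared.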

I outline the work completed in the remaining seven sections of this essay. Section 2 describes the evaluation objectives and approaches, including a brief overview of evaluation in design science, and the rationale for the evaluation objectives and approaches selected for this research. Section 3 introduces the inputs to the evaluation process, i.e., the procedures gathered from industry partners, and describes how I selected the procedures for the evaluation effort.

Section 4 describes the evaluation example, which is a procedure-processing demonstration.

Section 5 describes the evaluation by a panel of experts. In Section 6, I evaluate the effectiveness of the learning mechanism in instruction pre-processing (Phase 2 of SPA). Section 7 describes the evaluation of the chunking outcomes suggested by the heuristics used to extract the action knowledge chunks (Phase 3 of SPA). Section 8 summarizes the research reported in this essay together with its implications and discusses future research directions.

6.2 Evaluation Design

Though crucial in the context of design science, evaluation is difficult because the objectives and measures are often not defined adequately at the outset of a design science implementation effort (Keen, 1978). To overcome this problem, I began by defining the evaluation objectives, based on which I then determined an appropriate approach and related strategies. The utility, quality, and efficacy of a designed artifact are all subject to an evaluation effort (Hevner et al., 2004). However, informational and technological artifacts such as SPA that implement a prescriptive design theory (Gregor & Jones, 2007) can be evaluated in terms of a


number of criteria that determine the effectiveness of the designed artifact. In this effort, I draw on a perspective that provides a useful hierarchy of objectives (Hamilton & Chervany, 1981) in order to describe the evaluation objectives. This conceptual hierarchy suggests three levels for assessing the effectiveness of the designed artifact: Level 1, which assesses the information provided by the designed artifact to users; Level 2, which assesses the use of the artifact and the effect on user organizational processes and performance; and Level 3, which assesses the effect of the artifact on organizational performance. This broad characterization is useful when considering the levels of evaluation for artifacts that are designed and deployed in organizations.

In the present research, I focus on evaluation at Level 1 to assess information provided by the designed artifact, namely, the chunking of action knowledge generated by SPA.

The selection of Level 1 for the purpose of evaluation is based on the following reasons.

First, Level 1 provides for the most direct evaluation objective rooted in the artifact’s development and operation process (compared to the other levels, both of which represent distal outcomes). Second, for the higher-level objectives, such as user and organizational performance, the artifact must be deployed in the intended environment. I believe that these activities essentially describe a possible future research cycle, such that they are beyond the scope of the research described in this essay. Therefore, the primary evaluation objective is the quality of information provided by the designed artifact, which ensures that the artifact design is feasible, that the initial design objectives can be met, and that the artifact is superior to the existing alternatives (Hevner et al., 2004). Level 1 evaluation, therefore, provides the foundation for later evaluation at the higher levels. Based on the discussion so far, I identified two evaluation objectives. These closely align with the intent of the Level 1 evaluation. Table 6-1 outlines these objectives in the context of the overall approach to extracting and chunking of action knowledge from operational procedures.


Table 6-1. Evaluation objectives and approaches.

| Overall Objective | Specific Objective | Approach |
|---|---|---|
| Effectiveness of SPA for users | Establish utility and efficacy | Descriptive illustration (Cleven et al., 2009; Hevner et al., 2004) |
| Effectiveness of SPA for users | Establish utility and efficacy | Assessment by expert panel |
| Effectiveness of SPA for users | Evaluate formalized knowledge | Assessment of heuristics for SPA. Phase 2: Compare SPA with a learning mechanism with SPA without a learning mechanism. Phase 3: Compare SPA with chunking heuristics and SPA with a naïve approach (chunking with fixed size of instructions) |

My objectives directly reflect prior work related to design science evaluation (Byiers,

Reichle, & Symons, 2012; Manning et al., 2008). More specifically, Venable et al. (Manning et al., 2008) describe several objectives for evaluating design science outcomes, two of which are addressed in the present research: (1) to establish the utility and efficacy of the artifact in regard to achieving the stated purpose and (2) to evaluate formalized knowledge about the designed artifact related to achieving its purpose.

In designing the evaluation effort, therefore, I drew on multiple approaches. First, I used an authentic operating procedure contributed by industry partners to illustrate how SPA can be used to extract and chunk action knowledge. Through this approach, it was possible to establish the feasibility and efficacy of SPA on the basis of a demonstration. Second, I collected and assessed feedback from expert operators pertaining to their exploration experience using SPA.

And, third, I traced the outcomes to the different heuristics and learning mechanisms in SPA in order to evaluate the available formalized knowledge about SPA. The foundation for these evaluation efforts was an ongoing construction of examples based on authentic procedures in order to demonstrate the efficacy of SPA.

To summarize the evaluation design, I added extra layers to Figure 6-1 to show the evaluation efforts applied to the previously designed action knowledge chunking approach for procedures (see Figure 6-3). The details of each layer of evaluation, as well as the evaluation results, are discussed in the following sections.

[Figure 6-3 content: The action knowledge extraction process (Procedures → Spurious Content Removal (Phase 1) → Instruction Pre-processing, augmented with learning (Phase 2) → Chunk Extraction: Identify Triggers, Cluster Instructions → Action Knowledge Chunks) is shown above three layers of the evaluation process, which span Input, Phase 1, Phase 2, Phase 3, and Output.]

Descriptive Illustration (Demonstration): I use an example to demonstrate how a procedure is processed by SPA and how action knowledge chunks are ultimately generated. Outcomes of each phase are also reported and analyzed.

Expert Assessment: I invite expert operators from industry to experience SPA with the task of extracting action knowledge chunks from procedures. This assessment takes place during the artifact iteration. All feedback from experts is carefully analyzed. Related changes are made to SPA development and to the later evaluation design.

Assessment on Heuristics:

In Phase 2, I compare the pre-processing results from SPA with a learning mechanism with those without a learning mechanism. The comparison shows the increasing accuracy of tags generated by SPA with a learning mechanism.

In Phase 3, I compare the chunking results from SPA with heuristics with those from SPA with a naïve approach, which chunks action knowledge with a fixed size. The comparison outcome demonstrates the higher accuracy of the heuristic approach.

Figure 6-3. Evaluation efforts for the pre-designed action knowledge chunking approach.


6.3 Operating Procedures as Inputs to Evaluation

A set of procedures contributed by the industry partners provides the input to the evaluation effort. The complete set of operator procedures comprises 194 procedures from six petrochemical plants. Table 6-2 lists the operating procedures obtained.

Table 6-2. Operating procedures obtained from petrochemical plants.

| Plant | Location | Number of Procedures | Procedures with at Least 4 Instructions |
|---|---|---|---|
| A | Unknown | 44 | 40 |
| B | Unknown | 2 | 2 |
| C | Western United States | 4 | 4 |
| L | Western United States | 127 | 85 |
| N | Eastern United States | 6 | 6 |
| S | Central Canada | 11 | 11 |

An example (see Figure 6-4) describes an operating procedure (set of instructions) for failure recovery at the Hydrocracker unit at Refinery A (anonymous).


Figure 6-4. Screenshot of an example procedure provided by an industry partner.

Procedures are often written without following industry standards or norms (each plant or organization may have its own standard). It is, therefore, difficult to convert such operating procedures into plain text and then use them as input for action knowledge chunking in SPA. One important goal of the evaluation effort, therefore, is to convert these authentic operating procedures, obtained from a number of refineries, into input for the evaluation process, and then clean them.

I used these operating procedures in a number of ways. First, I used a subset to generate multiple “descriptive illustrations” (Cleven et al., 2009; Hevner et al., 2004). Next, in order to evaluate the learning mechanism in the Phase 2 evaluation, I used two subsets of procedures: a training set and a testing set. Finally, to evaluate the effectiveness of the chunking heuristics in Phase 3, I used the same testing set as that used in Phase 2, as this set provides error-free inputs at the same level of magnitude as the datasets used for evaluation in prior studies (De, Dhar, Biswas, & Garain, 2011; Maiti, Garain, Dhar, & De, 2013).

6.4 Descriptive Illustration

The demonstration I report can be described as a “descriptive illustration” (Cleven et al.,

2009; Hevner et al., 2004). The example procedure I use for this demonstration describes a set of instructions for failure recovery at the Hydrocracker unit at Plant A. The procedure is referred to as the DHT Shutdown Procedure – Fuel Gas Failure. It comprises 33 lines of statements. The description and illustrations show the rationale and the outputs obtained by applying the heuristics. Before proceeding to the illustration, I describe the lightweight ontology along with some examples. The lightweight ontology consists of taxonomies of terms, including actions, actors, equipment, and locations obtained from subject-matter experts. Table 6-3 shows the categories with some examples in each.

Table 6-3. Lightweight ontology and example elements.

| Category | Examples |
|---|---|
| Phrases indicating title | standing instruction, DHT shutdown procedure, col 13 |
| Predicates indicating actions | close, shut down, open, block off, start, steam |
| Subjects indicating actors | insider, I, console, console operator, outsider, O, field, field operator |
| Objects indicating equipment | feed pump, damper, accumulator, Htr 30, CCU |
| Conjunctions indicating conditions | When, till, unless |
| Equipment in area 1 | fuel and pilot gas lines, Htr 29, Htr 30, electric pump |
| Equipment in area 2 | Platformer, sour water pump |

The heuristics in Phase 1 separate the procedure into three parts: meta-information about the procedure, the title of the procedure, and the body of the procedure (see Table 6-4). The first heuristic uses (a) a key phrase indicating “title” and (b) a label indicating equipment. Together, these allow part of the procedure to be marked as the title. Based on the location of the title (in this procedure, at Lines 5 and 6), the locations of the meta-information and the main body of the procedure (which contains action knowledge) are determined by Heuristics 2 and 3, respectively, based on the position of the text relative to the title. The body of the procedure comprises 27 lines of statements, i.e., operation instructions, which are subjected to heuristics in the later phases of the evaluation.

Table 6-4. An illustration showing the application of Phase 1 heuristics.

| Line | Body of Procedure | Output |
|---|---|---|
| 1 | Reference No.: DDHE – 9 | L1.meta-info = 1 |
| 2 | Page: 1 – 1 | L2.meta-info = 1 |
| 3 | Date: MM/DD/YY | L3.meta-info = 1 |
| 4 | By: xxxxx | L4.meta-info = 1 |
| 5 | STANDING INSTRUCTION NO. DDHE – 9 | L5.title = 1 |
| 6 | DHT SHUTDOWN PROCEDURE – FUEL GAS FAILURE | L6.title = 1 |
| 7 | Fuel gas failure | L7.maintext = 1 |
| 8 | Close main block on fuel and pilot gas lines | L8.maintext = 1 |
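The separation shown in Table 6-4 can be approximated with a minimal sketch. It is not the SPA implementation: the function name, the key-phrase list, and the rule that everything before the title is meta-information and everything after it is main text are simplifying assumptions drawn from the description above.

```python
# Hypothetical title-indicating phrases, loosely based on the examples
# "standing instruction" and "DHT shutdown procedure" in the ontology.
TITLE_PHRASES = ("standing instruction", "shutdown procedure")


def separate_procedure(lines, title_phrases=TITLE_PHRASES):
    """Label each line as 'meta-info', 'title', or 'maintext'.

    Lines containing a title phrase mark the title block; lines before it
    are treated as meta-information and lines after it as the body.
    """
    is_title = [any(p in line.lower() for p in title_phrases) for line in lines]
    if not any(is_title):
        # Assumption: with no recognizable title, treat everything as body.
        return ["maintext"] * len(lines)
    first = is_title.index(True)
    last = len(lines) - 1 - is_title[::-1].index(True)
    labels = []
    for i in range(len(lines)):
        if i < first:
            labels.append("meta-info")
        elif i <= last:
            labels.append("title")
        else:
            labels.append("maintext")
    return labels


procedure = [
    "Reference No.: DDHE – 9",
    "Page: 1 – 1",
    "Date: MM/DD/YY",
    "By: xxxxx",
    "STANDING INSTRUCTION NO. DDHE – 9",
    "DHT SHUTDOWN PROCEDURE – FUEL GAS FAILURE",
    "Fuel gas failure",
    "Close main block on fuel and pilot gas lines",
]
for number, label in enumerate(separate_procedure(procedure), start=1):
    print(f"L{number}.{label} = 1")
```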

In Phase 2, instructions in the body of the procedure, that is, Lines 7 through 33, are parsed using part-of-speech tagging (Charniak, 1997). The heuristics identify and tag the following action components in each instruction: predicate, subject, object, condition, etc. The process is augmented with the lightweight ontology described above. Table 6-5 illustrates the outcomes for some of the instructions.


Table 6-5. An illustration showing the application of Phase 2 heuristics.

| Line | Body of Procedure | Subject | Predicate | Object | Condition |
|---|---|---|---|---|---|
| 7 | Fuel gas failure | field | null | null | |
| 8 | Close main block on fuel and pilot gas lines | field | close | fuel and pilot gas lines | |
| 9 | stream to Htr 29 & Htr 30 fireboxes | field | steam | Htr 29 & Htr 30 | |
| 10 | Open dampers | field | open | dampers | |
| 11 | Divert stripper bottoms back to feed | field | divert | stripper bottoms | |
| 12 | Start electric pump | field | start | electric pump | |
| 13 | Circulate stripper bottoms | field | circulate | stripper bottoms | |
| 14 | Shut down power recovery turbine | field | shut down | power recovery turbine | |
| 15 | Shut down feed pump | field | shut down | feed pump | |
| 16 | If feeding USC, notify the CCU | field | notify | CCU | feeding USC |
| 17 | Cut out all USC | field | cut out | USC | |
| 18 | Close annin valve in feed line | field | close | feed line | |
| 19 | Shut down field feed pump | field | shut down | field feed pump | |
| 20 | Block off platformer | field | block off | platformer | |
| 21 | unload compressor make valves | field | unload | compressor make valves | |
| 22 | Notify the SMR that H2 is no longer needed | field | notify | SMR | |
| 23 | Shut down condensate injection | field | shut down | condensate injection | |
| 24 | Shut down sour water pump | field | shut down | sour water pump | |
| 25 | Close in warm and cold flash accumulators with normal levels | field | close | warm and cold flash accumulators | |
| 26 | Close in lean and fat DEA circulation | field | close | lean and fat DEA circulation | |
| 27 | Shut down Htr30 circulation | field | shut down | Htr30 | |
| 28 | Pump stripper bottoms back to feed | field | pump | stripper bottoms | |
| 29 | As time permits close individual fuel | field | close | individual fuel | time permits |
| 30 | close pilot burners | field | close | pilot burners | |
| 31 | Shut down feed and product inhibitor injection | field | shut down | feed and product inhibitor injection | |
| 32 | Shut down air fans | field | shut down | air fans | |
| 33 | If unable to restart unit, notify Process Manager | field | notify | Process Manager | unable to restart unit |
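The kind of ontology-assisted tagging illustrated in Table 6-5 can be sketched as follows. This is a deliberately simplified stand-in, not the SPA tagger: it uses plain string matching rather than part-of-speech tagging, treats the remainder of the instruction as the object, and assumes a default subject of "field". The predicate list combines the ontology examples with "notify", which is added here as an assumption.

```python
# Predicates from the lightweight ontology examples, plus "notify" (assumed).
PREDICATES = ["close", "shut down", "open", "block off", "start", "steam", "notify"]
# Condition markers: the ontology lists "when", "till", "unless"; "if" is assumed.
CONDITION_MARKERS = ("if", "when", "till", "unless")


def tag_instruction(text, default_subject="field"):
    """Return a dict with subject, predicate, object, and condition.

    A leading clause such as "If ...," is split off as the condition.
    The predicate is the longest known predicate that starts the rest.
    """
    lowered = text.lower().strip()
    condition = None
    first_word = lowered.split(" ", 1)[0]
    if first_word in CONDITION_MARKERS and "," in lowered:
        clause, lowered = lowered.split(",", 1)
        condition = clause.partition(" ")[2].strip() or None
        lowered = lowered.strip()
    predicate, obj = None, None
    # Try longer predicates first so multi-word forms are matched whole.
    for candidate in sorted(PREDICATES, key=len, reverse=True):
        if lowered.startswith(candidate + " "):
            predicate = candidate
            obj = lowered[len(candidate):].strip() or None
            break
    return {
        "subject": default_subject,
        "predicate": predicate,
        "object": obj,
        "condition": condition,
    }


print(tag_instruction("Shut down feed pump"))
print(tag_instruction("If unable to restart unit, notify Process Manager"))
print(tag_instruction("Fuel gas failure"))
```

An instruction with no recognizable predicate, such as "Fuel gas failure", yields a null predicate, which is the signal the Phase 3 heuristics use to recognize an event trigger.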


The heuristics in Phase 3 use the structured instructions (see Table 6-5) produced in the previous phase. Heuristic 12, event trigger identification, is triggered by the null predicate in Line 7 (see the first line of Table 6-5). Heuristic 12 sets this instruction as a procedure initiation. Heuristic 13, time-based chunking, is not triggered because none of the conditions that would indicate a time interruption are found in the instructions. Heuristic 14, location-based chunking, is triggered based on information about the locations of the units. The instructions show that the unit changes twice during the procedure. Each change triggers Heuristic 14 to set a procedure break. For example, the target objects of Lines 8 to 19 appear in the same location, area 1 (obtained from the lightweight ontology). The target object of Line 20 is not co-located with the target objects of Lines 8 to 19 (the Platformer is located in area 2). This location change causes Heuristic 14 to set a break between Lines 19 and 20. For the same reason, a chunk break is created between Lines 26 and 27. Heuristic 15 is likewise not triggered because all the instructions are carried out by the same operator role, the outside field operator. After applying the chunking heuristics, the procedure is decomposed into multiple modules, each defined as a chunk. The action knowledge chunks extracted are shown in Table 6-6.

Table 6-6. Chunks extracted from a procedure.

| Chunk (lines) | Contents | Description | Triggered Heuristic |
|---|---|---|---|
| 7 | Instruction: Fuel gas failure | Chunk 1: Start Trigger | H12 – Event Trigger Identification |
| 8 to 19 | Starting from “Close main block on fuel and pilot gas lines” to “Shut down field feed pump” | Chunk 2: 12 instructions. Primary Operator – Field Operator. Location – Feed Heater and Reactor | H14 – Location-based Chunking |
| 20 to 26 | Starting from “Block off platformer” to “Close in lean and fat DEA circulation” | Chunk 3: 7 instructions. Primary Operator – Field Operator. Location – Recycle Compressor | H14 – Location-based Chunking |
| 27 to 33 | Starting from “Shut down Htr30 circulation” to “notify Process Manager” | Chunk 4: 6 instructions. Primary Operator – Field Operator. Location – Feed Heater and Reactor | H14 – Location-based Chunking |
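The location-based breaks described for Heuristic 14 can be sketched in a few lines. This is an illustrative approximation only: the function name, the area lookup table, and the rule that an object with an unknown location stays in the current chunk are assumptions, not details of SPA.

```python
# Area lookup built from the "Equipment in area 1 / area 2" ontology examples.
AREA_OF = {
    "fuel and pilot gas lines": 1,
    "htr 29": 1,
    "htr 30": 1,
    "electric pump": 1,
    "platformer": 2,
    "sour water pump": 2,
}


def chunk_by_location(instructions, area_of=AREA_OF):
    """Group (line_number, target_object) pairs into chunks of line numbers.

    A new chunk starts whenever the known area of the target object differs
    from the area of the chunk in progress. Objects missing from the lookup
    are assumed to belong to the current area.
    """
    chunks, current, current_area = [], [], None
    for line_number, target in instructions:
        area = area_of.get(target.lower())
        if area is not None:
            if current and current_area is not None and area != current_area:
                chunks.append(current)
                current = []
            current_area = area
        current.append(line_number)
    if current:
        chunks.append(current)
    return chunks


sample = [
    (8, "fuel and pilot gas lines"),
    (12, "electric pump"),
    (19, "field feed pump"),
    (20, "platformer"),
    (24, "sour water pump"),
    (27, "Htr 30"),
    (32, "air fans"),
]
print(chunk_by_location(sample))
```

On this abbreviated sample the sketch places breaks before Lines 20 and 27, which matches the two location changes reported for the example procedure.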


The illustration provides prima facie validation of the proposed approach by showing the ability of SPA to parse the 33 lines of statements in the procedure in order to extract four action knowledge chunks (see Table 6-6).

6.5 Evaluation by a Panel of Experts

Similar to research and evaluation protocols used in other contexts, a panel of experts was recruited for this evaluation (Cooper et al., 2010). For the specific and specialized tool (SPA) that I designed for this study, the use of a panel of experts constitutes an appropriate approach to evaluation because in the context of a controlled experiment an uninformed pool of subjects would be unlikely to bring the required background knowledge to the task. An evaluation by a panel of experts—although appropriate for overcoming limitations such as a lack of knowledgeable subjects for a lab experiment—can suffer from practical limitations in regard to the number of experts available and willing to participate and the amount of time each expert can give to the project. I recruited five expert operators from Plant N for this assessment. Each operator had at least seven years’ experience in the refinery industry. Table 6-7 shows the background information of the experts who participated in the panel.

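Table 6-7. Background of expert operators.

| | Expert A | Expert B | Expert C | Expert D | Expert E | Average |
|---|---|---|---|---|---|---|
| Experience (y) | 20 | 17 | 7 | 10 | 19 | 14.6 |
| Operation Role | Console/Field* | Console/Field* | Console/Field* | Console/Field* | Console/Field* | |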

*Console/Field: The operator performs the console and field roles in rotation.


6.5.1 Evaluation Process with the Expert Panel

Each expert was invited to use SPA to extract action knowledge chunks from a procedure of their choice. The evaluation session with each expert operator consisted of the following steps.

First, an example of how the SPA tool chunks a procedure was shown to the expert. The features of the tool were explained, and the tool was described as a mechanism for managing existing procedures. The expert then selected an operating procedure that he/she was familiar with to use with SPA. Then the expert loaded and processed the selected operating procedure through the phases in SPA, and finally, extracted action knowledge chunks from the procedure (see Figure 6-1). For an operating procedure with hundreds of instructions, the entire sequence of tasks, from loading the operating procedure through the multiple phases, took a considerable amount of time for each expert (see Table 6-8). The SPA platform was extended to record the process, including the recommendations from the tool and any modifications made by the expert to the outputs of each phase. At the end of the process, a debriefing interview with each expert took place.

Table 6-8. Results of expert assessment.

| | Expert A | Expert B | Expert C | Expert D | Expert E | Average |
|---|---|---|---|---|---|---|
| Length of procedure (line) | 20 | 58 | 26 | 58 | 58 | 44 |
| Evaluation time (min) | 68 | 40 | 58 | 61 | 70 | 59 |

6.5.2 Results from the Expert Panel Evaluation

The performance measurement used in this evaluation refers to natural language processing because operation procedures are created with natural language and because part-of-speech tagging plays an important role in processing of this nature. There are many common performance measures for natural language processing, including recall, precision, and the F score (Manning & Schütze, 1999). Recall is the percentage of true instances of C correctly labeled for a specific category C:

\[ \text{Recall} = \frac{\#\,\text{correctly labeled as } C}{\#\,\text{true instances of } C} \]

Precision is the percentage of instances assigned the label C that are correctly labeled for a specific category C:

\[ \text{Precision} = \frac{\#\,\text{correctly labeled as } C}{\#\,\text{labeled as } C} \]

And, the F score is the harmonic mean of recall and precision:

\[ F = \frac{2 \cdot \text{Recall} \cdot \text{Precision}}{\text{Recall} + \text{Precision}} \]

In the present research, the accuracy of the processing results is calculated as the precision rate:

\[ \text{Accuracy} = \frac{\#\,\text{suggestions confirmed by the expert}}{\#\,\text{suggestions created by SPA}} \tag{1} \]

This formula is appropriate because precision concentrates the evaluation on the correct labels by asking what percentage of the labeled items is correct and how many errors have been returned (Manning et al., 2008). The denominator, the total number of suggestions created by SPA, captures both the confirmed and the revised labels, which means that the formula can return accurate results. Given the same denominator, as the numerator (the number of confirmed items) increases, so too does the accuracy quotient. Conversely, given the same numerator, as the denominator (the total number of suggested items) increases, more errors are contained, such that the accuracy quotient decreases. Recall alone, by contrast, is not enough to reflect the processing results, as it is trivial to achieve recall of 100% by assigning every item to a specific category.
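The three measures, and the reason recall alone is insufficient, can be made concrete with a small sketch. The function name and the item identifiers are hypothetical and serve only to illustrate the definitions.

```python
def precision_recall_f(labeled_as_c, true_c):
    """Compute precision, recall, and F score for one category C.

    labeled_as_c: items the system labeled as C.
    true_c: items that truly belong to C.
    """
    labeled_as_c, true_c = set(labeled_as_c), set(true_c)
    correct = len(labeled_as_c & true_c)
    precision = correct / len(labeled_as_c) if labeled_as_c else 0.0
    recall = correct / len(true_c) if true_c else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical data: 10 items, of which 4 truly belong to category C.
all_items = range(10)
true_c = {0, 1, 2, 3}

# Labeling every item as C reaches full recall but poor precision.
precision, recall, f_score = precision_recall_f(all_items, true_c)
print(f"precision={precision:.2f} recall={recall:.2f} F={f_score:.2f}")
```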

A summary of the evaluation results is shown in Table 6-9. The results for each expert are listed in the first five columns, with the average shown in the last column.


Table 6-9. Results of expert assessment.

| Processing Results (Accuracy %) | Expert A | Expert B | Expert C | Expert D | Expert E | Average |
|---|---|---|---|---|---|---|
| Phase 1 | 100 | 100 | 100 | 100 | 100 | 100 |
| Phase 2 | 54.2 | 58.7 | 48.9 | 58.7 | 58.7 | 55.8 |
| Phase 3 | 33.3 | - ¹ | 66.7 | 60.0 | 66.7 | 56.7 |

¹ Expert B was unable to complete the evaluation in the last phase due to an emergency.

The results show that SPA can successfully identify all the procedure content (Phase 1) with 100% accuracy. In contrast, the accuracy of the results from Phase 2 was about 56%. A more detailed analysis of the results shows that when multiple experts (B, D, and E) selected the same procedure, they found the same errors in the outputs, which suggests a need to add learning mechanisms in order to improve the performance of SPA. The outcomes of Phase 3 tell a slightly different story. Here, the results show that even for the same procedure, the experts sometimes differed in regard to their respective opinions about the best way to chunk action knowledge. For example, for the same procedure, experts B, D, and E did not chunk the action knowledge in the same way. The process started with the action knowledge chunks generated by SPA, which were the same. Experts D and E, however, had differing opinions on these action knowledge chunks

(Expert B was unable to complete this evaluation in Phase 3). Expert D arrived at ten chunk breaks, of which six were suggested by SPA (yielding accuracy of 60%). For the same procedure,

Expert E arrived at twelve action knowledge chunk breaks, of which eight were suggested by

SPA (yielding accuracy of 66.7%). This difference suggests that it is difficult and unnecessary to seek “correct” chunking results. Instead, an alternative measure would be the acceptance of chunks by users, such that a precision rate (Manning & Schütze, 1999) could be employed. This was the measure used to evaluate the chunking results for the final evaluation in Phase 3. The evaluation aimed at Phases 2 and 3, respectively, is described next.
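The accuracy figures reported for Phase 3 follow directly from the ratio of confirmed to total items. The short sketch below recomputes them; the function name is hypothetical, and the counts are the ones stated in the text and in Table 6-9.

```python
def accuracy_percent(confirmed, total):
    """Accuracy as the percentage of items confirmed, rounded to one decimal."""
    return round(100 * confirmed / total, 1)


# Expert D: six of ten chunk breaks were suggested by SPA.
print(accuracy_percent(6, 10))
# Expert E: eight of twelve chunk breaks were suggested by SPA.
print(accuracy_percent(8, 12))

# Phase 3 average over the four experts who completed the phase.
phase3 = [33.3, 66.7, 60.0, 66.7]
print(round(sum(phase3) / len(phase3), 1))
```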


6.6 Evaluating the Learning Mechanisms in Phase 2

For this phase, instruction pre-processing, the evaluation effort is aimed at understanding how the learning mechanisms improve the performance of tagging action components.

6.6.1 Evaluation Process

The metric I use to measure the dependent variable—the accuracy of the action components tagged by SPA—is the precision rate (Manning & Schütze, 1999), which was also used in the previous sub-section with the expert panel. The evaluation is structured as a comparison between (1) a version of SPA enhanced with the learning mechanism, and (2) a version of SPA without the learning mechanism. Using the same software platform for comparison, instead of two tools, contributes to the internal validity of the experimental results

(Müller-Wienbergen, Müller, Seidel, & Becker, 2011).

Prior research suggests two dominant evaluation approaches to human-assessed measurement: user-centered evaluation and expert-based evaluation (Scholtz, 2004). A user- centered method maximizes user involvement but tends to be expensive and time-consuming because of the large number of participants required. In contrast, expert-based methods can be less expensive but require access to key experts (Nielsen, 1994). The second method also provides insights that novice users cannot provide. Following the above insights and examples such as those from other empirical studies of text analysis and knowledge extraction (Savova et al., 2010), I followed an expert-based method, relying on domain experts from the petrochemical industry. This choice has other implications. A limited number of experts can reduce the impact of individual difference on the final results. It is, therefore, important to recruit experts who can strengthen the validity of the evaluation. Following recommendations from prior research that an

operator must have at least 7 years’ experience to be considered an expert (Klein & Hoffman, 1993), I recruited experts who fit that criterion for this evaluation.

Evaluation Input

The input to the evaluation effort consisted of 85 authentic procedures contributed by the refineries. The selection of these procedures was dictated by the following rules: (1) the set of procedures must share similar terminology, e.g., based on their origin from the same refinery or organization, and (2) the procedures must be sufficiently long such that they provide sufficient action knowledge content for the evaluation. The first criterion was important because procedure writers (without clear and enforced standards) might otherwise use multiple terms to describe the same entity or phenomenon. For example, “Heaters 29, 30,” “Heaters 29 and 30,” “Htrs. 29 & 30,” and “Heater 29 and Heater 30” could all be terms used to refer to Heaters 29 and 30. In another example, terms such as “water pump,” “the red pump,” and “S1305” could all be used to mean the same object, i.e., a pump that controls water in a specific situation. Without knowledge of these contractions, substitutions, or synonyms, the process of tagging action components can be very difficult (Wu & Weld, 2010). A second input into the evaluation effort, therefore, was an initial taxonomy of these terms obtained from domain experts.

The final reason for selecting 85 procedures as inputs into the evaluation process was that a large number of training cases is required to assess the performance of learning mechanisms.

Several studies (Freund & Schapire, 1996; Pang, Lee, & Vaithyanathan, 2002; Puppe, 1998) offer guidelines outlining the amount of data needed, which is often near 100 instances depending on the context. Following these recommendations, I used 85 procedures that qualified (based on the criteria above) for the evaluation effort. This total set of 85 procedures was separated into two subsets for evaluation. The first was a training set (50 procedures). This set of procedures was


used to train the learning mechanism in SPA. The second set (35 procedures) was then used to evaluate the performance of SPA. This decision to define the training set as 50 procedures and the testing set as 35 procedures aligned with suggestions in previous research (Bramer, 2007), which argues for a random division into two sets in proportions such as 1:1, 2:1, 70:30, or 60:40. In our case, the ratio of the number of procedures in the training set versus the testing set was 50 to 35, i.e., 59% for training and 41% for testing.
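A random division of the 85 procedures into a training set of 50 and a testing set of 35 can be sketched as follows. The function name, the use of integer identifiers for procedures, and the fixed seed are assumptions for illustration; how the actual assignment was drawn is not described here.

```python
import random


def split_procedures(procedure_ids, n_train, seed=0):
    """Randomly divide procedures into a training set and a testing set.

    The seed is fixed only so the illustration is repeatable.
    """
    shuffled = list(procedure_ids)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


training_set, testing_set = split_procedures(range(85), n_train=50)
share = round(100 * len(training_set) / 85)
print(len(training_set), len(testing_set), f"{share}% / {100 - share}%")
```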

Evaluation Procedure

The evaluation procedure consisted of the following steps. First, I loaded the 50 procedures in the training set into SPA. As SPA automatically tagged each operating procedure, I examined the tags generated by SPA, corrected any errors, and refined any suggestions from SPA. The learning mechanisms in SPA recorded these modifications so that they could influence the performance of the tagging mechanisms in later cycles. The learning mechanisms used, thus, constituted supervised learning (Nielsen, 1998) and reinforcement learning (Muggleton & de Raedt, 1994). In other words, when SPA made a suggestion to tag a term with a certain category (e.g., the term “close” tagged as “Predicate”) and that suggestion was not challenged by the user, the mapping between the term and the category was reinforced. When SPA made a suggestion to tag a term (e.g., the term “pump” was tagged as “Verb”), which was then modified by the user (e.g., the user changed the “Verb” categorization of the term “pump” to the “Object” categorization), the information provided a supervised learning opportunity. In both cases, I relied on learning algorithms and parameters from the Stanford Log-linear Part-Of-Speech Tagger (Stanford POS Tagger), a software tool that reads text and assigns a part of speech (e.g., noun, verb, and adjective) to each word (Toutanova & Manning, 2000). The default thresholds from the


tool were retained in order to allow the classification of a term to be changed following modifications suggested by the user.
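The two feedback paths described above (reinforcing an unchallenged suggestion and learning from a user correction) can be illustrated with a minimal sketch. It is not SPA's implementation, which relied on the algorithms and parameters of the Stanford POS Tagger; the class `TagMemory`, its count-based scoring, and its method names are assumptions introduced only for illustration.

```python
from collections import defaultdict


class TagMemory:
    """Hypothetical store of term-to-category evidence (not SPA's actual code)."""

    def __init__(self):
        # counts[term][category] = evidence accumulated from user feedback
        self.counts = defaultdict(lambda: defaultdict(int))

    def suggest(self, term, default=None):
        """Return the category with the most accumulated evidence for the term."""
        seen = self.counts[term.lower()]
        if not seen:
            return default
        return max(seen, key=seen.get)

    def confirm(self, term, category):
        """Unchallenged suggestion: reinforce the existing mapping."""
        self.counts[term.lower()][category] += 1

    def correct(self, term, suggested, corrected):
        """User correction: weaken the old mapping and record the new one."""
        seen = self.counts[term.lower()]
        seen[suggested] = max(0, seen[suggested] - 1)
        seen[corrected] += 1


memory = TagMemory()
memory.confirm("close", "Predicate")      # suggestion left unchanged by the user
memory.correct("pump", "Verb", "Object")  # user changed "Verb" to "Object"
print(memory.suggest("close"), memory.suggest("pump"))
```

In this sketch a later procedure containing “pump” would receive the “Object” suggestion, which mirrors the idea that each processed procedure influences the tagging of the next one.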

A testing session followed with the help of a domain expert. This started with introducing SPA to the domain expert and then allowing her an opportunity to familiarize herself with and practice the SPA tool. During the session, she processed 35 procedures (in the testing set) using SPA. This session comprised Phase 1: knowledge separation and Phase 2: instruction pre-processing. It is important to note that as each operating procedure in this set was processed, it contributed to the learning mechanisms in SPA. For example, any errors that were observed and corrected by the user were allowed to influence both reinforcement and supervised learning. This meant that the 51st procedure benefited from the learning that had occurred in the 50 procedures prior, the 52nd procedure benefited from the learning that had occurred in the previous 51 procedures, and so on. The decision to continue the training in this manner was made for two reasons. First, I used all 85 procedures both with and without the learning mechanisms. The cumulative results, therefore, could still be compared in order to evaluate the contributions of the learning mechanisms. Second, although a controlled stop to learning (e.g., learning occurring with the first 50 procedures but not with the next 35) would have allowed me to infer what had been learned with just the training set, it would not have reflected the anticipated usage mode (where each procedure would contribute to learning). Further, as the procedures were varied, it could be argued that across some boundaries, such learning would, in fact, reduce performance. These were some of the considerations that led to the decision to conduct the “testing session” as a natural continuation of the “training session.”

The SPA tool has a “save” function that generates a well-formed XML file of procedure-processing results. The domain expert could save the SPA-generated output in Phase 2 as an XML file by clicking the “save” button. She then examined the action component tags, modified them where necessary, and clicked the “save” button again. After the domain expert had


processed all 35 procedures, I manually compared the outputs from the two XML files, the file saved before the expert’s modification and the file saved after the expert’s modification in Phase 2. Tags that were the same in the two XML files were recognized as “tags processed by SPA without user modification,” whereas all the tags saved in the first XML file, i.e., before the expert’s modification, were recognized as “tags suggested by SPA.” The evaluation procedure is shown in Figure 6-5.
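The comparison of the two saved files was carried out manually, but the counting rule can be sketched in code. The XML layout below (a `<tag>` element with a `category` attribute) is hypothetical, since the structure of the files produced by SPA is not described here; only the rule for counting “tags suggested by SPA” and “tags processed by SPA without user modification” follows the text.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical layout: each <tag> holds a category and the tagged phrase.
BEFORE = """<procedure>
  <tag category="Predicate">verify</tag>
  <tag category="Predicate">steam flow</tag>
  <tag category="Object">FV1102</tag>
</procedure>"""

AFTER = """<procedure>
  <tag category="Predicate">verify</tag>
  <tag category="Object">steam flow</tag>
  <tag category="Object">FV1102</tag>
</procedure>"""


def read_tags(xml_text):
    """Return a multiset of (category, phrase) pairs from one saved file."""
    root = ET.fromstring(xml_text)
    return Counter(
        (t.get("category"), (t.text or "").strip()) for t in root.iter("tag")
    )


def compare(before_xml, after_xml):
    """Count the tags suggested by SPA and those left unmodified by the expert."""
    suggested = read_tags(before_xml)
    unmodified = suggested & read_tags(after_xml)  # multiset intersection
    return sum(suggested.values()), sum(unmodified.values())


n_suggested, n_unmodified = compare(BEFORE, AFTER)
print(n_suggested, n_unmodified)  # 3 2
```

A multiset is used instead of a plain set so that a phrase tagged identically several times in one procedure is counted each time.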

[Figure 6-5: flow diagram. Recoverable labels: Training (Input: 50 SOP, SPA), Testing (Input: 35 SOP, Trained SPA), Phase 1 (Main Content), Phase 2 (Action Components).]

Figure 6-5. Evaluation procedure of SPA with learning mechanisms.

Next, the evaluation procedure was repeated with the same set of procedures for SPA without learning mechanisms.

6.6.2 Analysis of Results

The data obtained were analyzed to assess accuracy and to identify and understand any errors that remained. The accuracy of Phase 2 was computed as the recall of action components tagged by SPA (Manning & Schütze, 1999), which is the same formula used


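in Phase 2 of the Expert Panel Evaluation. The accuracy rate was computed as the number of correct action components tagged by SPA divided by the total number of action components, with the following formula:

$$\text{Accuracy} = \frac{\text{Number of correct action components tagged by SPA}}{\text{Total number of action components}} \times 100\% \tag{2}$$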

In order to establish the effectiveness of the learning mechanisms, the above metric was used in a comparison of the performance of SPA with learning mechanisms with the performance of SPA without learning mechanisms. The errors that remained in Phase 2 (instruction pre-processing) were analyzed by referring to prior research, in which two types of errors generated in natural language processing are identified: (a) the decomposition of a sentence into incorrect phrases and (b) the assignment of an incorrect tag to a phrase (Manning, 1999). The results, a comparison of the error rate of SPA with learning mechanisms with that of SPA without learning mechanisms, together with the types of errors identified, are discussed next.

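To ensure that the results would be reported in a cumulative way, I coined the phrase “Cumulative Accuracy” as a measure of the cumulative results from the first procedure to the current procedure. For example, the cumulative accuracy for the first 10 procedures would constitute the accuracy achieved by the first 10 procedures taken in their entirety. I used the following formula for this computation:

$$\text{Cumulative Accuracy}(i) = \frac{\text{Number of tags processed by SPA without user modification from Procedure 1 to Procedure } i}{\text{Total number of tags suggested by SPA from Procedure 1 to Procedure } i} \times 100\% \tag{3}$$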


In this formulation, i denotes the current i-th procedure.

As an example, in the first training session, which consisted of five procedures, 224 action components were generated by SPA. Of these, 124 action components survived (and were retained as suggested by SPA) without any modification. The remaining 100 action components were either modified or removed by the user. The cumulative accuracy of the first five procedures in the training set was, therefore, computed at 55.36%. This computation continued as I calculated the cumulative accuracy for the procedures through the training and testing sessions.

I also computed a measure of incremental accuracy, which enabled me to observe the effect of prior learning on the new set of procedures being processed. As an example, the incremental accuracy for procedures 36–40 was computed following the same mechanism (the number of action components correctly identified and tagged by SPA as a fraction of the total number of action components identified and tagged by SPA). However, this was only done for those five procedures, such that the counts for the 35 procedures already processed were disregarded. In doing so, I was able to isolate the increasingly important effects of learning mechanisms on the later procedures.
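The difference between the two measures can be shown with a short sketch. The first cluster uses the counts reported above (124 of 224 action components retained without modification); the second cluster's counts are invented for illustration, and the function names are assumptions rather than part of SPA.

```python
def accuracy(unmodified, suggested):
    """Percentage of suggested tags that the user left unmodified."""
    return 100.0 * unmodified / suggested


def incremental_accuracy(clusters):
    """Accuracy of each 5-procedure cluster taken on its own."""
    return [round(accuracy(u, s), 2) for u, s in clusters]


def cumulative_accuracy(clusters):
    """Accuracy over all procedures from the first up to each cluster."""
    results = []
    total_unmodified = total_suggested = 0
    for unmodified, suggested in clusters:
        total_unmodified += unmodified
        total_suggested += suggested
        results.append(round(accuracy(total_unmodified, total_suggested), 2))
    return results


# (unmodified tags, suggested tags) per cluster of five procedures.
# Cluster 1 comes from the text; cluster 2 is a made-up example.
clusters = [(124, 224), (150, 250)]
print(incremental_accuracy(clusters))  # [55.36, 60.0]
print(cumulative_accuracy(clusters))   # [55.36, 57.81]
```

Because the cumulative figure pools all earlier counts, a gain in a late cluster moves it only slightly, which is why the incremental figure is the more sensitive indicator of learning.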

Based on the arguments provided earlier (because the learning continued throughout the training and testing sessions), it could be argued that there is no clear separation between the two sets. Although this is true, I argue that this lack of a clear separation accurately reflects the anticipated usage mode. To ensure that the results provide a window into the incremental learning with the addition of each step (a 5-procedure cluster), the results given next show both the total cumulative accuracy and the accuracy achieved for each incremental set of five procedures (see Tables 6-10A, 6-10B, 6-10C, and 6-10D).


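Table 6-10A. Accuracy obtained with SPA without learning mechanisms (50 procedures).

| Evaluation Process | Procedure Set | Incremental Accuracy | Procedure Set | Cumulative Accuracy |
|---|---|---|---|---|
| Training | 1–5 | 55.36% | 1–5 | 55.36% |
| Training | 6–10 | 55.36% | 1–10 | 55.36% |
| Training | 11–15 | 55.56% | 1–15 | 55.46% |
| Training | 16–20 | 60.56% | 1–20 | 56.13% |
| Training | 21–25 | 66.38% | 1–25 | 57.21% |
| Training | 26–30 | 56.80% | 1–30 | 57.17% |
| Training | 31–35 | 59.30% | 1–35 | 57.42% |
| Training | 36–40 | 61.13% | 1–40 | 57.63% |
| Training | 41–45 | 52.47% | 1–45 | 57.32% |
| Training | 46–50 | 52.35% | 1–50 | 57.17% |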

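Table 6-10B. Accuracy obtained with SPA with learning mechanisms (50 procedures).

| Evaluation Process | Procedure Set | Incremental Accuracy | Procedure Set | Cumulative Accuracy |
|---|---|---|---|---|
| Training | 1–5 | 55.36% | 1–5 | 55.36% |
| Training | 6–10 | 55.36% | 1–10 | 55.36% |
| Training | 11–15 | 56.82% | 1–15 | 56.31% |
| Training | 16–20 | 65.67% | 1–20 | 57.82% |
| Training | 21–25 | 76.91% | 1–25 | 61.72% |
| Training | 26–30 | 79.81% | 1–30 | 63.48% |
| Training | 31–35 | 79.73% | 1–35 | 65.37% |
| Training | 36–40 | 68.38% | 1–40 | 65.52% |
| Training | 41–45 | 70.44% | 1–45 | 65.82% |
| Training | 46–50 | 84.96% | 1–50 | 66.41% |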

Table 6-10C. Accuracy obtained with SPA without learning mechanisms (85 procedures).

| Evaluation Process | Procedure Set | Incremental Accuracy | Procedure Set | Cumulative Accuracy |
|---|---|---|---|---|
| Training | 46–50 | 52.35% | 1–50 | 57.17% |
| Testing | 51–55 | 68.94% | 1–55 | 57.94% |
| Testing | 56–60 | 56.98% | 1–60 | 57.83% |
| Testing | 61–65 | 54.87% | 1–65 | 57.52% |
| Testing | 66–70 | 57.05% | 1–70 | 57.47% |
| Testing | 71–75 | 57.04% | 1–75 | 57.43% |
| Testing | 76–80 | 56.11% | 1–80 | 57.27% |
| Testing | 81–85 | 52.08% | 1–85 | 57.01% |


Table 6-10D. Accuracy obtained with SPA with learning mechanisms (85 procedures).

| Evaluation Process | Procedure Set | Incremental Accuracy | Procedure Set | Cumulative Accuracy |
|---|---|---|---|---|
| Training | 46–50 | 84.96% | 1–50 | 66.41% |
| Testing | 51–55 | 74.98% | 1–55 | 66.88% |
| Testing | 56–60 | 79.43% | 1–60 | 68.32% |
| Testing | 61–65 | 78.09% | 1–65 | 69.34% |
| Testing | 66–70 | 85.59% | 1–70 | 71.07% |
| Testing | 71–75 | 68.77% | 1–75 | 70.85% |
| Testing | 76–80 | 73.11% | 1–80 | 71.12% |
| Testing | 81–85 | 71.94% | 1–85 | 71.17% |

A comparison of the accuracy of the action component tags produced by SPA with learning mechanisms and those produced by SPA without learning mechanisms is shown in Figure 6-6.

[Figure 6-6A: line chart. x-axis: Number of Procedures (5 to 50); y-axis: Accuracy (0% to 100%); series: Incremental - with learning, Cumulative - with learning, Incremental - without learning, Cumulative - without learning.]

Figure 6-6A. A comparison of the accuracy rate of SPA with learning mechanisms and SPA without learning mechanisms for training data.


[Figure 6-6B: line chart. x-axis: Number of Procedures (50 to 85); y-axis: Accuracy (0% to 100%); series: Incremental - with learning, Cumulative - with learning, Incremental - without learning, Cumulative - without learning.]

Figure 6-6B. A comparison of the accuracy rate of SPA with learning mechanisms and SPA without learning mechanisms for testing data.

The cumulative accuracy of SPA without learning mechanisms (see Figures 6-6A and 6-6B) is stable at about 57%. This result indicates that the tool achieves an average accuracy of 57% without any training. In contrast, the cumulative accuracy of SPA with learning mechanisms rises from 55% to 71%, ending substantially higher than the cumulative accuracy obtained when the learning mechanisms are not used. This improvement is the result of including the learning mechanisms with the SPA tool. With the learning mechanisms, SPA was able to modify the Stanford Part-of-Speech Tagger entries in the database (the probability of a specific tag for a specific word and the relative probability of the current sequence of tags in English; see footnote 9) and tag the later procedures with the updated ontology. Therefore, it achieved a higher level of accuracy than was possible with SPA without learning mechanisms.

The comparison between the two lines reflects the effectiveness of the learning mechanism in regard to increasing the accuracy of SPA for action component tagging. However, for many learning systems, it is impossible to achieve 100% accuracy. This is because the

9 http://cs.stanford.edu/people/eroberts/courses/soco/projects/2004-05/nlp/techniques_speech.html


learning algorithms and implementations try to achieve human-level performance on certain tasks, even though there are always disagreements between humans (see footnote 10). The maximum degree to which two humans can agree is called the ceiling (Menzies et al., 2008), and it is also the best performance that a tool can be expected to reach (e.g., the tool will achieve a maximum accuracy of 57% without a learning mechanism). In the petrochemical industry, two or more expert operators may never fully agree with each other in regard to determining the most accurate action component tags. For the same reason, the performance of SPA with learning mechanisms, though it does approach the ceiling, will remain below it. Based on the set of procedures provided by the refineries, I found that the performance of SPA improved substantially, with cumulative tagging accuracy increasing from 55% to 71%. Further, I argue that an examination of the incremental accuracy of SPA without learning mechanisms suggests that there may not be much more to gain in terms of performance improvement. However, it would be difficult to subject that statement to empirical investigation.

Such analyses belong in the context of field studies that are beyond the scope of the present study.

6.6.3 Error Analysis

In addition to an evaluation of the incremental and cumulative performance of SPA on the basis of learning mechanisms, I also analyzed the specific errors identified. Following prior research, I separated the errors into two categories: (a) the decomposition of a sentence into incorrect phrases and (b) the assignment of an incorrect tag to a phrase (Manning, 1999). I refer to these as parsing errors and tagging errors, respectively.

Parsing errors are generated when the tool incorrectly separates words in a sentence such that incorrect phrases are the result (Tsuruoka et al., 2005). This category of errors occurs at the initial pre-processing stage during which sentences are separated into single words or atomic

10 http://tryolabs.com/Blog/2012/07/04/thoughts-when-considering-machine-learning-project/


multiword phrases. An example of a parsing error is a phrase extracted from an original SOP sentence as “verify compressors C-101A stopped” when the correct phrase separation is “verify” and “compressors C-101A stopped.” A tagging error refers to instances in which a phrase is correctly separated but tagged with an incorrect annotation (Toutanova & Manning, 2000; Tsuruoka et al., 2005). For example, in the evaluation, for the sentence “Verify steam flow through FV1102,” the noun phrase “steam flow” was mistakenly tagged as a verb. I calculated the parsing and tagging error rates by using the following formulas, which are the cumulative results from the first to the current procedure:

$$\text{Cumulative Rate of Parsing Error}(i) = \frac{\text{Number of tags suggested by SPA with parsing error from Procedure 1 to Procedure } i}{\text{Total number of tags suggested by SPA from Procedure 1 to Procedure } i} \times 100\% \tag{4}$$

$$\text{Cumulative Rate of Tagging Error}(i) = \frac{\text{Number of tags suggested by SPA with tagging error from Procedure 1 to Procedure } i}{\text{Total number of tags suggested by SPA from Procedure 1 to Procedure } i} \times 100\% \tag{5}$$

In these formulas, i denotes the current i-th procedure.

Then the parsing error and tagging error rates generated by SPA with learning mechanisms and by SPA without learning mechanisms were reported and analyzed. First, I list the parsing error rates generated by SPA with learning mechanisms and by SPA without learning mechanisms (see Tables 6-11A, 6-11B, 6-11C, and 6-11D).


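Table 6-11A. Parsing errors generated by SPA without learning mechanisms (50 procedures).

| Evaluation Process | Procedure Set | Incremental Rate | Procedure Set | Cumulative Rate |
|---|---|---|---|---|
| Training | 1–5 | 40.63% | 1–5 | 40.63% |
| Training | 6–10 | 39.30% | 1–10 | 39.63% |
| Training | 11–15 | 35.82% | 1–15 | 38.30% |
| Training | 16–20 | 34.25% | 1–20 | 35.76% |
| Training | 21–25 | 29.36% | 1–25 | 34.90% |
| Training | 26–30 | 27.69% | 1–30 | 34.69% |
| Training | 31–35 | 25.58% | 1–35 | 34.50% |
| Training | 36–40 | 24.26% | 1–40 | 34.46% |
| Training | 41–45 | 41.58% | 1–45 | 34.81% |
| Training | 46–50 | 32.14% | 1–50 | 35.04% |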

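Table 6-11B. Parsing errors generated by SPA with learning mechanisms (50 procedures).

| Evaluation Process | Procedure Set | Incremental Rate | Procedure Set | Cumulative Rate |
|---|---|---|---|---|
| Training | 1–5 | 40.63% | 1–5 | 40.63% |
| Training | 6–10 | 39.30% | 1–10 | 39.63% |
| Training | 11–15 | 35.34% | 1–15 | 37.92% |
| Training | 16–20 | 31.71% | 1–20 | 36.21% |
| Training | 21–25 | 19.61% | 1–25 | 34.71% |
| Training | 26–30 | 16.35% | 1–30 | 32.93% |
| Training | 31–35 | 17.72% | 1–35 | 31.16% |
| Training | 36–40 | 25.64% | 1–40 | 30.92% |
| Training | 41–45 | 22.18% | 1–45 | 30.39% |
| Training | 46–50 | 13.86% | 1–50 | 29.89% |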

Table 6-11C. Parsing errors generated by SPA without learning mechanisms (35 procedures).

| Evaluation Process | Procedure Set | Incremental Rate | Procedure Set | Cumulative Rate |
|---|---|---|---|---|
| Training | 46–50 | 32.14% | 1–50 | 35.04% |
| Testing | 51–55 | 29.61% | 1–55 | 34.24% |
| Testing | 56–60 | 34.96% | 1–60 | 34.33% |
| Testing | 61–65 | 36.93% | 1–65 | 33.97% |
| Testing | 66–70 | 32.86% | 1–70 | 33.80% |
| Testing | 71–75 | 33.53% | 1–75 | 33.62% |
| Testing | 76–80 | 36.91% | 1–80 | 33.42% |
| Testing | 81–85 | 32.64% | 1–85 | 33.63% |


Table 6-11D. Parsing errors generated by SPA with learning mechanisms (35 procedures).

| Evaluation Process | Procedure Set | Incremental Rate | Procedure Set | Cumulative Rate |
|---|---|---|---|---|
| Training | 46–50 | 13.86% | 1–50 | 29.89% |
| Testing | 51–55 | 22.95% | 1–55 | 29.39% |
| Testing | 56–60 | 19.28% | 1–60 | 28.23% |
| Testing | 61–65 | 18.84% | 1–65 | 27.25% |
| Testing | 66–70 | 13.30% | 1–70 | 25.77% |
| Testing | 71–75 | 25.48% | 1–75 | 25.74% |
| Testing | 76–80 | 22.87% | 1–80 | 25.40% |
| Testing | 81–85 | 21.62% | 1–85 | 25.21% |

A comparison of the parsing error rates produced by SPA with learning mechanisms and by SPA without learning mechanisms is shown in Figure 6-7.

[Figure 6-7A: line chart. x-axis: Number of Procedures (5 to 50); y-axis: Rate of Parsing Errors (0% to 50%); series: Incremental - with learning, Cumulative - with learning, Incremental - without learning, Cumulative - without learning.]

Figure 6-7A. A comparison of the parsing error rates of SPA with learning mechanisms and SPA without learning mechanisms for training data.


[Figure 6-7B: line chart. x-axis: Number of Procedures (50 to 85); y-axis: Rate of Parsing Errors (0% to 40%); series: Incremental - with learning, Cumulative - with learning, Incremental - without learning, Cumulative - without learning.]

Figure 6-7B. A comparison of the parsing error rates of SPA with learning mechanisms and SPA without learning mechanisms for testing data.

Next, the incremental and cumulative tagging errors of SPA with learning mechanisms and SPA without learning mechanisms are reported and analyzed in Tables 6-12A, 6-12B, 6-12C, and 6-12D.

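Table 6-12A. Tagging errors generated by SPA without learning mechanisms (50 procedures).

| Evaluation Process | Procedure Set | Incremental Rate | Procedure Set | Cumulative Rate |
|---|---|---|---|---|
| Training | 1–5 | 4.01% | 1–5 | 4.01% |
| Training | 6–10 | 5.34% | 1–10 | 5.01% |
| Training | 11–15 | 8.62% | 1–15 | 6.24% |
| Training | 16–20 | 5.19% | 1–20 | 8.11% |
| Training | 21–25 | 4.26% | 1–25 | 7.89% |
| Training | 26–30 | 15.51% | 1–30 | 8.14% |
| Training | 31–35 | 15.12% | 1–35 | 8.08% |
| Training | 36–40 | 14.61% | 1–40 | 7.91% |
| Training | 41–45 | 5.95% | 1–45 | 7.87% |
| Training | 46–50 | 15.51% | 1–50 | 7.79% |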


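Table 6-12B. Tagging errors generated by SPA with learning mechanisms (50 procedures).

| Evaluation Process | Procedure Set | Incremental Rate | Procedure Set | Cumulative Rate |
|---|---|---|---|---|
| Training | 1–5 | 4.01% | 1–5 | 4.01% |
| Training | 6–10 | 5.34% | 1–10 | 5.01% |
| Training | 11–15 | 7.84% | 1–15 | 5.77% |
| Training | 16–20 | 2.62% | 1–20 | 5.97% |
| Training | 21–25 | 3.48% | 1–25 | 3.57% |
| Training | 26–30 | 3.84% | 1–30 | 3.59% |
| Training | 31–35 | 2.55% | 1–35 | 3.47% |
| Training | 36–40 | 5.98% | 1–40 | 3.56% |
| Training | 41–45 | 7.38% | 1–45 | 3.79% |
| Training | 46–50 | 1.18% | 1–50 | 3.70% |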

Table 6-12C. Tagging errors generated by SPA without learning mechanisms (35 procedures).

| Evaluation Process | Procedure Set | Incremental Rate | Procedure Set | Cumulative Rate |
|---|---|---|---|---|
| Training | 46–50 | 15.51% | 1–50 | 7.79% |
| Testing | 51–55 | 1.45% | 1–55 | 7.82% |
| Testing | 56–60 | 8.06% | 1–60 | 7.84% |
| Testing | 61–65 | 8.20% | 1–65 | 8.51% |
| Testing | 66–70 | 10.09% | 1–70 | 8.73% |
| Testing | 71–75 | 9.43% | 1–75 | 8.95% |
| Testing | 76–80 | 6.98% | 1–80 | 9.31% |
| Testing | 81–85 | 15.28% | 1–85 | 9.36% |

Table 6-12D. Tagging errors generated by SPA with learning mechanisms (35 procedures).

| Evaluation Process | Procedure Set | Incremental Rate | Procedure Set | Cumulative Rate |
|---|---|---|---|---|
| Training | 46–50 | 1.18% | 1–50 | 3.70% |
| Testing | 51–55 | 2.07% | 1–55 | 3.73% |
| Testing | 56–60 | 1.29% | 1–60 | 3.45% |
| Testing | 61–65 | 3.07% | 1–65 | 3.41% |
| Testing | 66–70 | 1.11% | 1–70 | 3.16% |
| Testing | 71–75 | 5.75% | 1–75 | 3.41% |
| Testing | 76–80 | 4.02% | 1–80 | 3.48% |
| Testing | 81–85 | 6.44% | 1–85 | 3.62% |


Similar to the analysis of the accuracy of SPA and the parsing error rate, a comparison of the tagging error rates produced by SPA with learning mechanisms and by SPA without learning mechanisms is shown in Figure 6-8.

[Figure 6-8A: line chart. x-axis: Number of Procedures (5 to 50); y-axis: Rate of Tagging Errors (0% to 20%); series: Incremental - with learning, Cumulative - with learning, Incremental - without learning, Cumulative - without learning.]

Figure 6-8A. A comparison of tagging errors of SPA with learning mechanisms and SPA without learning mechanisms for training data.

[Figure 6-8B: line chart. x-axis: Number of Procedures (50 to 85); y-axis: Rate of Tagging Errors (0% to 20%); series: Incremental - with learning, Cumulative - with learning, Incremental - without learning, Cumulative - without learning.]

Figure 6-8B. A comparison of tagging errors of SPA with learning mechanisms and SPA without learning mechanisms for testing data.

There are several reasons for these two categories of errors. One reason is the lack of correct part-of-speech tagging for unknown terms, i.e., the tagged phrases such as the various abbreviations and proper nouns used in the domain are not included in the existing lexicon


(Toutanova & Manning, 2000). Enlarging the lightweight ontology with more comprehensive domain knowledge would reduce the errors that arise for this reason. The second cause of errors is that some phrases admit several possible part-of-speech tags. For example, words such as “steam,” “pump,” and “flow” can be tagged either as nouns or verbs. To distinguish such words/phrases, semantic intuition is required (Baker, 1989). There are no good syntactic cues for the tool to draw on in order to correctly tag such words and phrases. However, human taggers infrequently make such errors, as they can perform a semantic analysis by connecting the word or phrase to its context (Baker, 1989). Errors of this kind could therefore be reduced if more procedures were used for the training process, because the learning mechanism can be trained on human taggers’ results to increase accuracy. For example, the data in Figure 6-7 reveal such an improvement in terms of reducing parsing errors. The cumulative parsing error rate of SPA with learning mechanisms drops from 41% for the first five procedures to 25% across all 85 procedures, whereas the rate of SPA without learning mechanisms falls only to about 34% (Tables 6-11C and 6-11D). From the first 25 procedures onward, it remains below the line of the cumulative rate of SPA without learning mechanisms. In Figure 6-8, the line of the cumulative rates for tagging errors of SPA with learning mechanisms is also under the line for SPA without learning mechanisms, which demonstrates the effectiveness of the learning mechanisms.
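As a reading aid, the reported accuracy, parsing error, and tagging error rates can be checked against one another. The sketch below copies three rows from Tables 6-10, 6-11, and 6-12 and tests whether the three rates add up to 100%. That the three rates partition the tags suggested by SPA is an inference from the reported numbers, not a relationship stated in the text, and the function name is an assumption.

```python
# (label, accuracy %, parsing error %, tagging error %), copied from
# Tables 6-10, 6-11, and 6-12.
ROWS = [
    ("1-5, incremental, both versions", 55.36, 40.63, 4.01),
    ("1-85, cumulative, without learning", 57.01, 33.63, 9.36),
    ("1-85, cumulative, with learning", 71.17, 25.21, 3.62),
]


def residual(accuracy, parsing_error, tagging_error):
    """Percentage points not explained by the three reported rates."""
    return 100.0 - (accuracy + parsing_error + tagging_error)


for label, acc, par, tag in ROWS:
    # A residual near zero means the three rates cover all suggested tags.
    print(f"{label}: residual = {abs(residual(acc, par, tag)):.2f}")
```

For these rows the residual is zero to within rounding, which suggests that every suggested tag was either retained, counted as a parsing error, or counted as a tagging error.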

6.7 Evaluation of Chunking Outcomes Suggested by the Heuristics in Phase 3

For this phase, the evaluation is aimed at understanding the acceptance of chunking outcomes generated by the heuristics in Phase 3 of the SPA methodology.


6.7.1 Evaluation Process

The third phase in the SPA methodology consists of several heuristics for chunking based on event trigger, actor switch, and time span. In this phase, I anticipate that by adopting these heuristics, the chunks suggested by SPA will have higher acceptance rates by users than those suggested by SPA without heuristics. As I argued earlier, it is not possible to derive what may be called the best possible chunks, nor can I expect the chunks generated by SPA to be identical to those that experts might generate. In fact, as I have seen, experts differ in regard to how they chunk action knowledge. This is the rationale for designing the evaluation efforts for Phase 3. I compare the results obtained by applying the heuristic-based chunking process that is part of SPA against a naïve chunking approach that fixes the size of the action knowledge chunks.

Similar to the challenges of the evaluation in Phase 2, the quality of the action knowledge chunks cannot be objectively measured. Instead, I rely on the acceptance of chunks from domain experts as the arbiter of quality. The important metric for evaluating the results from this phase is, therefore, the acceptance of action knowledge chunks.

To carry out this evaluation, two versions of SPA were constructed and deployed that differed in regard to chunking mechanism: one version was designed to implement the full set of chunking heuristics for extracting action knowledge as described above, and the other version was designed to chunk procedures following a naïve approach, i.e., such that the chunks would be generated according to a fixed size. Using the same platform for comparison in this way contributed to the internal validity of the experimental results (Müller-Wienbergen et al., 2011) because it allowed factors related to the software, such as the performance of the tool, to be ruled out.


Evaluation Input

The input to the evaluation effort consisted of 35 authentic procedures contributed by the refineries. The selection of these procedures was dictated by the results of the prior evaluation of the learning mechanisms. As stated in Essay 1, chunking heuristics emphasize the stimuli and interactions between individuals and the environment in terms of three aspects: events that occurred within the environment (Wickens & Hollands, 2000), changes to and timing elements of the environment (Endsley & Rodgers, 1994; Wouters et al., 2008), and communications between multiple actors (Caldwell & Garrett, 2007). These stimuli and interactions are extracted as action components in the instruction pre-processing phase. To evaluate the effectiveness of the chunking heuristics, accurately tagged action components should be provided to ensure that all the stimuli and interactions between individuals and the environment are extracted from the procedures. These stimuli and interactions then trigger the heuristics that generate action knowledge chunks, which are the focus of this evaluation. I decided, therefore, to use the same set of 35 procedures to evaluate the chunking outcomes suggested by the heuristics in Phase 3 of the chunk extraction.

More specifically, the action components of the set of 35 procedures as inspected and saved by the expert operator were used as the inputs for this evaluation.

Evaluation Procedure

The evaluation was conducted as two experiments. In the first experiment, SPA extracted and chunked explicit action knowledge with the heuristics proposed in Chapter 5. In comparison, in the second experiment, SPA processed procedures following a naïve approach, which generated chunks according to a fixed size. A comparison of the user acceptance rates of


outcomes suggested by SPA with these two approaches—the chunking heuristics approach and the naïve approach—reveals the effectiveness of the chunking heuristics.

In the first experiment, the evaluation procedure consisted of the following steps. First, a domain expert was recruited to participate in the evaluation. There are two reasons for selecting only one domain expert (evaluation participant). (1) A large time commitment was required in order to organize the evaluation process and to collect the necessary data on action knowledge chunks extracted from 35 authentic procedures. Scheduling meetings to accommodate multiple operators’ working schedules was challenging. Coordinating the communications and evaluation efforts among multiple operators would also have been difficult. A small group of evaluation participants (even only one) was preferred because such a group could be expected to be more flexible and more applicable to real-life conditions, particularly when participation in experiments or the timing of responses is a primary concern (Byiers et al., 2012). (2) The evaluation by a panel of experts in Section 6.5 discloses that individual differences play an important role in the procedure-chunking process. Operators who differ from each other in regard to experience and/or background may also have differing opinions about the best way to chunk action knowledge. To remove the influence of individual differences, therefore, I decided to use only one domain expert in this evaluation (Connell & Thompson, 1986). Ultimately, then, only one domain expert assumed the task of evaluating the chunking outcomes suggested by the chunking heuristics of SPA.

Second, the domain expert loaded the 35 pre-processed procedures into SPA. More specifically, the database storing the action components from the testing set of 35 procedures, as examined and saved by the user in the prior evaluation, was loaded. For each operating procedure, the domain expert selected chunking heuristics to apply from three aspects: events that occurred within the environment (event trigger), changes to and timing elements of the environment (time span), and communications between multiple actors (actor switch). As SPA


automatically broke the operating procedures into action knowledge chunks, the domain expert examined these chunks and refined the suggestions from SPA. Modifications made by the domain expert included removing chunking boundaries across chunks (in order to delete procedure chunks), moving them up or down (in order to change the procedure chunks), adding new boundaries between instructions (in order to create new procedure chunks), and accepting the chunk boundaries as they are. These modifications were recorded by SPA for later data analysis.
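How the three heuristics might place chunk boundaries can be sketched as follows. This is an illustration under stated assumptions and not SPA's implementation: the `Instruction` fields, the rule that a time-span boundary follows the instruction carrying the time element, and the sample instructions and equipment names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    text: str
    actor: str
    event_trigger: bool = False  # hypothetical flag: step starts on an event
    time_span: bool = False      # hypothetical flag: step carries a wait or duration


def suggest_boundaries(steps, use_event=True, use_time=True, use_actor=True):
    """Return {index: reasons}; a boundary is placed before steps[index]."""
    boundaries = {}
    for i in range(1, len(steps)):
        prev, cur = steps[i - 1], steps[i]
        reasons = []
        if use_event and cur.event_trigger:
            reasons.append("event trigger")
        if use_time and prev.time_span:
            reasons.append("time span")
        if use_actor and cur.actor != prev.actor:
            reasons.append("actor switch")
        if reasons:
            boundaries[i] = reasons
    return boundaries


# Made-up procedure fragment for illustration only.
steps = [
    Instruction("Open valve V-1", "field operator"),
    Instruction("Wait 10 minutes for pressure to settle", "field operator", time_span=True),
    Instruction("Confirm pressure reading", "console operator"),
    Instruction("When the alarm clears, start pump P-2", "console operator", event_trigger=True),
]
print(suggest_boundaries(steps))
```

A single boundary can carry more than one reason, which corresponds to the overlaps between heuristics that the evaluation reports.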

In the second experiment, the same evaluation procedure was conducted with the use of SPA with a naïve approach to action knowledge chunking. In this experiment, SPA with a naïve approach chunked action knowledge at every four lines of instruction. There were two reasons for selecting four lines as the fixed size. First, an individual’s working memory can usually span from three to seven elements of action (Turner & Engle, 1989). I, therefore, planned to select a number between three and seven as the fixed size of chunks. Second, the average size of the action knowledge chunks is 4.4 instructions per chunk according to the expert panel, which is close to 4. Therefore, I defined the number for the fixed chunking size as four for the naïve approach.
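The naïve approach amounts to slicing each procedure into consecutive groups of four instructions. A minimal sketch, in which the function name and the ten-line sample procedure are assumptions for illustration:

```python
def fixed_size_chunks(instructions, size=4):
    """Split a procedure into consecutive chunks of at most `size` instructions."""
    return [instructions[i:i + size] for i in range(0, len(instructions), size)]


procedure = [f"step {n}" for n in range(1, 11)]  # hypothetical 10-line procedure
chunks = fixed_size_chunks(procedure)
print([len(chunk) for chunk in chunks])  # [4, 4, 2]
```

The final chunk of a procedure is shorter whenever the number of instructions is not a multiple of four, so the average chunk size under this approach can fall below four.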

6.7.2 Analysis of Errors

The outcomes of the chunking heuristics in Phase 3 in this evaluation were measured as the domain expert’s acceptance rate of the chunks suggested by SPA. As mentioned in Section 6.4, operators with various backgrounds, in different situations, may generate different action knowledge chunks from the same procedure. Even for the same procedure, therefore, operators may differ in regard to what they think is the best way to chunk instructions. Consequently, this evaluation was concerned with the acceptance of the action knowledge chunks suggested by SPA rather than with the accuracy of the chunking suggestions. Given that the chunking heuristics of SPA suggest chunk boundaries, which divide entire procedures into action knowledge chunks, the


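chunk boundary was used for the acceptance rate measurement. The formula used to calculate the acceptance rate is as follows (Manning & Schütze, 1999):

$$\text{Acceptance Rate} = \frac{\text{Number of accepted chunk boundaries}}{\text{Total number of chunk boundaries suggested by SPA}} \times 100\% \tag{6}$$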

For each procedure, I defined the “accepted chunk boundary” as referring to the chunk boundary suggested by SPA and then confirmed by the domain expert without any modifications.
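The acceptance measure can be sketched in code under the definition just given. Representing a boundary by the number of the instruction line it precedes is an assumption made for this example, as are the function name and the sample boundary sets.

```python
def acceptance_rate(suggested, final):
    """Percentage of SPA-suggested boundaries the expert left untouched."""
    suggested, final = set(suggested), set(final)
    if not suggested:
        raise ValueError("no boundaries were suggested")
    accepted = suggested & final  # kept exactly where SPA placed them
    return 100.0 * len(accepted) / len(suggested)


# Hypothetical procedure: each boundary is named by the line it precedes.
spa_suggested = {3, 6, 9, 14}
after_expert = {3, 7, 9, 14, 18}  # boundary 6 was moved to 7; 18 was added
print(round(acceptance_rate(spa_suggested, after_expert), 1))  # 75.0
```

Boundaries that the expert moved or removed lower the rate, while boundaries the expert added do not enter the calculation because they were never suggested by SPA.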

Comparisons of chunk boundaries from the two experiments, one suggested by SPA with chunking heuristics, and the other suggested by the naïve approach with fixed size of chunks, are reported in Table 6-13.

Table 6-13. Outcomes suggested by SPA with chunking heuristics and by SPA with the naïve approach.

| | Chunking Heuristics | Naïve Approach |
|---|---|---|
| Total number of chunks | 219 | 189 |
| Av. size of chunk (# of lines) | 3.2 | 3.7 |
| Max. size of chunk (# of lines) | 20 | 4 |
| Min. size of chunk (# of lines) | 1 | 1 |

For the total of 35 authentic procedures with 700 lines of operation instructions processed in this evaluation, 219 action knowledge chunks were extracted from the procedures by SPA with chunking heuristics. The average size of the chunks suggested by the heuristics is 3.2 instructions, and the size of the chunks ranges from 1 to 20 instructions. In comparison, 189 action knowledge chunks were extracted from the same set of 35 procedures by SPA with the naïve approach. The average size is 3.7 instructions per chunk. As the naïve approach generated one action knowledge chunk for every four instructions, the maximum size of the chunks is 4 and the minimum size is 1.


Then the action knowledge chunks suggested by SPA and the later modification made by the domain expert were analyzed for each experiment. Following the above formula, the acceptance rates of the chunks suggested by the two versions of SPA are calculated and reported in Table 6-14. The results show that the action knowledge chunks suggested by the chunking heuristics (54.4%) have a higher acceptance rate by operators than those suggested by the naïve approach (44.2%).

Table 6-14. Acceptance rates of chunks suggested by SPA with chunking heuristics and by SPA using the naïve approach.

| | Chunking Heuristics | Naïve Approach |
|---|---|---|
| Accepted chunk boundaries | 100 | 68 |
| Acceptance rate | 54.4% | 44.2% |

Three chunking heuristics—event trigger, time span, and actor switch—were implemented during SPA development. Although the chunking heuristics suggested action knowledge chunks with a higher user acceptance rate than those suggested by the naïve approach, the performance of each chunking heuristic may vary. To understand the effectiveness of each single chunking heuristic, following the comparison between outcomes suggested by SPA with the chunking heuristics and by SPA with the naïve approach, the performance of each chunking heuristic is analyzed and reported in Table 6-15.


Table 6-15. Outcomes suggested by SPA with chunking heuristics.

| Action Knowledge Chunk Boundary | Suggested by SPA | Accepted by Subjects | Acceptance Rate |
|---|---|---|---|
| Total number | 184 | 100 | 54.4% |
| (Suggested by) event trigger | 34 | 34 | 100% |
| Time span | 69 | 35 | 50.7% |
| Actor switch | 109 | 47 | 43.1% |
| Event trigger & actor switch | 0 | 0 | - |
| Event trigger & time span | 4 | 4 | 100% |
| Actor switch & time span | 24 | 12 | 50% |

One hundred and eighty-four boundaries of action knowledge chunks were suggested by SPA with chunking heuristics, among which 34 were suggested by the event-trigger heuristic, 69 by the time-span heuristic, and 109 by the actor-switch heuristic. Overlaps existed because certain instructions triggered multiple chunking heuristics. After an examination by the domain expert, 100 boundaries suggested by SPA were confirmed without any modification. Then the acceptance rate for each single chunking heuristic was calculated. The results show that action knowledge chunks suggested by either the event-trigger or the time-span heuristic have higher acceptance rates (100% and 50.7% respectively) than the chunks suggested by SPA with the naïve approach (44.2%). However, the acceptance rate of the chunks suggested by the actor-switch heuristic (43.1%) is lower than that of the naïve approach (44.2%). The event-trigger and time-span heuristics are therefore more effective than the naïve approach, which extracts action knowledge chunks with a fixed size. However, the effectiveness of the actor-switch heuristic cannot be confirmed in this evaluation.
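The per-heuristic rates and the overlap between heuristics can be checked with a short Python sketch. The counts are transcribed from Table 6-15; the variable and function names are illustrative, and the sketch assumes that no boundary was triggered by all three heuristics at once, since the table lists no such row.

```python
# Boundary counts from Table 6-15: (suggested by SPA, accepted by the expert).
per_heuristic = {
    "event trigger": (34, 34),
    "time span": (69, 35),
    "actor switch": (109, 47),
}
# Boundaries triggered by two heuristics at once (also from Table 6-15).
overlaps = {
    ("event trigger", "actor switch"): (0, 0),
    ("event trigger", "time span"): (4, 4),
    ("actor switch", "time span"): (24, 12),
}


def acceptance_rate(suggested, accepted):
    """Accepted boundaries as a percentage of suggested boundaries."""
    return 100.0 * accepted / suggested


for name, (suggested, accepted) in per_heuristic.items():
    print(f"{name}: {acceptance_rate(suggested, accepted):.1f}%")

# Inclusion-exclusion: a doubly triggered boundary is counted once per heuristic
# in `per_heuristic`, so it is subtracted once to obtain distinct boundaries.
total_suggested = sum(s for s, _ in per_heuristic.values()) - sum(s for s, _ in overlaps.values())
total_accepted = sum(a for _, a in per_heuristic.values()) - sum(a for _, a in overlaps.values())
print(total_suggested, total_accepted)
```

Subtracting the doubly triggered boundaries from the per-heuristic sums recovers the 184 suggested and 100 accepted boundaries reported in the first row of the table.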


6.7.3 Comparing Naïve-approach Chunks and Heuristic-suggested Chunks

After the data analysis of the user acceptance rates, an example showing how action knowledge chunks were extracted from an operating procedure is presented in Table 6-16. The example compares the chunking outcomes generated by SPA with chunking heuristics, those generated by SPA with the naïve approach, and the final action knowledge chunks examined and modified by the domain expert. In this table, the line numbers and detailed instructions are listed in the first two columns. The chunk boundaries suggested by SPA with chunking heuristics are presented in the third column, together with the chunking heuristics triggered to create them. The fourth column displays the chunking outcomes suggested by SPA with the naïve approach. In this experiment, chunk boundaries were created for every four lines of instructions under the naïve approach. In the last column, the chunk boundaries examined and modified by the domain expert are listed together with the chunk names input by the expert.


Table 6-16. Example of chunking procedure by SPA with chunking heuristics and by SPA with the naïve approach.

| Line | Instruction(s) | With Chunking Heuristics | With Naïve Approach | Examined by Domain Expert |
|---|---|---|---|---|
| 1 | CO* 1. Notify Zone B, Zone C, and Zone E that Instrument Air pressure is decreasing. | Event trigger | | Notify other units |
| 2 | Zone B, Zone C, and\or Zone E personnel will start\load their Instrument Air | | | |
| 3 | ZN*\CO 2. As Instrument Air pressure begins to decrease, open nitrogen cut-ins when pressure reaches approximately 65 psig. | Actor switch | Chunk 1 | N2 to instrument air |
| 4 | ZN\CO 3. Verify the Reactor Charge Heater H-901 fuel gas emergency shut-off valve XV-9261A and fuel gas control valves FV-9028A\B are closed | | | Secure charge heater |
| 5 | ZN 4. Block in Reactor Charge Heater H-901 fuel gas control valve FV-9028 A\B and burner block valves. | | | |
| 6 | ZN\CO 5. Verify the Fractionator Reboiler H-902 fuel gas emergency shut-off valve XV-9508A and fuel gas control valve FV-9425 are closed. | | Chunk 2 | Secure fractionator reboiler |
| 7 | ZN 6. Block in Fractionator Reboiler H-902 Fuel gas control valve FV-9425 and burner block valves. | Actor switch & Time span | | |
| 8 | ZN 7. Shutdown Reactor Charge Pumps P-901 A\B. | | | |
| 9 | ZN 8. Block in discharge of Reactor Charge Pumps P-901 A\B. | | | Secure 900 unit charge |
| 10 | ZN 9. Shutdown Recycle\Makeup Compressor C-901 A\B. | | | |
| 11 | CO 10. Monitor reactor bed temperatures and if they start to increase, open the low rate depressing valve, HV-9045, with the hand controller. | Time span | Chunk 3 | Prevent runaway |
| 12 | CO 11. Monitor reactor bed temperatures and if they increase 50 F above normal operating temperature open the high rate depressing valve, HV-904. | Actor switch | | |
| 13 | ZN 12. Shutdown Wash Water Pumps P-902 A\B. | | | |
| 14 | ZN 13. Block in discharge of Wash Water Pumps P-902 A\B | | | |
| 15 | ZN 14. Verify HP Amine Absorber T-901 lean amine flow control valve FV-9023A is closed. | Time span | Chunk 4 | Bottle in 900 Unit |
| 16 | ZN 15. Block in HP Amine Absorber T-901 lean amine flow control valve FV-9023A | | | |
| 17 | ZN 16. Shutdown the Fractionator Overhead Pumps P-905 A\B when the level starts to drop in the Fractionator Overhead Accumulator D-909. | Time span | Chunk 5 | Secure 900 Unit |
| 18 | ZN 17. Shut down C-902 A\B. Sour gas Compressor. | | | |
| 19 | ZN 18. Shut down the Fractionator Pump Around pumps. P-908 A\B | | | |
| 20 | ZN 19. Shutdown the Fractionator Bottoms Pumps P-907 A\B when level starts to drop in the Fractionator T-903. | | | |
| 21 | ZN 20. Shut down Burner Fuel Product Pumps P-906 A\B. | | Chunk 6 | |

* CO, ZN refer to two different positions of operators. CO: console operator; ZN: field operator

In the third column of Table 6-16, of the seven chunk boundaries suggested by SPA with chunking heuristics, four were examined and saved by the domain expert (acceptance rate = 57.1%). Only one chunk boundary was suggested by the event-trigger heuristic, and this was accepted by the expert (acceptance rate = 100%). Four chunk boundaries were suggested by the time-span heuristic, of which two were accepted by the expert (acceptance rate = 50.0%). In addition, three chunk boundaries were suggested by the actor-switch heuristic, of which two were accepted by the expert (acceptance rate = 66.7%). In comparison, in the fourth column of Table 6-16, two of the five chunk boundaries suggested by SPA with the 4-line naïve approach were accepted by the domain expert (acceptance rate = 40%). In this example, the acceptance rate of each chunking heuristic was higher than that of the naïve approach, which suggests chunks with a fixed size of four lines.

Table 6-17. Outcome of example procedure.

| | Chunking Heuristics | Naïve Approach |
|---|---|---|
| Total number of chunks | 8 | 6 |
| Av. size of chunk (# of lines) | 2.6 | 3.5 |
| Max. size of chunk (# of lines) | 6 | 4 |
| Min. size of chunk (# of lines) | 1 | 1 |

Table 6-18. Acceptance rates of chunks extracted from the example procedure.

| | Chunking Heuristics | Naïve Approach |
|---|---|---|
| Accepted chunk boundaries | 4 | 2 |
| Acceptance rate | 57.1% | 40.0% |

Table 6-19. Acceptance rate of each chunking heuristic for the example procedure.

| Action Knowledge Chunk Boundary | Suggested by SPA | Accepted by Subjects | Acceptance Rate |
|---|---|---|---|
| Total number | 7 | 4 | 57.1% |
| (Suggested by) event trigger | 1 | 1 | 100% |
| Time span | 4 | 2 | 50.0% |
| Actor switch | 3 | 2 | 66.7% |
| Event trigger & actor switch | 0 | 0 | - |
| Event trigger & time span | 0 | 0 | - |
| Actor switch & time span | 1 | 1 | 100% |

Let us briefly discuss the details of the processing procedure and the chunking outcomes.

First, the action knowledge chunks created by the naïve approach without the rationales of the chunking heuristics were not accepted by the user (the domain expert). For example, Chunk 2 and Chunk 3 suggested by SPA with the naïve approach were separated between Instructions 8 and 9 to ensure that each chunk contained four lines of instruction. However, no event trigger occurred in the environment, nor was there a change in operation timing, nor a switch of operators in these two operations. No chunking heuristic was triggered to create a chunk boundary between Instructions 8 and 9. Furthermore, in these instructions, both the “shutdown” operation and the “block in” operation were implemented on Pumps P-901 A\B to “Secure 900 unit charge” (the chunk name given by the domain expert). Therefore, the domain expert considered extracting an action knowledge chunk containing the two operations in Instructions 8 and 9, which is consistent with the suggestion from SPA with chunking heuristics but not with SPA with the naïve approach.

Second, users prefer to accept action knowledge chunks created by chunking heuristics over those suggested by the naïve approach. The second chunking example is the chunk boundary between Instructions 10 and 11. The operator who is to implement the operation switches from ZN (field operator) to CO (console operator), which triggered the actor-switch heuristic. In addition, the condition of “monitor … if they start to increase” indicates a period of time between the current and follow-up operations, which triggered the time-span heuristic. Both heuristics suggest a chunk boundary between Instructions 10 and 11. The domain expert accepted this suggestion. However, it should be noted that at this point there is no suggestion from SPA with the naïve approach.
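How the actor-switch and time-span heuristics could flag such a boundary is sketched below in Python for lines 10 to 13 of the example procedure in Table 6-16. This is an illustration under stated assumptions, not SPA's implementation: the cue-word list, the function name `suggest_boundaries`, and the rule of inspecting the instruction that follows a candidate boundary are all hypothetical, and the event-trigger heuristic is left out.

```python
import re

# Lines 10-13 of the example procedure: (line number, operator, instruction text).
lines = [
    (10, "ZN", "Shutdown Recycle\\Makeup Compressor C-901 A\\B."),
    (11, "CO", "Monitor reactor bed temperatures and if they start to increase, "
               "open the low rate depressing valve, HV-9045, with the hand controller."),
    (12, "CO", "Monitor reactor bed temperatures and if they increase 50 F above normal "
               "operating temperature open the high rate depressing valve, HV-904."),
    (13, "ZN", "Shutdown Wash Water Pumps P-902 A\\B."),
]

# Hypothetical cue words hinting at a waiting period; SPA's actual rules are not given here.
TIME_SPAN_CUES = re.compile(r"\b(monitor|when|until|if)\b", re.IGNORECASE)


def suggest_boundaries(lines):
    """Map (previous line, next line) to the heuristics that suggest a boundary there."""
    suggestions = {}
    for (prev_no, prev_actor, _), (next_no, next_actor, text) in zip(lines, lines[1:]):
        triggered = []
        if prev_actor != next_actor:       # a different operator takes over
            triggered.append("actor switch")
        if TIME_SPAN_CUES.search(text):    # the next step waits on a condition
            triggered.append("time span")
        if triggered:
            suggestions[(prev_no, next_no)] = triggered
    return suggestions


boundaries = suggest_boundaries(lines)
for pair, heuristics in boundaries.items():
    print(pair, heuristics)
```

For these four lines, the sketch flags the boundary between lines 10 and 11 under both heuristics, which agrees with the boundary described above. A keyword list of this kind is crude and would over-trigger on real procedures, so it should be read only as a way of making the two rationales concrete.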

In summary, this procedure chunking example confirms my prior conclusions. First, neither the chunking heuristics nor the naïve approach is perfect for the purpose of procedure chunking. Users must modify the chunk boundaries suggested by SPA with either approach. However, the chunking heuristics provide better solutions (with a higher user-acceptance rate) than those suggested by the naïve approach. SPA with chunking heuristics is more effective than without these heuristics. Second, the chunking heuristics accord with chunking rationales. Both the evaluation data and the chunking example reveal that users are more inclined to accept action knowledge chunks suggested by the event-trigger heuristic than those suggested by either the time-span or the actor-switch heuristic.

6.8 Discussions and Conclusions

In this essay, I reported evaluation efforts and outcomes following three distinct directions to assess the effectiveness of SPA, an approach I designed to extract and chunk explicit action knowledge from procedures. I first demonstrated how the approach works with a descriptive illustration. Next, I conducted an assessment with a small expert panel to collect feedback about the usefulness of SPA. Finally, I evaluated both the effectiveness of the learning mechanisms as a part of Phase 2 and the contributions of heuristics to Phase 3. Taken together, these evaluation efforts not only demonstrate that SPA can help extract and chunk action knowledge but also verify that the learning mechanisms can improve the accuracy of tagging in SPA and that the heuristics contribute to creating appropriate action knowledge chunks.

I acknowledge that there are still some limitations to the evaluation efforts I have reported. For example, the evaluation of learning mechanisms for Phase 2 could be criticized for using training and testing procedures from the same petrochemical site. The evaluation of the heuristics in Phase 3 could be criticized for relying on a single expert. Despite these limitations, the combined evaluation outcomes provide an indication of the potential of this approach for managing codified action knowledge in the petrochemical industry.


Chapter 7

Essay 3 – Operator Strategies to Use Action Knowledge in Support of Tasks

Chapter Organization

This essay describes the derivation of a tentative theory about the strategies operators use to combine tacit and explicit action knowledge. I explore how operators use the codified procedures and other cues to guide their behaviors. The data is gathered via think aloud protocols, which are analyzed via iterative cycles following a grounded theory method. The work in this essay provides a counterbalance to the investigation of explicit action knowledge in the first two essays. The essay is, thus, motivated by a desire to understand how action knowledge is manifested in operator behaviors, as opposed to how it has been codified in operator procedures.

The tentative theory I develop is intended to serve as a platform for future work such as comparison with models of work practice codified in procedures. Like the first two essays, this essay is also developed for independent reading, and therefore, utilizes (and reframes) some of the material from the early parts of the dissertation within this essay. The next section, Précis, serves as the abstract for the Essay and a brief outline of other sections that follow.

7.0 Précis

In this study, I investigate operators’ work practices as a counterbalance to analyzing codified action knowledge in operator procedures. The research objective is shown in Figure 7-1. I develop a tentative theory of the strategies operators use as they engage in their daily work in processing industries. This tentative theory is aimed at explicating: (1) what action knowledge operators use to guide their behaviors (actions); (2) what strategies operators use to select and implement operation behaviors as they carry out their tasks; and (3) what factors impact operators’ adoption of strategies for using action knowledge. These provide important additions to the model of operator action codified in operator procedures. To develop the theory, I follow a grounded theory approach. The data is collected from a think-aloud protocol following a critical incident scenario. The analysis suggests that in processing industries, operator behaviors are guided by different forms of action knowledge. I describe potential implications of the findings in terms of possible extensions to codified action knowledge in operator procedures, novel representation schemes for action knowledge, and training programs for operators in processing industries.

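[Figure: two-by-two matrix of action knowledge (explicit vs. tacit; organizational vs. individual). Examples shown: procedures and team work practices at the organizational level; personal memos and operator expertise at the individual level.]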

Figure 7-1. Research objective for Essay 3.

7.1 Introduction and Motivation

Action knowledge is defined as knowledge that enables operators to select action(s) for changing the current state to a goal state (Bera & Wand, 2009). It can be codified within procedures in an organization, as explicit knowledge, or possessed as expertise by experts within an organization, as tacit knowledge. Within organizations in the process industry, the latter form of action knowledge often serves as a supplement that completes the action knowledge codified in procedures. This tacit action knowledge is difficult to hand down across operators via written procedures. Instead, operators learn tacit action knowledge by reflecting on the tasks they perform or by observing other (often more experienced) operators as they perform different tasks. Because it is difficult to hand down, understanding the nature of expertise (in this case, tacit action knowledge) is an enduring problem in processing industries (Chi, Glaser, & Farr, 1988). More specifically, in the petrochemical industry, this is a pressing concern because of a looming wave of retirement that is expected to reduce the ranks of operators by as much as 20% in the next several years (Strahan, 2005). Expertise in the form of tacit action knowledge will be lost as these expert operators retire. To address these concerns, this study aims at investigating the nature of operator expertise, that is, tacit action knowledge, following an empirical research strategy. I recognize, however, that operators do not use this tacit knowledge in the absence of operator procedures (explicit knowledge) that are often part of the organizational mandate. As a result, I address the following research question: What strategies do operators use to apply action knowledge, both tacit and explicit, as they engage in the performance of tasks?

The goal of the work outlined in this essay is, thus, to identify how operators use action knowledge during the performance of tasks. I develop this as a number of core categories and sub-core categories of action strategies that describe the process of using action knowledge that guides operator behaviors. The discovery is based on an analysis of data obtained from operators (Strauss & Corbin, 1990). The findings are mapped against prior research (Bera & Wand, 2009; Caldwell & Garrett, 2007). These comparisons validate and expand our understanding of operator strategies as they carry out their work. The overall outcome of this study is, therefore, an empirically grounded, yet nascent theory of how operators use action knowledge to carry out certain behaviors, which in turn allows them to perform tasks.

I outline the work completed in the remaining sections. Section 2 describes the research setting for the study and the rationale. In Section 3, I review prior work on action knowledge and operator behaviors. Section 4 introduces the grounded theory research method and the research instrument used for data gathering, which combines the critical incident technique and the think-aloud method. I describe the iterative data analysis and theory generation effort in Section 5. The findings are discussed along with a mapping to prior research in Section 6. Section 7 summarizes the contributions and limitations of this study.

7.2 Research Setting

Processing industries are characterized by operations that are run either as continuous or batch processes involving material transformations. Examples of the processing industries include food, beverages, chemicals, petrochemicals, coal, and paper products. Firms in process industries are concerned with processing resources with formulas and recipes into other products (Makins, 1991). In processing industries, the relevant factors are ingredients, not parts; formulas, not bills of materials; and bulk operation processes, not individual units [11]. Once an output is produced by such a continuous bulk process, it cannot be distilled back to its basic components.

11 http://en.wikipedia.org/wiki/Process_manufacturing

In processing industries, human operators play a critical role. They monitor operations, react to emergencies and failures, optimize processes and recovery, and coordinate maintenance and repair tasks (Schragenheim et al., 1994). Operators perform these tasks every day, around the clock, coordinating their work across multiple shifts (Isermann, 2006). During their tenure at processing industries, operators acquire and cultivate expertise that is difficult for newcomers to imitate or acquire in a short time (Strahan, 2005). Without this expertise, operations can suffer and processes can derail, leading to accidents that cause significant damage to assets and, worse, to human life. In recent years, a number of technological advances have been made and automation systems introduced that have significantly changed the role of the operator (Ernst & Lundvall, 2004). These high-tech production processes (as well as the increase in government oversight and regulation) have forced operators to acquire more knowledge and skills, and they have prompted firms in process industries to document and codify practices to make operators less error-prone and more effective (Grote, 2008; Rowe, 2009). The phrase “action knowledge” refers to the knowledge and skills that these operators possess and exercise to ensure that the plants in a process industry are run effectively and without any serious problems.

Figures 7-2 and 7-3 give the reader a sense of the research setting.

Figure 7-2. Field and console operators in the refineries.


Figure 7-3. The setting for the study: a petrochemical refinery.

The specific industry for this study is the petrochemical industry. In this industry, operations are focused on continuous processing of crude oil that includes extraction and refinement. The scale of the problem can be appreciated by considering the number of large refineries. Within the continental U.S., there are about 140 refineries, each with about 250 operators. The selection of the petrochemical industry as the research setting is based on the following two reasons.

First, failures in the petrochemical industry can cause catastrophic damage. Over the past few decades, although both technology and government regulation have been developed to improve safety, major industrial accidents in the petrochemical industry are as likely today as they were 10 years ago. For example, since 1998, there has been no reduction in the fatality rate or major accident rate in U.S. or European industry (Wolf, 2001).

Besides, the most dangerous potential consequences of petrochemical industry accidents are environmental pollution and harm to innocent bystanders, such as people living close to a petrochemical facility. As the industry grows, the number and size of facilities, the toxicity of the materials, and the number of people living or traveling nearby also increase (Perrow, 1999).

Second, operator error is a major concern in the petrochemical industry. According to accident data for the petrochemical industry for 1976–2006, operator error is the second largest cause of loss in this industry (Nivolianitou et al., 2006). Even worse, the high rate of operator turnover has become a critical issue for petrochemical refineries. The number of operators in the petrochemical industry is predicted to decrease with a significant wave of retirements in the next few years, resulting in the loss of 20% of the operators, effectively clearing the ranks of expert operators.

Understanding how operators use action knowledge, both tacit and explicit, as they perform their tasks in the petrochemical industry is, therefore, an extremely urgent concern. A better understanding may provide avenues for extending existing codification of tacit knowledge, point to ways for better training of operators, and suggest possibilities for organizational practices that lead to more effective use of action knowledge by the operators, ultimately providing benefits in terms of a reduction in accidents in the petrochemical industry.

7.3 Prior Research

I review related prior research in this section. Two streams of research are pertinent to this research – studies about knowledge, more specifically action knowledge, as the guidance to support operator behaviors; and studies of operator behaviors and analyses of human errors. The research gaps discovered by reviewing this research allow framing the research concerns outlined earlier.

7.3.1 Action Knowledge

Action knowledge for the purpose of this research is defined as the knowledge that enables operators to select action(s) for changing the current state to a goal state (Bera & Wand, 2009). It suggests an emphasis that is different from the truth-value of knowledge (Blosch, 2001). Action knowledge contains any rules, prescriptions, and propositions that provide the ability to choose and execute a course of action. It could be either explicit or tacit (Nonaka, 1994).

The distinction between explicit and tacit knowledge is an important precursor to understanding efforts that may be undertaken for managing action knowledge. Explicit knowledge is knowledge that has been externalized by articulating, coding and communicating (Hasher & Zacks, 1984) in the form of data, scientific formulas, specifications, or manuals. In this context, examples of explicit action knowledge include well-defined standard operating procedures, which are widely used in processing industries. However, human beings have found that it is sometimes difficult or even impossible to articulate exactly what they know or how to put it into practice. Such knowledge that is hard to articulate is described as tacit knowledge. It includes knowledge that is difficult to transfer to another person in written or verbal form (Hasher & Zacks, 1984); instead, it may be acquired via experience, routines, values or emotions (Schon, 1987). Although tacit knowledge is difficult to articulate, scholars have found that it can be shared under some conditions through efforts such as interactive conversations and storytelling (Zack, 1999). An example of tacit action knowledge in the petrochemical industry is that operators can determine the hydrocarbon progress based on the smell in the air, and then select the follow-up operations.

One common misunderstanding in previous studies is that researchers equate the terms “procedural knowledge” (knowledge that is embedded in procedures) and “action knowledge” and use them interchangeably. In fact, there are important differences between them. The concept of procedural knowledge considers the expression of “know-how” knowledge by emphasizing its format as procedures or condition-action rules (Anderson, 1976, 2009; Corbett & Anderson, 1994). This definition makes “know-how” a sub-set of explicit knowledge. However, this definition is contradicted by Nickols’s integration of explicit vs. tacit knowledge with declarative vs. procedural knowledge, which considers that all “know-how” is tacit and all “know-what” is explicit (Nickols, 2000). Neither Anderson’s view (of connecting “know-how” knowledge with the classification of tacit and explicit knowledge) nor Nickols’s definition is used for the purposes of this research. “Know-how” knowledge can be represented and codified as condition-action rules. In a significant proportion of AI research, condition-action rules (explicit knowledge) are used to govern the actions of robots (Georgeff & Lansky, 1986). On the other hand, “know-how” knowledge may also be embedded in an individual’s practices as a routine or ability without articulation, such as the ability to ride a bike (tacit knowledge) (Eraut, 2000). Because of this misunderstanding, although there are numerous previous studies in this research stream, most focus on the explicit form of action knowledge, namely “procedural knowledge.” Such studies include developing regulations for clinical diagnosis processes, creating behavior rule systems in robot design (Moore, 1977, 1985), and so on. Prior research has seldom investigated action knowledge from a comprehensive perspective including both explicit and tacit action knowledge. This is the focus of my research.

7.3.2 Operator Behaviors and Human Errors

Much prior research related to use of tacit knowledge has been carried out under the umbrella of ‘understanding operator behaviors’ and ‘characterizing human errors’ that may result as a consequence of inappropriate or incorrect behaviors. In particular, operator behaviors can be understood at three levels, depending upon the level of mental processing controlling the behaviors (Rasmussen, 1986). The first level, skill-based behavior, includes those actions which have become "automatic". The second level, rule-based behavior, occurs when one is consciously attempting to reach a goal or to solve a problem. It refers to behaviors where the operator uses a rule, an "if-then" statement, to decide upon the appropriate action. The third level of operator behavior, knowledge-based behavior, is most familiar in a problem solving or troubleshooting setting. In this case, the learned or available rules or routines are not sufficient to specify what to do next, and the operator must rely on his knowledge and understanding of the system to select an appropriate action.

Human error is a deviation from intention, expectation or desirability. It is the most common explanation for organizational and technological breakdowns in the process industry (Tasca, 1989). During the busy and stressful periods of a mission, individuals are placed under immense pressure to achieve goals and resolve unexpected problems; therefore, it is no great surprise that, as humans, they sometimes make mistakes (Olla, 2006). For instance, oil industry representatives and regulators frequently claim that 80 percent of significant marine accidents are caused by human error (Tasca, 1989). Concern about human error also exists in the chemical industry (Kletz, 2001). Human error explains socio-technical failure by holding that individuals are incompetent, poorly trained, confused, or do not follow the rules (Clarke & Short, 1993). It may occur at each level of operator behavior. At the first level (skill-based), operator behavior can be deficient (leading to slips) although the plan is satisfactory, while at the second and third levels (rule-based and knowledge-based), operator behavior can go as planned, but the plan can be inadequate (leading to mistakes) (Reason, 1990). Therefore, it may be argued that the first type of human error, the slip, is caused by inappropriate use of action knowledge, and the second type, the mistake, is caused by inadequate or inappropriate action knowledge.

An important factor for recovering from and avoiding repetition of human errors involves managing organizational knowledge (Olla, 2006). Prior research reveals that human error may be caused by absent or insufficient tacit knowledge in organizations (Grant & Gregory, 1997); that is, effective management of tacit knowledge (including skills, experiences, competences, and attitudes of individuals) has the potential to decrease human errors (Madsen & Mikkelsen, 2012). A number of knowledge management technologies and approaches are designed and implemented to capture and codify knowledge from individuals, teams, and organizations or at least surface this as tacit knowledge (Bharathy, 2006; Cai, 2008; Olla, 2006). However, how individuals use this knowledge, both tacit and explicit (codified in artifacts such as procedures), to support their behaviors and reduce errors is still unknown. This missing link is what I explore in this study.

The above review of prior work related to action knowledge and human behaviors points to two critical concerns that I address in this study. First, no prior study investigates action knowledge in both its tacit and explicit forms. Second, prior studies on human behaviors do not reveal how these forms of knowledge are actually used by individuals when they search for solutions and implement actions in service of their daily tasks. Therefore, in this study, the focus of my investigation is on uncovering the strategies that operators use and on understanding how these strategies help them use action knowledge and guide their behaviors. I do this by analyzing operator work practice (the research gap is shown in Figure 7-4).


[Figure: the operator uses tacit action knowledge and explicit action knowledge (codified procedures). The strategies for using action knowledge are marked as unknown ("?"). These lead to operator behaviors to carry out day-to-day tasks (e.g., cognitive: monitor pump pressure; physical: close water valve) and then to outcomes: correct performance of tasks (e.g., the plant is secure) or human errors (e.g., pressure is too high caused by a pump closed).]

Figure 7-4. Research gaps in prior studies.

7.4 Research Methodology

7.4.1 Rationale

To address the concerns described above, this research follows the grounded theory development method (Eisenhardt, 1989; Yin, 2009). The grounded theory method is “a qualitative research method that uses a systematic set of procedures to develop an inductively derived theory about the phenomenon” (Strauss & Corbin, 1990). The benefit of the grounded theory approach is that the resulting theory is intimately tied to the evidence (Eisenhardt, 1989).

The grounded theory method is especially useful when little is known about a topic and few adequate theories exist to explain or predict a group of behaviors (Munhall, 2007). It also has the advantage of facilitating the discovery of unknown or unexpected patterns of behavior (Ha et al., 2007). Therefore, the lack of prior theoretical work and the novelty of the research domain encouraged me to use the grounded theory methodology.

7.4.2 Approaches to Conduct Grounded Theory Development

A number of specific approaches are available to carry out a research project aimed at grounded theory development. Although all are concerned with obtaining facts from data and then discovering theory, these different “flavors” of grounded theory are suitable for researchers with different beliefs and satisfy various research requirements12.

The first is the “classic” grounded theory originated by Glaser and Strauss (Glaser, 1978). The foundation of this grounded theory is that knowledge development begins with knowledge generation rather than knowledge verification. The process is to look systematically at the whole substantive area, and it is respectful of the timelessness to generate new theory. The second type of grounded theory is rooted in the Strauss and Corbin approach, which also provides intricate details about specific research techniques and procedures (Strauss & Corbin, 1990). By following its axial coding model, the conditions and dimensions of a situation are investigated during studies. Therefore, the emphasis of the second type of grounded theory is not only on a theory that explains what is meaningful to the participants managing a problem, but also on the research process that carefully guides the researchers. The two types of grounded theory are summarized and compared in Table 7-1 (Heath & Cowley, 2004).

12 Antoinette McCallin, December 2009: http://www.groundedtheoryonline.com/what-is-grounded-theory/classic-grounded-theory

Table 7-1. Two types of grounded theory.

Aspect | “Classic” grounded theory (Glaser, 1978) | Strauss and Corbin’s approach (Strauss & Corbin, 1990)
Generalization of theory | Wide | Partial and conditional
Scope of theory | Parsimony, scope, and modifiability | Detailed and dense process fully described
Format of theory | Concepts or themes | Structured concepts
Development process dependent | No | Yes
Initial coding | Substantive coding – data dependent | Open coding – use of analytic technique
Intermediate phase | Continuous with previous phase; comparisons, with focus on data, become more abstract, categories refitted, emerging frameworks | Axial coding – reduction and clustering of categories (paradigm model)
Final development | Theoretical coding – refitting and refinement of categories that integrate around emerging core | Selective coding – detailed development of core, integration of categories

In this research, I adopt Strauss and Corbin’s approach to grounded theory (Strauss & Corbin, 1990). This selection is based on the following reasons. First, Strauss and Corbin’s approach is oriented towards building full descriptions at an individual level of enquiry (Lehmann, 2010). This is preferred over Glaser’s grounded theory approach, which allows a focus on building abstract conceptualizations at an organizational level. Second, the Strauss and Corbin approach provides a more restricted and constrained coding process than the “classic” one (Bryant & Charmaz, 2007). This study is conducted on operators in the petrochemical industry; the data collected are therefore targeted and focused, and depend on the conditions and timing of the investigation. A more restricted process can be applied to code such targeted and focused data. Third, under the umbrella of the action knowledge research stream, I seek concepts to structure an action knowledge model and connect this model to human behaviors. Therefore, the outputs of this grounded theory would be structured concepts generated from Strauss and Corbin’s approach, rather than free-format themes, which are usually generated by the “classic” grounded theory approach. Following this argument, the Strauss and Corbin approach is well suited to identifying the tacit action knowledge used by operators in the processing industries, and it is adopted for this research. In the next section, I provide details of the research method and process followed.

7.5 Research Process

7.5.1 Research design

The selection of grounded theory requires a specific methodological approach to data collection, analysis, and theory building (Urquhart, Lehmann, & Myers, 2010). In my case, this was manifested in a number of ways. First, data collection and analysis happened simultaneously and categories were constantly contrasted and compared to each other. Second, all types and kinds of data were selected to provide different views from which to understand emergent categories. Established categories were used to direct future data collection through theoretical sampling. And third, prior knowledge of the field was not used to pre-formulate hypotheses to be verified. Instead, preconceptions were constantly questioned to ensure the opportunity for themes to emerge from the data.

As mentioned, I followed Strauss and Corbin’s approach for generating grounded theory (Strauss & Corbin, 1990), which emphasizes the practice of data sampling, data analysis, and theory development. Accordingly, I organized the research into phases of data collection, coding scheme design, data analysis, and theory building, in order to investigate the existence, use, and exploration of action knowledge and how it is realized as part of daily work practice for the operators. The details of each phase and its instantiation in this study are described next.

To gather and analyze research data, I followed the data collection and analysis process described in Figure 7-5. Data were collected via interviews with operators to empirically investigate my research question. First, an interview protocol was generated, followed by a pilot (with a senior operator) to finalize the protocol. The first set of interviews was conducted with operators and audio-recorded before being transcribed and analyzed. Coding and analysis were carried out to discover the initial core categories related to the strategies operators use to apply action knowledge during task performance, and the interview protocol was modified. A second set of interviews was then conducted with different operators. An intensive round of coding and analysis followed to discover and verify the initial core categories, until theoretical saturation was achieved, that is, until no new categories were found in the collected data. Then, in the third round of analysis, building on the core categories of strategies for applying action knowledge, the factor that affects their adoption was revealed.

[Figure 7-5 flow: Develop Interview Protocol → Iteration 1 (Interview Data 1-6: initial 6 core categories) → Iteration 2 (Interview Data 1-18: verify 6 core categories) → Iteration 3 (Interview Data 1-18: verify impacts of experience) → Tentative Theory: Operator Strategies.]

Figure 7-5. Data collection and analysis process.

The core category is often difficult to define in early sampling iterations of grounded theory analysis, because it is grounded in specific mechanisms, contexts, or environments. The analyses, therefore, often involve parallel streams of coding, where fragmented subsets of the core category are tested for how they fit with the data. Not only the definition but also the researcher’s understanding of the meaning of the core category tends to evolve across multiple iterations of theoretical sampling, as researchers construct and discard hypotheses and theoretical explanations of the situated phenomena that they encounter (Gasson & Waters, 2013). My effort, therefore, involved parallel streams of coding. During the data analysis, I moved from identifying broad issues of behavior (open coding) to identifying a core category that represents the central idea of “using action knowledge” in the study, and then on to axial coding that explains relationships between the core category and other elements of operators’ work practice, such as the goals of practice and the environment in which practice happened. Finally, I conducted selective coding that digs into the details of the core category, including the various formats of action knowledge used to guide operator behaviors.

7.5.2 Data collection

Case Selection: I purposefully selected diverse organizations from a population of organizations in the process industry. Access to these organizations was provided by a Center that was involved in conducting research related to operators in the petrochemical industry. Four organizations were selected: three from the petrochemical industry and one from the chemical process industry (see Table 7-2). The selection was based on pragmatics of access.

Table 7-2. List of organizations.

Organization | Production Size (barrels per day) | Location | Process Industry
Company A | 2.6 million | West Coast, North America | Petrochemical
Company B | 2.6 million | West Coast, North America | Petrochemical
Company C | 0.3 million | Northeast Region, North America | Petrochemical
Company D | 0.9 million | Midwest Region, North America | Chemical

Multiple cases of task performance within each organization were selected. These cases are determined by the selected research subjects, since both operator behaviors and the use of action knowledge depend on those who execute the behaviors: the operators. To access appropriate operator subjects, two rules were followed when selecting the interviewee sample.

First, the sample covers multiple roles of operators who cooperate with each other, such as console operators and field operators. Console and field are two different positions for petrochemical operators (Millner, Cochran, & Bullemer, 1999). Many tasks require coordination between operators in these two positions. Second, the sample includes both experts and novices. This is based on the fundamental assumption of this research: cumulative expertise can support certain work practices that novices cannot perform easily. Comparing data collected from experts and novices allows me to identify differences between the two and to attribute these differences to “expertise”, namely tacit action knowledge, possessed by the expert operators (the Recruitment Script is shown in Appendix C). This theoretical sampling among different organizations from both the petrochemical and chemical industries enhances the generalizability of the findings. Furthermore, the study of multiple cases from diverse operators allows findings to be replicated within processing organizations. All selected operators are listed in Table 7-3. The Interviewee ID is referred to in the data analysis later.

Table 7-3. List of interviewees.

Interviewee ID | Length of Experience (year) | Organization | Operation Role
1 | 1.5 | C | Field
2 | 3 | C | Field
3 | 3.5 | B | Console
4 | 4 | A | Field
5 | 6 | C | Console
6 | 7 | D | Console/Field*
7 | 8 | C | Console/Field
8 | 10 | D | Console/Field
9 | 15 | D | Console/Field
10 | 17 | D | Console/Field
11 | 19 | D | Console/Field
12 | 20 | D | Console/Field
13 | 21 | C | Console/Field
14 | 22 | C | Field
15 | 27 | B | Field
16 | 28 | B | Field
17 | 33 | A | Console
18 | 36 | A | Console

*Console/Field: the operator rotates between the console and field roles.
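As a quick check on the two sampling rules, the rows of Table 7-3 can be tallied by organization and by operation role. The following is a minimal sketch in Python, not part of the original analysis; the tuple layout and the function name `tally` are illustrative assumptions, and the data are transcribed from Table 7-3.

```python
from collections import Counter

# (interviewee ID, years of experience, organization, operation role),
# transcribed from Table 7-3. The tuple layout is an illustrative assumption.
INTERVIEWEES = [
    (1, 1.5, "C", "Field"), (2, 3, "C", "Field"), (3, 3.5, "B", "Console"),
    (4, 4, "A", "Field"), (5, 6, "C", "Console"), (6, 7, "D", "Console/Field"),
    (7, 8, "C", "Console/Field"), (8, 10, "D", "Console/Field"),
    (9, 15, "D", "Console/Field"), (10, 17, "D", "Console/Field"),
    (11, 19, "D", "Console/Field"), (12, 20, "D", "Console/Field"),
    (13, 21, "C", "Console/Field"), (14, 22, "C", "Field"),
    (15, 27, "B", "Field"), (16, 28, "B", "Field"),
    (17, 33, "A", "Console"), (18, 36, "A", "Console"),
]


def tally(rows):
    """Count interviewees per organization and per operation role."""
    by_org = Counter(org for _, _, org, _ in rows)
    by_role = Counter(role for _, _, _, role in rows)
    return by_org, by_role


by_org, by_role = tally(INTERVIEWEES)
years = [y for _, y, _, _ in INTERVIEWEES]

print("Per organization:", dict(by_org))
print("Per role:", dict(by_role))
print("Experience range (years):", min(years), "to", max(years))
```

The tally shows that every organization and each of the three role types is represented, and that experience spans from 1.5 to 36 years. Where to draw the line between novice and expert is not encoded here, since the table itself does not state a threshold.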

Crafting the Interview Protocol: I employed interviews as the data collection method for this study. During the interviews, the critical incident technique and the “talk aloud” technique were used. The Critical Incident Technique is a technique to collect direct observations of human behavior to facilitate their potential usefulness in solving practical problems (Flanagan, 1954). It is widely applied in the analysis of human error and the assessment of operator performance, for the purpose of modeling the knowledge and cognitive activities that workers use to perform complex tasks in the work domain (Bonaceto & Burns, 2007; Hoffman, Crandall, & Shadbolt, 1998). To be critical, an incident must occur in a situation where the purposes or intents of the action are outlined (Flanagan, 1954). In this research, the critical incident technique was used with a critical incident scenario of loss of power defined in collaboration with a domain expert:

“A thunderstorm has come through the area and resulted in a momentary loss of power to the entire complex. All motor driven equipment (pumps, fans, blowers, compressors) have stopped. The power has now returned.”

The selected operation task was perceived as complex by operators and required a combination of standard procedures and their experience.

Operators were asked to select the appropriate procedure to deal with the given critical incident scenario (that is, to recover from the loss-of-power incident), to select and implement behaviors based on procedures, and to “talk aloud” (Ericsson & Simon, 1980) while doing so. I took notes of everything that the operator said and audio-recorded the whole conversation. The talk-aloud protocol makes it possible to see first-hand the process of task completion rather than only the final results of actions (Ericsson & Simon, 1993). Operators were required to give explanations when describing their decisions and implementations of behaviors, so that I could understand which action knowledge was used by operators to guide particular behaviors.

At the beginning of each interview, I obtained informed consent. The interview procedure began with questions about the operator’s background and daily routine. After that, the operator was asked to describe what actions s/he would take in response to the critical incident of “loss of power,” which was described and a printout was made available to the operator. Each operator was asked to reflect on how s/he would deal with the aftermath of the scenario. Clarification questions and probing questions helped elicit further details during this process. The operator was probed to identify perceptual cues and prior knowledge used in the decision making and behaviors that would follow, as well as alternative decisions and actions that s/he would consider.

The interaction was guided by an interview protocol (see summary in Appendix D). Table 7-4 shows the data collection efforts for each visit.

Table 7-4. Data collection.

Interviewee ID | Length (min) | ~Words of Transcript
1 | 69 | 5600
2 | 48 | 4500
3 | 55 | 6100
4 | 46 | 7700
5 | 49 | 4500
6 | 43 | 3500
7 | 31 | 7200
8 | 29 | 3800
9 | 47 | 6900
10 | 26 | 4300
11 | 59 | 3500
12 | 47 | 5200
13 | 50 | 10000
14 | 41 | 5600
15 | 35 | 4800
16 | 27 | 6600
17 | 37 | 6100
18 | 36 | 5600

In total, eighteen interviews were conducted and recorded, with an average length of 43 minutes and an average transcribed length of 5600 words. The conversations include descriptions of the task performance process and answers to interview questions. A fragment of an interview transcript is provided in Appendix E. On average, eight hours were required to transcribe each interview. A total of 185 pages of transcripts with 102 thousand words were produced. These transcripts are the principal source of data for the data analysis.
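The summary figures in this paragraph can be reproduced from Table 7-4. The sketch below is illustrative Python rather than part of the original study; the variable names are assumptions, and the word counts are the rounded values reported in the table.

```python
from statistics import mean

# (interviewee ID, interview length in minutes, approximate transcript words),
# transcribed from Table 7-4. Word counts are the rounded values in the table.
INTERVIEWS = [
    (1, 69, 5600), (2, 48, 4500), (3, 55, 6100), (4, 46, 7700),
    (5, 49, 4500), (6, 43, 3500), (7, 31, 7200), (8, 29, 3800),
    (9, 47, 6900), (10, 26, 4300), (11, 59, 3500), (12, 47, 5200),
    (13, 50, 10000), (14, 41, 5600), (15, 35, 4800), (16, 27, 6600),
    (17, 37, 6100), (18, 36, 5600),
]

minutes = [m for _, m, _ in INTERVIEWS]
words = [w for _, _, w in INTERVIEWS]

summary = {
    "interviews": len(INTERVIEWS),
    "mean_minutes": mean(minutes),
    "total_words": sum(words),
    "mean_words": mean(words),
}

for name, value in summary.items():
    print(f"{name}: {value:.1f}" if isinstance(value, float) else f"{name}: {value}")
```

The rounded per-interview word counts sum to 101,500, which is consistent with the reported total of about 102 thousand words, and the mean length rounds to the reported 43 minutes.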

7.5.3 Data analysis

The data analysis consisted of three steps: open coding, axial coding, and selective coding. I introduce details of each step in the sub-sections below.

7.5.3.1 Open coding

Open coding is conceptualization at the first level of abstraction in the grounded theory method. It is the part of the analysis concerned with identifying, naming, categorizing, and describing phenomena found in the text. In the beginning of a study, everything is coded in order to find out the problem and how it is being resolved. Therefore, during open coding, written data from the transcripts were conceptualized line by line. During this phase I conceptualized all the incidents in the data, which yielded many concepts. These were compared as more data were coded, merged into new concepts, and eventually renamed and modified.

To do this, I read through the data several times and then started to create tentative labels for chunks of data that summarized what I saw happening. I also recorded examples of interviewees’ words and established properties of each code. These labels refer to things like close valves, compressor, stream pumps of compressor, console operator/field operator, water side, etc. They are the nouns and verbs used in petrochemical plants. Part of the analytic process was to identify the more general categories that these things are instances of, such as operation activities, equipment, parts of equipment, participants, locations, consequences of operations, etc. I also sought out the adjectives and adverbs, which are the properties of these categories. For example, for a failure (or potential failure) I might code its duration, severity, and importance to the rest of the plant. An example of open coding on a segment of interview transcript is given below:

“When you lose power [“incident of power losing”], you don't know how long it's going to be down. We [“field operator”]'re assuming it's going to be a long time hours [“simulating extreme condition”]. So first thing [“first order of action”] we do we have a procedure in place what we have to do in the plant [“following SOP”]. We have to secure certain items in the plant [“securing plant”], I mean right now [“timing to implement action”]. And we will make a phone call to our OS (operator supervisor) [“supervisor”] or the shift supervisor [“shift supervisor”], and then determining how long the power is off and on [“communicating with others to make decision”]. I am saying power can be off now, but it will be back within 30 minutes [“estimating severity of incident by experience”], we still make that call [“deciding action based on estimation”].”

I also wrote notes against the labels, called “memos”. A memo could contain a paragraph or more if needed. A closer look at the line-by-line coding example above shows that the interview transcript carries more meaning than is expressed in the codes. In a memo, I recorded information such as the following:

Memo: Things discussed here are operation behaviors in the plant to deal with failure of power losing. In order to solve this failure, operators need to secure certain equipment in the plant. One approach to secure plant are contained in procedure and recalled by the interviewee. Implied in the text is that the interviewee views this failure of power losing as having certain properties, one of which is length: it can last from a few seconds to several hours. After the securing operation, the interviewee communicates with supervisor and makes assumption on the length of power losing. The adjectives reveal interviewee’s belief on priorities of operation behaviors. In this segment of text, the operations of securing plant have higher priority, which must be taken immediately. Later, the behaviors of communication with supervisor and making assumptions could be done later with a lower priority.

Following the same approach, all data gathered in the first visit (6 interview transcripts from two petrochemical organizations) were coded with the relevant categories. A partial list of the labels created in this process is shown in Table 7-5.

Table 7-5. Partial list of labels created during open coding.

1. Event
   1.01 Incident of power loss
   1.02 Power back
   1.03 Out of water
   …
2. Actor
   2.01 Field operator
   2.02 Console operator
   2.03 Supervisor
   …
3. Action
   3.01 Checking SOP* before operation
   3.02 Communicating with others to make decision
   3.03 Estimating severity of incident with information on screen of monitor
   3.04 Simulating extreme condition with experience
   3.05 Following SOP to secure plant
   …
4. Action Knowledge
   4.01 SOP
   4.02 Information on monitor screen
   4.03 Experience of the same/similar power failure incident
   …
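To illustrate how line-by-line labels relate to the broader categories in Table 7-5, the sketch below extracts the bracketed labels from an excerpt of the coded transcript segment quoted earlier and groups them by category. It is a minimal Python illustration, not the procedure used in the study (the coding was done by hand and later in Atlas.ti); the function names and the label-to-category mapping `CATEGORY_OF` are hypothetical.

```python
import re
from collections import defaultdict

# Excerpt of the coded transcript segment quoted earlier in this section.
# Open-coding labels appear in square brackets with curly quotation marks.
SEGMENT = (
    "When you lose power [“incident of power losing”], you don't know how long "
    "it's going to be down. We [“field operator”]'re assuming it's going to be "
    "a long time hours [“simulating extreme condition”]. So first thing "
    "[“first order of action”] we do we have a procedure in place what we have "
    "to do in the plant [“following SOP”]."
)

# Hypothetical assignment of labels to the top-level categories of Table 7-5.
CATEGORY_OF = {
    "incident of power losing": "Event",
    "field operator": "Actor",
    "simulating extreme condition": "Action",
    "following SOP": "Action",
}

LABEL_PATTERN = re.compile(r"\[“([^”]+)”\]")


def extract_labels(text):
    """Return the open-coding labels in the order they appear in the text."""
    return LABEL_PATTERN.findall(text)


def group_by_category(labels, category_of):
    """Group labels by category; unmapped labels are kept as 'Uncategorized'."""
    grouped = defaultdict(list)
    for label in labels:
        grouped[category_of.get(label, "Uncategorized")].append(label)
    return dict(grouped)


labels = extract_labels(SEGMENT)
grouped = group_by_category(labels, CATEGORY_OF)

for category, members in grouped.items():
    print(f"{category}: {members}")
```

Labels that do not yet belong to a category, such as “first order of action” here, surface as uncategorized, which mirrors the way property-like codes (timing, priority) were handled separately from the noun and verb categories.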

However, I realized that addressing the research concern: “what strategies do operators use to employ action knowledge, tacit and explicit, to carry out their work” would require more than these labels and memos. To provide further insight, I went back and forth between the data and the codes, constantly modifying and sharpening the growing theory via the steps of axial and selective coding.

7.5.3.2 Axial coding

Axial coding is defined by Strauss and Corbin (Strauss & Corbin, 1990) as “a set of procedures whereby data are put back together in new ways after open coding, by making connections between categories.” In other words, axial coding is the process of relating codes (categories and properties) to each other, via a combination of inductive and deductive thinking.

Based on the codes generated in the prior process, I concluded that the majority of the open coding labels are concerned with:

1. the purpose, goal, or target states that motivate operator behaviors;
2. the domain, the set of all statements about the context (background), which includes but is not limited to the equipment status;
3. the events that lead to the operation behaviors;
4. the actions of applying action knowledge, interacting with the environment, and reflecting on performed tasks;
5. the action knowledge referred to by operators to guide their behaviors;
6. the actor(s) who possess knowledge and information and who execute actions; and
7. the consequences of operation behaviors.

In the text segment above, I observe that the event leading to the operator strategy is “power loss in the plant,” the goal is “securing plant,” and the operator’s actions are “following procedure to secure plant,” “informing supervisor of the emergency,” and then “estimating the severity of this emergency” (the length of the power loss). The action knowledge used via these strategies includes “standard operation procedure,” “opinion from supervisor,” and “previous experience.” The actor is a “field operator” who also collaborates with his “supervisor.” The interviewee does not mention the background of equipment status or the consequences of his operation behaviors in this example.
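The seven concerns listed above, together with the worked example in the preceding paragraph, can be pictured as a single structured record per text segment. The following Python sketch is only an illustration of that structure; the class name `AxialRecord` and its field names are assumptions introduced here, not constructs from the study.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class AxialRecord:
    """One coded text segment, organized by the seven axial-coding concerns."""
    event: str
    goal: str
    actions: List[str]
    action_knowledge: List[str]
    actors: List[str]
    domain: Optional[str] = None        # context, e.g. equipment status
    consequences: Optional[str] = None  # results of the operation behaviors

    def missing_elements(self) -> List[str]:
        """Names of the elements the interviewee did not mention."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Values taken from the worked example in the preceding paragraph.
record = AxialRecord(
    event="power loss in the plant",
    goal="securing plant",
    actions=[
        "following procedure to secure plant",
        "informing supervisor of the emergency",
        "estimating the severity of this emergency",
    ],
    action_knowledge=[
        "standard operation procedure",
        "opinion from supervisor",
        "previous experience",
    ],
    actors=["field operator", "supervisor"],
)

print(record.missing_elements())  # ['domain', 'consequences']
```

Keeping the unmentioned elements explicit as empty fields makes it easy to see, segment by segment, which parts of the paradigm the interviewee left out.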

Through the axial coding process, the coding labels were connected so that I could get a better understanding of the interview data. However, the existing coding labels and their relationships still did not answer my research question: What strategies do operators use to apply action knowledge, both tacit and explicit, as they engage in the performance of tasks? Strategies for using knowledge for a certain purpose connect action knowledge and human behavior. On the one hand, such strategies refer to the different types of action knowledge, both tacit and explicit. On the other hand, they describe specific behaviors of an individual operator.

Consider the following example: when solving math problems, a child could consult multiplication tables in a textbook (explicit knowledge). Or she could count on her fingers or perform arithmetic in her head (tacit knowledge), instantiating numbers into objects. The specific strategies, based on the use of the textbook (explicit knowledge) or on instantiating numbers into objects (tacit knowledge), may include gross-level strategies such as “draw on techniques from the book” vs. “do mental math” or specific ones such as “move unknowns to one side for solving an algebraic equation” vs. “simplify the fractions.” The specific behaviors may then include reading the textbook or counting on fingers, following her selection of strategy. The analogy is useful for understanding how these concepts may play out in the research domain of the petrochemical industry.

For example, previous experience such as “starting a group of three compressors by opening each of them from zero to maximum sequentially will cause system imbalance” may be tacit action knowledge possessed by operators that complements the instructions codified in the standard operating procedures. The strategy of recalling such experience would result in the operator making minute adjustments back and forth on each compressor until all three compressors are at maximum level. The example shows that the strategy an operator uses to apply action knowledge during task performance is embedded in the operator’s actions and refers to particular action knowledge.

As a result of open coding and axial coding, some coding labels in the list above, namely the labels of action and action knowledge, turned out to be highly related to strategies of using action knowledge. Each action taken by an operator can be cognitive or physical (e.g., the cognitive action of monitoring a screen and the physical action of closing valves), and it may or may not refer to action knowledge (e.g., referring to action knowledge by recalling procedures). The strategies of using action knowledge adopted by operators are embedded in those labeled actions. Therefore, I refined my coding target and conducted the following selective coding process.

7.5.3.3 Selective coding

Selective coding is the process of choosing one category to be the core category, and relating all other categories to that core category. The essential idea is to develop a single storyline around which everything else is draped (Borgatti, 2005). The core category explains the behavior of the participants in resolving their main concern. Selective coding can be done by re-reading the transcripts and selectively coding any data that relate to the core identified at an earlier stage, or by coding newly gathered data. I also selectively sampled the data with the core in mind, which is called theoretical sampling. Selective coding delimits the study, which makes it move fast (Glaser, 1998).

In this research, I derived the core category based on the connections between codes developed in the axial coding above. An observation of the coded data revealed three types of action taken by operators: the physical actions of interacting with the environment, the strategies for using action knowledge to guide operator behaviors, and the reflection actions that generate new knowledge after task performance. The physical actions of interacting with the environment result from the strategies for using action knowledge. In addition, the new knowledge generated by reflection can be either explicit, such as notes or reports written down by operators, or tacitly stored in operators’ heads, e.g., operator experiences. It can also be action knowledge, e.g., someone could walk through and listen for water drops to discover a leak; or declarative knowledge, e.g., the emergency water pump is the red one located under Pump 31. Considering the core focus on “strategy using action knowledge to perform tasks” as well as the definition of action knowledge as knowledge that enables operators to select and perform action(s), I decided to take “strategy to apply action knowledge” as the core category for the selective coding phase. The core category of this study was therefore defined as the strategy to use action knowledge to guide operator behaviors, which is a subset of the actions identified in the open coding process.

I selectively coded all eighteen interview transcripts with the core category. In this round of selective coding, the unit of analysis was defined as a complete sentence or sentence fragment, as appropriate, that describes a physical or mental action, namely the process of selecting or implementing an operator behavior under the guidance of action knowledge. Two researchers (including me) coded the transcripts independently. The analysis was done in an iterative manner. As a first pass, we coded the six translated interviews to generate fragments and sub-core categories of strategy. Initially, the codes simply represented the various operator behaviors within task performance processes. After the first coding, we began to note that some fragments of strategy raised very similar issues. As we developed our coding schema to represent operator adoption of strategies to use action knowledge, a more comprehensive set was needed. Therefore, we aggregated these fragments into sub-core categories, and then aggregated them into core categories of strategy in our initial coding schema.

During this process, if any sentence or sentence fragment could not be classified into the existing sub-core categories of strategy, or any sub-core category could not be aggregated into the existing core categories, new sub-core categories or core categories were generated. The coding results were reviewed by another researcher in this project. All agreements and disagreements on the coded data were shared between the participating researchers. Any disagreements on the granularity of the units identified, as well as on their classification following the core categories of action, were resolved by a process of negotiation and logical argumentation. Rules were established and followed in this coding process, including: (1) identification of verbal protocols as a phrase or a sentence indicating a reasonably complete unit of action, (2) classification of each unit into one sub-core category of action, and (3) classification of each sub-core category into one core category of action.

In summary, a total of 1140 fragments were identified from the eighteen interviews, an average of 63 fragments per interviewee. After coding by multiple researchers, the fragments were classified into 130 sub-core categories of strategy. These were then aggregated into 6 core categories of strategy in the coding scheme. This clustering process from fragments to final core categories is reflected in Figure 7-6. For more details of the core categories and sub-core categories of strategy, please refer to Appendix F.

[Figure 7-6 hierarchy: Fragments 1 to 1140 are grouped into Sub-Core Categories 1 to 130, which are grouped into Core Categories 1 to 6.]

Figure 7-6. Development of core categories of strategy.
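The two-level clustering in Figure 7-6, together with coding rules (2) and (3), amounts to a pair of many-to-one mappings: each fragment belongs to exactly one sub-core category, and each sub-core category to exactly one core category. The Python sketch below illustrates this roll-up on a handful of fragments from the transcript segment quoted earlier. The sub-core category names and their assignment to core categories are hypothetical stand-ins (the actual categories are in Appendix F); only the core category names follow the strategies named in Table 7-6.

```python
from collections import Counter

# Hypothetical sub-core categories mapped to core categories of strategy.
# A dict enforces rule (3): each sub-core category has exactly one core category.
CORE_OF_SUBCORE = {
    "following SOP to secure plant": "procedure driven strategy",
    "estimating severity from past incidents": "experience driven strategy",
    "calling supervisor for a decision": "operator status driven strategy",
}

# (fragment text, sub-core category): one category per fragment, as in rule (2).
# The fragment texts come from the quoted transcript; the codes are illustrative.
CODED_FRAGMENTS = [
    ("we have a procedure in place what we have to do in the plant",
     "following SOP to secure plant"),
    ("We have to secure certain items in the plant",
     "following SOP to secure plant"),
    ("it will be back within 30 minutes",
     "estimating severity from past incidents"),
    ("we will make a phone call to our OS",
     "calling supervisor for a decision"),
]


def aggregate(coded_fragments, core_of_subcore):
    """Roll fragments up to core categories and count fragments per category."""
    counts = Counter()
    for text, subcore in coded_fragments:
        if subcore not in core_of_subcore:
            # An unclassifiable sub-core category signals that the coding
            # scheme needs a new core category or a revised mapping.
            raise ValueError(f"no core category for sub-core category: {subcore!r}")
        counts[core_of_subcore[subcore]] += 1
    return counts


core_counts = aggregate(CODED_FRAGMENTS, CORE_OF_SUBCORE)
for core, n in core_counts.most_common():
    print(f"{core}: {n} fragment(s)")
```

Raising an error for an unmapped sub-core category mirrors the rule that new categories are created whenever a unit does not fit the existing scheme, rather than being forced into one.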

I realized that the various strategies immediately suggested contingency factors. I decided to systematically examine the interaction of the different core categories by looking at the correlation between them and other code categories. Initially, I analyzed the data using the Atlas.ti qualitative data analysis software package (see Figure 7-7), which allows code labels to be associated with data samples and uses hypertext links to relate core categories and other open codes. The software also provided a view of all labeled sub-core categories of strategy, which facilitated their aggregation and summarization.

Figure 7-7. Analyzing data with Atlas.ti qualitative data analysis software package.


Then I switched to using a spreadsheet to summarize this data analysis. A spreadsheet makes it easy to view and analyze multiple codes simultaneously (e.g., the action strategies of one operator, the corresponding operation instructions defined in procedures, time-series order, location/equipment parts of action, and cause and consequence). Furthermore, I could connect operators’ actions with their background. This facilitates analyzing the interaction between the adopted strategies of using action knowledge and factors of operators’ background, such as operator experience in the field.

7.5.5 Iterations to generate theory

I generated a tentative theory of action knowledge by iteratively analyzing the interview data. I started by analyzing the operators’ behavior and their use of action knowledge in a single organization and then analyzed more organizations from the same industry and from different industries. Theoretical saturation was achieved by revisiting sample analyses from previous iterations in order to identify emergent relationships until the theoretical constructs were consistent across the data sampled. Table 7-6 relates successive iterations of the study to the logic according to which I selected data for analysis, the evolution of the core categories (the categories on which the theory is centered), and the emergence of the theory.

Table 7-6. Three iterations and theory emergence across study.

Iteration 1: Exploring coding schema of core categories and sub-core categories of action
Data analysis: A qualitative analysis of operation task performance in two petrochemical organizations.
Findings: Six sources of action knowledge are identified.
Emerging theory in Iteration 1: Operators use action knowledge from different sources to support their behaviors to perform tasks. These sources of action knowledge include (a) standard operating procedures, (b) domain knowledge in documents other than procedures, (c) pre-defined goals, (d) previous experience in similar tasks, (e) status of environment resources, and (f) status of operator self and other co-workers. The operators’ own use and reflection represents an important precursor to the manifestation of actual operator behaviors.

Iteration 2: Verifying core categories of action generated in the first iteration
Data analysis: Additional data from new organizations are used. A qualitative analysis allows evaluating the findings generated in Iteration 1. Then the discovered core categories are compared via a quantitative analysis.
Findings: Six sources of action knowledge are confirmed. Six strategies for using sources of action knowledge are discovered.
Emerging theory in Iteration 2: Operators’ use of action knowledge from six sources, and their use of and reflection on these sources as a precursor to behaviors, are confirmed (identified in the previous iteration). Operators employ specific strategies for using action knowledge from these sources. These include: (a) procedure driven strategy, (b) domain knowledge driven strategy, (c) goal driven strategy, (d) experience driven strategy, (e) environment resource driven strategy, and (f) operator status driven strategy. These are elaborated through the data analysis.

Iteration 3: Exploring the impact of operator experience on operator’s adoption of action
Data analysis: The same data set of Iteration 2 is used. A quantitative analysis allows connecting the discovered core categories of action with operator experience.
Findings: Six sources of action knowledge are confirmed. Six categories of operator strategies for using action knowledge (in three broad categories) are confirmed. These strategies are mapped against operator experience.
Emerging theory in Iteration 3: Operators use action knowledge from six sources. The operators’ use of six strategies to apply action knowledge is confirmed (identified in the previous iteration). The operators’ use of these strategies is influenced by operator experience. As the operator experience increases, they appear to reflect and learn more from their own behaviors compared to codified instructions.

I describe the details of each theory iteration, including how I analyzed the data and generated the theory.

7.5.5.1 Theory iteration 1

The core category is defined as a strategy adopted by operators to use action knowledge in order to provide instruction regarding the action required of operators. This category constitutes a sub-set of actions coded via an open-coding process. I focused on this single core category in the following theory-building iterations. Action strategies adopted by operators could be investigated through many perspectives. However, given that my research centers on a “strategy to use action knowledge,” I reread and coded the first set of data (6 interview transcripts from two petrochemical organizations) with the perspective that action knowledge drives the behavior selected and performed by operators. The outputs of this coding process comprise fragments of quotes from interview transcripts, including phrases and sentences, which describe the processes via which operators select and implement behaviors in accord with certain action knowledge. Specifically, each fragment includes both the action taken by operators and the action knowledge they used in taking that action. After generating selective codes for the task-performing processes from the six interview transcripts, I compared and contrasted the fragments and began to note that they raised very similar issues from the perspective of action knowledge. I continued to group these fragments into sub-core categories and core categories of strategy by aggregating similar sources of action knowledge.

In summary, I identified a total of 414 fragments in the interview data (interview transcripts 1–6) in this iteration, with an average of 69 fragments for each interview transcript. After coding performed by two researchers, 128 sub-core categories of action were created from these fragments, which were then aggregated into 6 sources of action knowledge to support operator behaviors: (a) standard operating procedures, (b) domain knowledge in documents other than procedures, (c) predefined goals, (d) previous experience in similar tasks, (e) the status of environment resources, and (f) the status of the operator and other co-workers. Connecting these sources with the definitions of explicit and tacit action knowledge, I concluded that the knowledge embedded in the first two sources (procedures and domain knowledge) constitutes explicit action knowledge, whereas the remaining sources (goals, experiences, environment resources, and operator and co-worker status) constitute tacit action knowledge.

By the end of theory iteration 1, the definitions of the core categories had evolved: Instead of being defined as cognitive or physical action taken by operators in order to perform tasks, the core category was conceptualized as strategies adopted by operators to use the six sources of action knowledge. In the next theory iteration, strategies to use action knowledge were generated and then the comprehensiveness of the discovered core categories was verified.

7.5.5.2 Theory iteration 2

According to the sources of action knowledge identified in the prior iteration, the core category and sub-core categories of strategy using action knowledge were generated (Appendix F). To verify these, I examined data from twelve interview transcripts in the second iteration. Following the same coding criteria, I identified a total of 726 fragments from the rest of the interview data, with an average of 60 fragments for each operator. After coding by two researchers, only 2 new sub-core categories were generated from the extra interview data (sub-categories 4.07 and 6.20 in Appendix F). Both new sub-core categories were assigned to the six existing core categories. Therefore, no new core category of strategy was generated in this iteration. The six core categories of strategy are (a) procedure-driven strategy, (b) domain-knowledge-driven strategy, (c) goal-driven strategy, (d) experience-driven strategy, (e) environment-resource-driven strategy, and (f) operator-status-driven strategy (see Table 7-7). The composition of each core category is shown in Figure 7-8.

Figure 7-8. Composition of core categories of strategy for the use of action knowledge.

In this theory iteration, I evaluated core categories of strategy in both the petrochemical industry and the chemical industry. Corresponding to the six categories of action knowledge resources, there are six categories of strategy that use them.

The findings concerning the six core categories of strategy reveal that, although there are well-defined procedures to follow, operators still refer to other formats of action knowledge to support their behaviors. There are many examples. Operators do not always behave according to the instructions described in procedures; in certain situations they may implement instructions in a different order from that specified. Operators may also engage in behaviors when performing a task that are not specified in the procedures. All these examples suggest that operators are able to select and implement behaviors based on goals, environment status, operator status, and their own experience.

Furthermore, I also found that the usage of certain categories of strategy may differ among operators. For example, the operator-status-driven strategy, which draws on human-resources-related action knowledge, is the strategy most frequently used by operators (the percentage of operator-status-driven strategy is specified as 24.2% in Figure 7-3). Although it is the most popular strategy adopted by operators, its use varies widely. The percentage of operator-status-driven strategy among all the actions performed by an interviewee ranges from 8.8% (Interviewee 15) to 48.6% (Interviewee 17). The adoption of a strategy whereby certain action knowledge is used may be affected by a number of factors. Establishing the factors implicated in the adoption of various strategies was the concern of the third iteration of theory generation.
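The per-interviewee percentages discussed above can be derived from the coded fragments by counting, for each interviewee, how many fragments fall under each strategy and dividing by that interviewee's total. The following is a minimal sketch of that tabulation, not the procedure actually used in the study (which relied on Atlas.ti and a spreadsheet). The fragment list, the strategy labels, and the function name `strategy_shares` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical coded fragments as (interviewee_id, strategy_code) pairs.
# Illustrative values only; these are NOT the study's data.
fragments = [
    (1, "procedure"), (1, "procedure"), (1, "operator_status"), (1, "goal"),
    (2, "operator_status"), (2, "operator_status"), (2, "domain_knowledge"),
    (2, "experience"),
]


def strategy_shares(coded_fragments):
    """Return {interviewee: {strategy: share of that interviewee's fragments}}."""
    per_interviewee = {}
    for interviewee, strategy in coded_fragments:
        per_interviewee.setdefault(interviewee, Counter())[strategy] += 1
    return {
        interviewee: {s: n / sum(counts.values()) for s, n in counts.items()}
        for interviewee, counts in per_interviewee.items()
    }


shares = strategy_shares(fragments)
for interviewee, by_strategy in sorted(shares.items()):
    # Share of operator-status-driven fragments for each interviewee.
    print(interviewee, f"{by_strategy.get('operator_status', 0.0):.1%}")
```

Normalizing within each interviewee, rather than over the pooled fragments, is what makes operators with different numbers of coded fragments comparable.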

Table 7-7. Core categories of strategy coded from interview data.

| No | Action Strategy | Frequency | Explanation |
|----|-----------------|-----------|-------------|
| 1 | Procedure driven strategy | 156 | Operators adopt strategy of using existing procedures to select and implement operator behaviors. |
| 2 | Domain knowledge driven strategy | 221 | Operators adopt strategy of using understandings of operation principles, theories, and dependency relationships, etc. to select and implement operator behaviors. |
| 3 | Environment resource driven strategy | 199 | Operators adopt strategy of using available related resources, e.g. time, humidity, temperature, distance/status of equipment, etc. to select and implement operator behaviors. |
| 4 | Goal driven strategy | 87 | Operators adopt strategy of using predefined target equipment status and expected outputs, or avoid certain consequences to select and implement operator behaviors. |
| 5 | Experience driven strategy | 208 | Operators adopt strategy of using experience of performing similar tasks to select and implement operator behaviors. |
| 6 | Operator status driven strategy | 268 | Operators adopt strategy of using knowledge about themselves or co-workers’ position, behavior, etc. to select and implement operator behaviors. |

Table 7-8. Example of core categories of strategy for use of action knowledge.

| No | Action Strategy | Example |
|----|-----------------|---------|
| 1 | Procedure driven strategy | [03-355] (see footnote 13) So [I] go out there, I'll read what the SOP said and locate my valves. |
| 2 | Domain knowledge driven strategy | [12-323] I blend it for certain time that depending on the volumes and the types of gas that they needed on the pipeline, and other bunch of variables that determine it. (These variables are contained in textbook) |
| 3 | Environment resource driven strategy | [25-48] When the power comes back, you are in the control of reactor and dryer. |
| 4 | Goal driven strategy | [01-450] We have outside temperature gauges on the tanks [to check whether there is a leak]. [We expect that] there is no steam coming in, cause the overload could cook your plan. |
| 5 | Experience driven strategy | [04-199] We have high high alarms and high alarms. Quite frequently we will get high alarms on equipment as far as it flows. |
| 6 | Operator status driven strategy | [03-196] You have to have field people out here [and] the instrumentation electrical people out here… to make sure that everything is going to come on the way they supposed to. |

7.5.5.3 Theory iteration 3

In Theory Iteration 2, I noticed that operators may have preferences in regard to adopting a certain category of action in order to perform tasks. Such preferences were analyzed in this theory iteration. New factors that impact operators’ adoption of action were discovered.

An operator’s preference could be affected by various factors, such as how many years of experience the operator has (Phillips, Klein, & Sieck, 2004). The influence of experience on human behaviors has been discussed in many studies (Kaempf, Klein, Thordsen, & Wolf, 1996; Kobus, Proctor, Bank, & Holste, 2000; Larkin, 1983; Phillips et al., 2004; Simon, 1975). The analyses offered in the present research study contribute to this field by investigating a new domain, i.e., that of the processing industries.

13 The first number indicates the interviewee, and the second number indicates the lines of interview transcript.

In this data analysis iteration, I found that operators with more experience behave differently from those with less experience in terms of certain categories of strategy (i.e., procedure-driven strategy and domain-knowledge-driven strategy). In the remaining core categories, however, there are no significant differences between very experienced and less-experienced operators. Both of these core categories of strategy are shown in Figure 7-9. In this figure, the horizontal axis refers to interviewee ID from 1 to 18, ordered by increasing operator experience (see Table 7-3 for information about the interviewees). For the sake of simplicity, the remaining core categories are not shown because they do not evince any significant differences. Figure 7-9 shows that as experience increases, operators adopt fewer strategies in line with predefined procedures to guide their behaviors, but adopt more domain-knowledge strategies to enable their behavior. It should be noted here that domain knowledge includes such aspects as operation principles and chemical formulas.

Figure 7-9. Individual frequency of procedure-driven strategy and domain-knowledge-driven strategy. [Line chart. Horizontal axis: Interviewee ID (1 to 18). Vertical axis: percentage (0.0% to 45.0%). Series: procedure driven strategy; domain knowledge driven strategy.]

It is difficult to draw any conclusion based solely on the two patterns derived from visual inspection of the line chart in Figure 7-9. I therefore decided to complement the qualitative analysis with a quantitative analysis: I used a T-test to investigate whether these two patterns in the operators’ adoption of strategies for using action knowledge hold statistically.

I understand that using additive models to evaluate data in the grounded theory method is always a problematic matter (Borgatti, 2005). This is because the grounded theory method takes a case rather than a variable perspective. Researchers take different cases to be wholes in which the variables interact as a unit in complex ways to produce certain outcomes. Researchers have few or no controls on the impact from non-focal factors. However, in this theory iteration, I used simple classifications and summary statistics to confirm, explore, or refute inferences observed from Figure 7-9 (the impact of experience on the strategies used by operators). I did not try to prove a position through a statistical approach. Nor did I desert the principles of interpretive grounded theory. This use has precedent: in a recent study by Gasson and Waters, a quantitative analysis was employed within a grounded theory method, in which a correlation analysis was used to verify the interaction between students’ online collaboration roles and their degree of engagement (Gasson & Waters, 2013).

To conduct the T-test, I separated all the interviewees (18 in total) into two groups based on their length of experience. Prior research considers operators with field experience of more than seven years to be experts in the petrochemical industry, whereas all others are considered to be novices (Klein & Hoffman, 1993). Therefore, two operator groups were created in this study: all the interviewees with less than seven years’ experience were assigned to the novice group (Interviewees 1 to 5 on the horizontal axis of Figure 7-9), whereas the expert group comprised the rest of the interviewees, who had at least seven years’ experience (Interviewees 6 to 18 on the horizontal axis of Figure 7-9). Next, I compared the novice group with the expert group on the basis of the average percentage of each core category. The results of the comparison are shown in Figure 7-10.

Figure 7-10. Use of different strategies by novice vs. expert operators. [Chart. Vertical axis: percentage (0.0% to 30.0%). Series: average (novice); average (expert).]

I then used a T-test to identify significant differences between the means of the two interviewee groups (novice vs. expert). Significant differences between the novice and expert groups indicate an impact of operator experience on the adoption of strategies. The impact of operator experience on the adoption of two core categories is shown to be significant at the 0.10 level. Specifically, the expert and novice operators behave significantly differently in regard to adopting two strategies related to the use of action knowledge: procedure-driven strategy and domain-knowledge-driven strategy. These two categories are exactly the same as the two patterns in Figure 7-9. Therefore, the T-test results confirm my findings. A complete inventory of explanations for each core category of strategy is included with a comparison of the results in Table 7-9.

Table 7-9. T-test results and explanations.

| Category | Statistics | P-value | Explanation |
|----------|------------|---------|-------------|
| Procedure driven strategy | Mean (exp) = 0.1050; Mean (nov) = 0.2252 | 0.080 | Expert operators adopt fewer strategy to apply action knowledge in procedures to select and implement behaviors than novices. |
| Domain knowledge driven strategy | Mean (exp) = 0.2046; Mean (nov) = 0.1846 | 0.080 | Expert operators adopt more strategy to apply action knowledge of petrochemical industry domain to select and implement behaviors than novices. |
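For readers who want to see the mechanics of such a comparison, the sketch below computes a two-sample t statistic for a novice group of 5 and an expert group of 13. It assumes Student's pooled-variance form; the text does not state which variant of the T-test was used, nor whether it was one-tailed or two-tailed, so this is an illustration rather than a reproduction of the analysis. The per-interviewee percentages are hypothetical, and the function name `pooled_t_statistic` is introduced only for this example. Converting the statistic to a p-value requires the t distribution with the returned degrees of freedom (from a statistics table or a statistics library), which is not shown here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Student's two-sample t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal population variances.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical per-interviewee shares of procedure-driven strategy.
# Illustrative values only; these are NOT the study's data.
novice = [0.30, 0.20, 0.25, 0.15, 0.20]
expert = [0.10, 0.15, 0.05, 0.12, 0.08, 0.10, 0.14,
          0.06, 0.10, 0.11, 0.09, 0.13, 0.07]

t, df = pooled_t_statistic(novice, expert)
print(f"t = {t:.2f}, df = {df}")
```

With groups this small and unequal in size, the result is sensitive to within-group spread, which is why a visible gap between group averages does not by itself guarantee significance.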

Although Figure 7-10 shows a visible difference between the novice and expert operators in regard to adopting operator-status-driven strategy, this core category is not significant in the T-test analysis. This is because the adoption of such strategies varies even within each group. For example, Interviewee IDs 15 and 17 are the lowest and highest points, respectively, in Figure 7-9, yet both belong to the expert group. The variance in terms of the adoption of operator-status-driven strategy by both novice and expert operators is shown in the boxplot in Figure 7-11. For the purpose of comparison, I also included the boxplot of the procedure-driven strategy on the left side of Figure 7-11. The boxes and whiskers of the novice and expert operators in the left boxplot (procedure-driven strategy) show very little overlap. However, in the right boxplot (operator-status-driven strategy), the whiskers of the expert operators cover all the novices. With such a large span of variance, a T-test could not suggest that novice operators adopt operator-status-driven strategy at a rate significantly different from that of the experts.

Figure 7-11. Boxplot of procedure-driven strategy and operator-status-driven strategy between novice and expert operators. [Two boxplot panels, each comparing Novice and Expert. Left panel: procedure driven strategy, vertical axis 0.0% to 35.0%. Right panel: operator status driven strategy, vertical axis 0.0% to 60.0%.]
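The overlap argument made with the boxplots can be expressed as a simple range check: if every novice value lies inside the span of the expert values, the two groups cannot be cleanly separated on that strategy. The sketch below assumes the whiskers extend to the minimum and maximum of each group, which may not match the plotting convention used for Figure 7-11. Only the expert extremes 8.8% and 48.6% come from the text (Interviewees 15 and 17); all other values, and the function name `covers`, are hypothetical.

```python
def covers(outer, inner):
    """True if every value in `inner` lies within the min-max range of `outer`."""
    low, high = min(outer), max(outer)
    return all(low <= value <= high for value in inner)


# Shares of operator-status-driven strategy per interviewee.
# 0.088 and 0.486 are the expert extremes reported in the text;
# every other value is hypothetical and for illustration only.
novice = [0.20, 0.25, 0.30, 0.22, 0.28]
expert = [0.088, 0.486, 0.15, 0.30, 0.24, 0.35, 0.20, 0.27]

print("expert range covers all novices:", covers(expert, novice))
print("novice range covers all experts:", covers(novice, expert))
```

When the first check holds and the second does not, the wider group's spread dominates the comparison, which is the situation described for the operator-status-driven strategy.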

The first core category impacted by operator experience is procedure-driven strategy. This strategy is designed for the implementation of operator behaviors as predefined in order to complete a specific task (Graesser, Woll, Kowalski, & Smith, 1980). In the petrochemical industry, there are plenty of procedures that describe how specific tasks are accomplished. In this data analysis, I coded a process as a strategy that uses action knowledge in a procedure if the operator behavior was selected and implemented under the instructions of certain procedures, or if the operator behavior was accessed through the generic script of procedures in the operators’ minds. Figure 7-9 suggests that operator experience has an impact on the implementation of procedure-driven strategy. Compared with expert operators, novice operators are more likely to adhere to the procedures. Experts, on the other hand, adopt fewer procedure-driven strategies in the given scenario. A T-test analysis of this core category suggests the same result, with a p-value of 0.080. The finding that procedure-driven strategy is one of the strategies used to apply action knowledge when performing tasks confirms previous research by Klein and Klinger, which claims that if time pressure is extreme, people may carry out predefined actions as a way of reacting without considering alternatives or assessing probabilities (Klein & Klinger, 1991). In this research study, I presented a critical incident scenario, within which limited time and few resources are available to support operator behaviors. When the time pressure was extreme, operators might decide to carry out the first choice of actions without considering alternatives or assessing probabilities. The first choice of novice operators could be to refer to procedures to seek available actions. However, compared with novice operators, expert operators possess more information and experience beyond the action knowledge in procedures, and have more options of strategies to enable their behaviors by using various formats of action knowledge (such as information from other operators, or experience of prior similar tasks) (Lam & Kirby, 2002; Tjortjis & Layzell, 2001).

The second core category impacted by operator experience is domain-knowledge-driven strategy. In the petrochemical industry, the contents of procedures mainly focus on instructions describing how to implement actions in order to perform a task. In addition to consulting procedures, it may be necessary for operators to refer to other documents when implementing operator behaviors. Such documents include plant blueprints, training manuals, and memos written by operators. I defined action knowledge embedded in these documents as domain knowledge. This action knowledge could be expressed as, but is not limited to, chemical formulas, recipes of materials for reactions, or equipment information. In general, domain knowledge is gained by operators during the training process. In the interview transcripts, operators mentioned that they access the generic scripts of these documents in their mental models when using domain knowledge to guide their behaviors. In some cases, if the operators are not sure about their memory and time allows, they also check the physical documents in order to gain or affirm domain knowledge. This core category of action strategy, as suggested by the interview data, is also impacted by operator experience. Novice operators use less domain knowledge in performing tasks than expert operators do. A T-test analysis of this core category confirms the conclusion with a p-value of 0.080. To the best of my knowledge, there is no research from the action knowledge perspective that distinguishes the knowledge from procedures (used when performing tasks) from that in other documents (learned during training). Consequently, I am not aware of any research that shows the impact of experience on the adoption by operators of domain-knowledge-driven strategy. There may be several reasons for this impact. First, expert operators have a more complex mental model and better understanding of petrochemical productions and operations than novices do. In particular, expert operators have more domain action knowledge than do novices to drive such strategies when selecting and implementing operator behaviors. Second, expert operators may be more likely to consider underlying theories and rationales (domain knowledge) when selecting and implementing operator behaviors, whereas novices may just follow the instructions. A third reason may be that, compared with novices, expert operators are more confident in regard to implementing operator behaviors that deviate from procedures, and operator behaviors that deviate from procedures are the products of reasoning about domain knowledge. To establish the rationales underlying this phenomenon, however, more research is necessary.

An important contribution of this research is the separation of the action knowledge in procedures from that in other documents (such as operation logs, memos, training materials, and textbooks), although knowledge in both procedures and other documents is in explicit format. I do not see the same distinction made in previous research. It is not difficult to realize, as shown in Figure 7-9, that if all operator behaviors guided by these two formats of explicit action knowledge were combined into a single category, no pattern could be revealed in this analysis. With this distinction, however, the impact of experience on both core categories of strategy is significant.

Table 7-10 summarizes the findings from Theory Iteration 3, which focuses on the impact of operator experience on the adoption of the two core categories of action-knowledge-driven strategy. In this table, I list the ways in which operator experience has an impact on the two core categories and how these relate to findings in the literature.

Table 7-10. Impact of operator experience on adoption of strategy.

| Pattern from data | Related to Prior Work |
|-------------------|-----------------------|
| Expert operators adopt fewer strategies to apply action knowledge in procedures to select and implement operator behaviors than novice operators. | Consistent with (Klein & Klinger, 1991) |
| Expert operators adopt more strategies to apply domain action knowledge to select and implement operator behaviors than novice operators. | Not in prior work |

At the end of this theory iteration, I concluded that individual operators have preferences in regard to adopting strategies for using the action knowledge available from certain resources. These preferences are influenced by their experience. This realization enabled me to formulate iteration 3 of the theory, which answers the initial research question.

7.6 Discussion of Findings

In this section, I summarize the findings from the three iterations, including a consideration of how they answer my initial research question, their implications for future research and practices, and my reflections about my research thus far.

7.6.1 Summary of Findings

The major finding of the work reported in this essay is the discovery of the strategies that operators adopt during work practice, i.e., as they use different sources of action knowledge. These strategies reveal how operators use various forms of action knowledge to guide their decisions and as a basis for operator behaviors. The theory describes and attempts to explain how operators use (as well as generate) action knowledge when interacting with the environment and with other operators to carry out their tasks. A succinct statement of the tentative theory generated from this research is as follows:

Operators engage in dynamic behaviors as part of their work practice to carry out their operational tasks. They use six forms of action knowledge: procedures, domain knowledge in documents other than procedures, predefined goals, previous experience relating to similar tasks, the status of environment resources, and their own status as an operator and the status of co-workers. They use six strategies to apply action knowledge from these sources: procedure-driven strategy, domain-knowledge-driven strategy, experience-driven strategy, goal-driven strategy, environment-resource-driven strategy, and operator-status-driven strategy. Operators demonstrate distinct preferences in regard to certain strategies, which are influenced by experience. With increasing experience, they appear to be less likely to adopt a procedure-driven strategy in favor of selecting a more domain-knowledge-driven strategy.

The final theory generated in this research is also presented in Figure 7-12 (a comparison with Figure 7-2 shows the outcomes as direct responses to the research concerns identified).

Figure 7-12. Summary of findings to address research gaps. [Diagram elements: Operator; Tacit Action Knowledge (goal, environment, experience, operator); Explicit Action Knowledge (codified procedures, other documents); relations labeled "Possesses" and "Used"; Strategies to apply action knowledge (procedure driven, domain knowledge driven, environment resource driven, goal driven, experience driven, and operator status driven strategy); Operator Behaviors to Carry Out day to day Tasks (e.g., cognitive: monitor pump pressure; physical: close water valve); Outcomes (correct performance of tasks, e.g. the plant is secure; human errors, e.g. pressure is too high caused by a pump closed); Reflection Behaviors to generate new knowledge.]

It is possible to represent the resulting theory by mapping it onto ideas related to action knowledge and further elaborating the foundation provided by Bera and Wand (2009). Table 7-11 shows the emergent theory in this format, including several constructs and various formats of action knowledge.

Table 7-11. Constructs and action knowledge for the theory of action strategy.

| Construct | Explanation |
|-----------|-------------|
| Trigger | The events that lead to operator behaviors. |
| Goal | The purpose, goal, or target states that motivate operator behaviors. |
| Environment | The domain, the set of all statements about the context (background), which includes but is not limited to the equipment status. |
| Actor | The operator(s) who possess knowledge and who select and implement operator behaviors. |
| Strategy | A strategy describes the process by which actors use action knowledge to guide the decision on and implementation of operator behaviors. |
| Procedure driven strategy | Operators adopt strategy of using existing procedures to select and implement operator behaviors. |
| Domain knowledge driven strategy | Operators adopt strategy of using understandings of operation principles, theories, and dependency relationships, etc. to select and implement operator behaviors. |
| Environment resource driven strategy | Operators adopt strategy of using available related resources, e.g. time, humidity, temperature, distance/status of equipment, etc. to select and implement operator behaviors. |
| Goal driven strategy | Operators adopt strategy of using predefined target equipment status and expected outputs, or avoid certain consequences to select and implement operator behaviors. |
| Experience driven strategy | Operators adopt strategy of using experience of performing similar tasks to select and implement operator behaviors. |
| Operator status driven strategy | Operators adopt strategy of using knowledge about themselves or co-workers’ position, behavior, etc. to select and implement operator behaviors. |
| Operator behavior | Operators select and implement operator behaviors as the result of adopting strategy of using action knowledge. |
| Outcome | The results of operation behaviors. |
| Reflection behavior | Operators restructure existing experiences and knowledge from implemented operator behaviors by thinking retrospectively. |

In the literature review section, I argued that prior research focuses either on operator behaviors in isolation, leading to a failure to understand the underlying knowledge that supports the implemented actions, or on codified knowledge, such that little attention is paid to how such knowledge is used in practice. There is a dearth of studies exploring the theory of using action knowledge as strategies in which individuals actually use action knowledge as a foundation to guide their selection and implementation of behavior.

In particular, few efforts have focused on how action knowledge supports operator behaviors. The tentative theory presented in this study, therefore, provides a novel perspective for studying how individuals adopt strategies for using action knowledge to guide their behaviors. I argue that findings from this new perspective can be coordinated with other approaches to investigate human behaviors, such as decision-making models (Klein & Calderwood, 1991; Noble, Boehm-Davis, & Grosz, 1986; Phillips et al., 2004). For example, Klein proposes the recognition-primed decision (RPD) model, by which decision makers can use mental simulation to rapidly select a reasonable option (Klein & Calderwood, 1991). It is possible to generate greater understanding of decision-making processes following the constructs I have suggested.

More specifically, an important outcome of my research is an account of how operator experience can influence the adoption of different strategies. This linkage can be further explored in empirical studies to shed light on how to guide both effective operator behaviors in context and the generation of knowledge. Frameworks such as those offered by Nonaka (1994) will become relevant in this context, given the interplay between the individual and the organizational levels.

7.6.3 Reflections on the grounded theory method

As argued in the justification, I selected a grounded theory approach for this study because little prior work has examined the specific aspect of operators’ strategies for using action knowledge. I followed Strauss and Corbin’s (1990) method to analyze and derive a grounded theory to understand strategies adopted by operators by combining tacit and explicit action knowledge. Some of my processes diverged from the typical grounded theory approach: for example, my use of visual techniques to suggest influence factors and my use of quantitative data to fill lacunae in the qualitatively generated theory.

The initial data analysis was qualitative: I used qualitative, interpretive methods to analyze the data. This suggested the initial coding schema (the core categories and sub-core categories of action) for later verification and exploration. I then supplemented the initial data with qualitative data collected from another organization in the same industry (a second visit, to a different organization in the petrochemical industry) and from an organization in a new industry (a third visit, to an organization in the chemical industry). A deeper understanding suggested factors that influence operator adoption of certain core categories of strategy. To identify and verify these factors, I sought another analysis method to complement the grounded theory techniques. This complementary analysis method included visualizations that suggest operator experience as the influencing factor and quantitative analysis of core categories of action strategy interacting with operator experience. Each of these data analyses was conducted using a new theory iteration.

This series of iterations resulted in a theory grounded in a specific task performance context in the processing industries with a specific type of subject (operators). By devising a theoretical sampling strategy on the basis of emergent constructs, I achieved theoretical saturation by revisiting the analysis of samples in regard to emergent relationships and by undertaking selective data collection until the theoretical constructs were consistent across the data sampled (Gasson, 2004). I employed a theoretical sampling strategy whereby inconsistencies in the emergent theory are sought in order to suggest new insights, as discussed above. These processes involved different subsets of operators with similar backgrounds (at least one year of field experience in the petrochemical industry) operating in a similar task performance environment (all were given the same critical incident scenario). The outcome is stated as a grounded theory, intended to be transferable to similar populations and operation contexts. Transferability can be enhanced by exploring the findings across other populations (via supplemental analysis across additional sites, e.g., the third visit to and data collection from the company in the chemical industry) and by relating this grounded theory to formal theories of human behaviors and action knowledge in other research streams and domains.

7.7 Conclusions

In this research, I investigated how operators use action knowledge to perform operational tasks in processing industries. My goal was to determine and understand the patterns whereby they adopted strategies to use action knowledge. The study followed a grounded theory method (Eisenhardt, 1989; Strauss & Corbin, 1990; Yin, 2009) and drew on studies related to action knowledge research (Bera & Wand, 2009; Caldwell & Garrett, 2007) and operator behaviors (Rasmussen, 1986; Tasca, 1989). Core categories of strategies were developed based on interview data. Further, the influence of operator experience on the adoption of strategies was investigated.

I claim that several potential benefits follow from the findings reported herein. First, the six sources of action knowledge consulted by operators are identified. Second, the theory offers a basis for understanding which strategies are adopted by operators to perform operational tasks and which action knowledge they use via these strategies. Third, the influence of operator experience on certain categories of action-knowledge-driven strategy found in this study connects operator behavior and operator background in the domain. It was evident that the operators drew on action knowledge in a dynamic way. It is likely that the patterns according to which operators used action knowledge are influenced by several factors. The extent of their experience is just one influencing factor found in this study. Given the influence of this factor, industry players could improve training and mentoring programs based on the strategy-adoption patterns and preferences of novice and expert operators as appropriate to meet related goals.

I acknowledge that this study has some limitations. First, although a broad sampling of task-performing cases for different goals in different environments including those provided by various organizations is desirable at this early stage of theory building, the number of cases is relatively limited. Second, although all the cases studied suggest several strategies for using action knowledge to guide operator behaviors, discovering the factors other than experience that influence strategy adoption will require further investigation.

To advance theory and increase generalizability, a number of future research directions are suggested. With respect to the explanatory theory, future research could productively focus on theoretical replication across cases. Specifically, additional empirical work is needed to examine similarities and differences across organizations within the same industry and across a wide spectrum of processing industries. These steps would advance our understanding of the distinctive characteristics that may be uniquely tied to specific industries or markets, thus helping to build a more substantive theory. Likewise, more work is needed to explore the influences on and relationships between specific core categories of strategy and characteristics at the organizational level. Another key area for investigation is determining how knowledge is managed in organizations to influence operators to adopt strategies. Finally, controlled empirical studies are needed to explore outcome variables such as the relative effectiveness of work practice based on the tentative theory suggested. Such studies would advance both theory and practice related to the management of action knowledge in the process industry.

Chapter 8

Concluding Remarks

In this final chapter, I recap the work I have done, outline my contributions in response to the original research question (and sub-questions), point out implications of the work for practice, and acknowledge limitations that can provide the basis for future work.

8.1 Returning to the Research Question(s)

In this dissertation, I reported research to answer the question: How should explicit and tacit action knowledge (at both the individual and the organizational levels) be leveraged and managed in the process industry? The research domain of investigation is the petrochemical industry. In the literature review, I argued that operational knowledge should be conceptualized as “Action Knowledge”, which may be tacit or explicit. I also argued that the approaches to leverage and manage this knowledge would, therefore, require both organizational and individual perspectives. I investigated three sub-questions to answer the overall research question. These sub-questions were:

1. How can explicit action knowledge embedded in procedures be extracted and chunked such that knowledge of this nature can be managed as an organizational asset?

2. Is the chunking process designed for explicit action knowledge effective?

3. What strategies do operators use to apply action knowledge – both tacit and explicit – as they engage in the performance of tasks?

The results were reported in this dissertation in the form of three research essays. Essay 1 and Essay 2 are aimed at concerns related to managing and leveraging explicit action knowledge. The research method for these two sub-questions was design science research. In Essay 1, I described a heuristic approach to extract and chunk the explicit action knowledge from standard operating procedures in the petrochemical industry. The heuristic approach contains three phases to extract and chunk explicit action knowledge. The essay describes the outcomes in the form of a design science artifact consisting of contributions from kernel theories, the meta-requirements, and the meta-design for the artifact (Gregor & Jones, 2007; Takeda et al., 1990). The artifact was implemented as a research prototype. The essay also describes the architecture and component technologies used and created for the instantiation of the designed approach, named SPA, and elaborates the justificatory knowledge embedded in the artifact.

Essay 2 describes the design and outcomes of the evaluation aimed at assessing the effectiveness of the design science research output of Essay 1. The evaluation includes a demonstration of SPA, a formative evaluation to obtain user feedback, and a summative evaluation effort based on assessment by expert operators as well as measurement of outcomes such as the accuracy of extracted outcomes. The results show that the accuracy rate increases from 56% to 71% as a result of the learning mechanisms. The assessment by expert operators confirmed that the tool generates appropriate chunks (54% of the final chunks came from the outputs suggested by SPA). The evaluation provided a preliminary indication of the effectiveness of the extraction and chunking process.

In Essay 3, I investigated the strategies operators use to apply action knowledge from all different sources (even as they follow the instructions, i.e., action knowledge codified in the operating procedures) to perform tasks in the petrochemical industry. I used the grounded theory method for this investigation. The goal in this research was to advance our understanding of action knowledge possessed and utilized by human individuals. An outcome of the data analysis was a new model that explicates how operators use action knowledge, both tacit and explicit, to guide their behaviors. It also shows how operators with different experience levels behave differently as they draw upon these sources of knowledge. The analysis demonstrates that both tacit and explicit action knowledge are brought together at the point of work by operators to carry out their tasks. The findings have the potential for integration across essays, e.g., operator strategies for using action knowledge in procedures (Essay 3) could be supported by the SPA approach (Essays 1 and 2).

8.2 Contributions

The contributions of this research come from both theoretical and practical aspects. For each essay, I claim several potential contributions.

In Essay 1, the work presented can be viewed in the context of the guidelines suggested for design science research (Hevner et al., 2004). A design theory is developed via a heuristic approach to extracting and chunking action knowledge from procedures in the petrochemical industry. Design requirements are derived from different aspects, including the requirements of the industry domain and kernel theories from the field of action knowledge, as well as related literature on human cognition. In particular, the work produces two inter-related artifacts. The first is a heuristic approach to extract and chunk the explicit organizational knowledge from standard operating procedures. The second artifact is a software instantiation named SPA, which implements this heuristic approach. The artifacts expose fundamental properties of each procedure instruction to reveal the structure inherent in each, and leverage these properties to extract action knowledge chunks. The design theory is then evaluated and the results are reported in Essay 2.

There are some benefits of the design theory and its evaluation. First, it is an application that demonstrates the effectiveness of design theory guidelines and principles (Gregor & Jones, 2007; Takeda et al., 1990). Second, the design theory is based on existing concepts. It shows the benefits of reusing existing approaches and technologies in a new domain to solve new problems. In this essay, a heuristic approach, part-of-speech tagging, and a learning mechanism are leveraged because of their roots in computer science research and their successful application to natural language processing (Charniak, 1997). Third, multiple evaluation approaches (Cleven et al., 2009; Hevner et al., 2004) are used to demonstrate the utility and efficacy of the design theory and users’ satisfaction with it.

In addition, my study contributes to the understanding of operators’ strategies for applying action knowledge. The result advances our theoretical understanding of human behaviors from the perspective of action knowledge by shedding light on two issues. First, it shows that the action knowledge used to support operator behaviors may come from various sources and in different formats. This understanding allows us to contribute new lessons, which may be used to build theory related to managing action knowledge in organizations. Second, it provides a new research orientation for studies of operator behaviors.

In practice, first, the SPA tool enables the management of action knowledge embedded in procedures at the organizational level as a knowledge asset. Similar to a database, the knowledge base containing chunks extracted by SPA can be free of redundancies and easier to store, access, and manipulate. Second, the action knowledge chunks can be used to deliver meaningful operational knowledge to support each operator’s task performance at the point of action. The action knowledge chunks can provide more opportunities for tailoring the presentation to operators with different expertise levels. Third, the storing of action knowledge chunks suggests the potential for reusing codified action knowledge to generate new procedures when operations change, while maintaining trace information and consistency across procedures. Fourth, the action knowledge chunks can be examined to determine the most frequent or the most critical action knowledge, which may then be incorporated into training programs to hasten the move from novice to expert operator. Overall, these benefits suggest the potential to enhance the use, reuse, and management of procedures by processing organizations, improve operational efficiency by making appropriate knowledge available to the operators, and promote the growth of novice operators to expert status.

In addition, the strategies for applying various sources of action knowledge discovered in this study make it evident that the operators drew on action knowledge in a dynamic way. The patterns according to which operators used action knowledge are influenced by several factors, of which the extent of operator experience is the one found in this study. Given the influence of this factor, industry players could improve training and mentoring programs based on the strategy-adoption patterns and preferences of novice and expert operators as appropriate to meet related goals.

8.3 Limitations

I acknowledge that, like any research project, this research has some limitations. The limitations come from two major aspects: the selection of guiding theories and scope, and the implementation of each research essay.

The fundamental theory that underpins this study is knowledge and knowledge management. I follow the action view of knowledge, and adopt two research perspectives of knowledge: tacit vs. explicit, and individual vs. organizational. The limitations I state arise from these selections. First, the emphasis on action knowledge, which answers “how” questions, may lead to the neglect of declarative knowledge, which addresses “what” and “why” questions. Another approach may be to consider both action knowledge and declarative knowledge and how they may be integrated to support task performance in processing industries.

Second, there are multiple classifications of knowledge, summarized in the literature review chapter. Although two of them (tacit vs. explicit and individual vs. organizational) are adopted as the research perspectives in this study, other knowledge classifications may still provide alternative approaches to investigate action knowledge, and could be potential new research perspectives. Third, for the research perspective of individual vs. organizational levels, the research objects are defined as individual operators and refinery organizations. However, there are organizational levels other than the individual and the organization, such as operating teams, that are candidate research objects for further research. Fourth, I used the petrochemical industry as the research setting. Although it provided me with a rich domain for research, it may be possible to question the generalizability of the findings in this study across domains and industry sectors. More specific limitations can be identified as follows.

Limitations arising from the implementation of the research are discussed in the individual essays. In Essay 1, the limitations include the following: the heuristic approach may rely too heavily on subject-matter experts to create a taxonomy; the heuristics are not comprehensive; and a focus on action knowledge structured as predicates, subjects, objects, and conditions may lead operators to ignore descriptive or peripheral knowledge that may be implicitly embedded in instructions. A possible solution could be to supplement the action knowledge with a conceptual map of the refinery, the rationale for descriptive non-action knowledge, and similar task implementation experience from experts (tacit individual action knowledge) that enhances the operators’ understanding. These remain part of future research.

In Essay 2, the limitations include the following. First, the training procedures could be obtained from a larger spectrum of organizations within the petrochemical industry. This will, however, present problems related to different formats and challenges related to pre-processing. Second, because of limitations on time and resources, participation from multiple domain experts was difficult to obtain. I attempted to overcome this based on participation from multiple operators. As the SPA tool is more widely implemented and tested in various petrochemical companies in the future, more testing data can be obtained to increase the generalizability of the evaluation results.

In Essay 3, the limitations include the following. First, it may be possible to critique the use of interviews as the primary data collection method. Participant observations are, however, difficult to accomplish in refineries where access is limited. Second, although I explored the influence of operator experience on behaviors, there may be other factors that impact operator behaviors. These may need further exploration.

8.4 Implications for Future Research

This study provided an investigation of the action view of knowledge, dealing with the knowledge classifications of tacit vs. explicit, and individual vs. organizational. The implications for future research, therefore, focus on the knowledge and knowledge management domain. From the findings in this study, I suggest four directions for future work: (1) exploring and improving more refined chunking strategies for the knowledge contained in operating procedures, (2) comparing SPA with other tools used for similar purposes, and exploring how it may help in other industries, (3) exploring and verifying action strategies adopted by operators to carry out operator behaviors, and (4) linking action knowledge in different formats, and connecting action knowledge and declarative knowledge to support task performance.

As discussed in the Limitations section, the first direction is to extend the chunking strategy of this study. First, given the importance of knowledge chunks in operators’ task-performing process, the appropriate amount of action included in a knowledge chunk could be decided by considering the limits of operators’ memory and execution capacity (Baddeley, 1992; Wouters et al., 2008).

Second, future work could track the effects of reusing action knowledge chunks. The current heuristics generate knowledge chunks from individual procedures. The SPA tool provides a function for discovering common instructions across multiple procedures. Therefore, creating new knowledge chunks from common actions could be a possible next step.

The second direction is to supplement the evaluation of SPA, for which two additional issues require effort in the future. The first is to compare SPA with other systems with a similar purpose. For example, there are several procedure management systems in the healthcare industry that decompose pathways and diagnose procedures based on events (DeBusk, Cofer, Shanks, & Lukens, 1999). Diagnostic systems are also applied in the nuclear industry; these use knowledge engineering techniques to diagnose failures in detail and provide advice to operators (Kim, 1994). I have mentioned three example procedure management systems currently adopted in process industries: SharePoint (Gilbert et al., 2009; Oleson, 2007), PolicyStat (Hall, 2014), and OnPolicy (Anderson, 2013). It is possible to compare SPA with these systems for functionality and feasibility. A corollary to this is the second issue: exploring the possibility of using SPA in different industries and organizations. The potential industries other than the process industry include the healthcare industry, the nuclear power industry, emergency response, etc. SPA may be suitable for these industries and organizations because they are similar to the petrochemical industry in many ways: (1) action knowledge plays an important role in directing behaviors in these industries (Rochlin, La Porte, & Roberts, 1987); (2) there are several procedures containing action knowledge in these industries (Baker, Day, & Salas, 2006; Bigley & Roberts, 2001); and (3) a system solution is required to extract and manage action knowledge from procedures (Roberts, Bea, & Bartles, 2001; Roberts & Rousseau, 1989).

The third research aim of this project was to explore tacit action knowledge. The implications for future research extending this work suggest two directions. First, the outcomes of this investigation reveal that operator experience is one factor that impacts operator behaviors. In future research, efforts could be spent to discover factors other than operator experience. Possible factors include education level, working roles, personalities of operators, organizational environment, etc. Second, as mentioned above, the outcomes of this research are grounded in interviews. It may be possible to replicate these and add triangulation to corroborate these conclusions.

The results of my analysis highlight how action knowledge (explicit and tacit) may be managed. First, the continued enhancement of action knowledge extraction from both procedures and operators is critical to a better understanding of the relationships between explicit organizational and tacit individual action knowledge. It is important to continue efforts to investigate how the two formats of action knowledge are transferred and integrated to support operator behaviors. Second, although the two perspectives of knowledge introduce four formats of action knowledge, only two of them are studied in this research (see Figure 3-1). I believe that it will be important to develop new studies on the other two forms: explicit individual action knowledge and tacit organizational action knowledge. Third, the wide range of current knowledge management research on action knowledge provides a strong basis for establishing the link between action knowledge and declarative knowledge. It is necessary to explore how action knowledge and declarative knowledge work together to support actions.

8.5 Implications for Use in Practice

I discuss the implications of this study by first addressing the research domain, the petrochemical industry, and then extending the discussion to other processing industries.

8.5.1 For Petrochemical Industry

The implications of this study for the petrochemical industry are considered based on feedback from industry experts, and include the following aspects: (1) improving the SPA tool to fit refineries, (2) exploring possibilities for improving procedures, and (3) exploring the training of novice operators for specific behaviors.

First, the current SPA tool enables users to separate different parts of procedures, pre-process and extract fundamental properties of each instruction, cluster instructions into knowledge chunks, and display a graph of chunk commonalities. In addition, the tool encourages user participation and contribution, such as making modifications during the chunking process, comparing commonality across multiple knowledge chunks, and storing and searching procedures by refinery, task, or equipment location. To adapt this tool to specific refineries, existing limitations and barriers to development must be addressed. First, domain expertise must be obtained to create ontologies that support the heuristics. This domain expertise includes checking or waiting interruptions, equipment locations in the plant, and operator role lists. Second, to customize the tool for each refinery, the tool must be trained on existing procedures from that refinery. Based on the evaluation in Chapter 6, the accuracy of pre-processing stabilizes after training on 50 procedures. Before SPA is launched in a refinery, a training process following the same procedure is needed. Third, I recommend efforts on the presentation and visualization of knowledge chunks. So far, the heuristic-based tool successfully delivers knowledge chunks extracted from procedures. Based on these knowledge chunks, a tailored view of a procedure can be created, according to operators’ experience, background, and pre-defined settings, to present different levels of detail. For example, a broad view of a procedure contains only the names of knowledge chunks, while a detailed view presents the instructions for every single action. Operators can choose the view of a procedure when consulting it in the tool. Fourth, integrating SPA with existing systems in refineries is a concern. Monitoring and control systems are used in petrochemical refineries, and operators are able to obtain real-time operational data and change the status of equipment via these systems. A good application of SPA could be to connect the procedure chunking with the operation systems and to trigger related procedures based on real-time operational data and equipment status.
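The broad and detailed views of a procedure recommended above can be illustrated with a minimal sketch. This is not part of the SPA prototype: the names `KnowledgeChunk` and `render_procedure`, the numbering format, and the example chunk contents are assumptions introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KnowledgeChunk:
    """A named group of procedure instructions (hypothetical structure)."""
    name: str
    instructions: List[str] = field(default_factory=list)


def render_procedure(chunks: List[KnowledgeChunk], view: str = "broad") -> List[str]:
    """Return the lines of a procedure in a broad or a detailed view.

    The broad view lists chunk names only; the detailed view also lists
    every instruction under its chunk.
    """
    if view not in ("broad", "detail"):
        raise ValueError("view must be 'broad' or 'detail'")
    lines = []
    for i, chunk in enumerate(chunks, start=1):
        lines.append(f"{i}. {chunk.name}")
        if view == "detail":
            for j, instruction in enumerate(chunk.instructions, start=1):
                lines.append(f"   {i}.{j} {instruction}")
    return lines


# Illustrative data only; not taken from any refinery procedure.
example = [
    KnowledgeChunk("Isolate pump", ["Close suction valve", "Close discharge valve"]),
    KnowledgeChunk("Drain pump", ["Open drain valve"]),
]
```

Under this sketch, an expert operator might default to `render_procedure(example)` and a novice to `render_procedure(example, "detail")`; how the default is chosen from experience or settings is left open here.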

Beyond the implications for the procedure chunking tool, this study of action knowledge management provides opportunities to improve procedures in the petrochemical industry. First, the procedure creation process can be improved. Currently, senior operators and engineers write down the whole process of how to perform a certain task, and then put it into a procedure template defined by the refinery. This process would change with the application of SPA in refineries. Once the knowledge chunks have been extracted, they are stored in a knowledge base with chunk names. Procedure creators can easily search and reuse existing chunks when the same action knowledge is used in new procedures, instead of generating procedure instructions line by line. Second, revising procedures by extracting and comparing similar action knowledge chunks is another implication of this study. With the comparison function of SPA, users are able to find action knowledge chunks with high similarity. The differences among similar action knowledge chunks suggest possible action instructions that are overlooked or inaccurate. The inappropriate procedures can then be modified based on these extracted similar knowledge chunks. Third, the tacit action knowledge study in Essay 3 provides opportunities to extract operator expertise and codify it into procedures. When creating new procedures, operators or engineers can use the critical incident technique to recall the task-performing process in real situations.
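The idea of comparing similar action knowledge chunks to surface overlooked instructions can be sketched as follows. The similarity measure used here (Jaccard overlap of exact instruction strings) is an assumption chosen for simplicity; it is not claimed to be the measure implemented in SPA, and the function names are hypothetical.

```python
from typing import Dict, List


def jaccard(first: List[str], second: List[str]) -> float:
    """Jaccard similarity between two chunks, treated as sets of instructions."""
    a, b = set(first), set(second)
    if not a and not b:
        return 1.0  # two empty chunks are treated as identical
    return len(a & b) / len(a | b)


def compare_chunks(first: List[str], second: List[str]) -> Dict[str, object]:
    """Report the similarity of two chunks and the instructions unique to each.

    Instructions present in only one of two highly similar chunks are
    candidates for review as possibly overlooked or inaccurate steps.
    """
    a, b = set(first), set(second)
    return {
        "similarity": jaccard(first, second),
        "only_in_first": sorted(a - b),
        "only_in_second": sorted(b - a),
    }
```

For example, comparing a chunk containing "open valve V1", "check pressure", and "start pump P1" with a chunk containing only the first and last of these would flag "check pressure" as a step for a procedure writer to review. In practice, instructions would likely need normalization before exact string matching is meaningful.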

Furthermore, this study informs training strategies for operators. First, refineries should pay attention to all formats of action knowledge that support operators’ action strategies for carrying out operator behaviors. Action knowledge that is difficult to acquire or can easily be ignored should be emphasized in the training process. Second, factors that affect operators’ action strategies, such as experience, should also be taken into account by processing industries when training operators. Considering the different preferences of novice and expert operators in adopting action knowledge to guide operator behaviors, it is necessary to design specific training plans for operators with different levels of experience. Third, how an operator connects practice and procedures could be a criterion for judging whether the training of a processing organization is effective and how mature the operator is. A good training plan should encourage operators to transfer knowledge in procedures into their own understanding and practice.

8.5.2 For Other Process Industries

The findings of this research also have implications for other processing industries. The implications fall into two aspects: extensions of SPA and the action strategies of operators found in this study.

Procedures play an important role in all process industries in instructing operator behaviors. The SPA platform could be reused to manage procedures after careful modifications. Modifications to the tool include redesigning the chunking heuristics with domain knowledge and consideration of the characteristics of the target industry, training with appropriate and sufficient procedures from the target process industry, and integrating SPA with existing management and control systems. In addition, the action strategies that guide operator behaviors could be revised to adapt to new processing industries. Although some details may be slightly different, the action strategies of referring to eight formats of tacit and explicit action knowledge in the petrochemical industry could be the foundation for building new strategies in other processing industries. The research methods and investigation processes in this research could also be referred to when designing research in new processing industries.

References

Ackoff, R. L. 1989. From data to wisdom. Journal of Applied Systems Analysis, 16: 3-9.
Ahn, H. J., Lee, H. J., Cho, K., & Park, S. J. 2005. Utilizing knowledge context in virtual collaborative work. Decision Support Systems, 39(4): 563-582.
Alavi, M., & Leidner, D. E. 1999. Knowledge management systems: Issues, challenges, and benefits. Communications of the AIS, 1(2): 1.
Alavi, M., & Leidner, D. E. 2001. Review: Knowledge management and knowledge management systems: Conceptual foundations and research issues. MIS Quarterly, 25(1): 107-136.
Anderson, C. 2013. Bizmanualz New OnPolicy(tm) Procedure Management Software Designed for Speed, Vol. 2015: http://www.prweb.com/releases/2013/2018/prweb10994067.htm. Saint Louis, Missouri: http://www.prweb.com/releases/2013/8/prweb10994067.htm.
Anderson, J. R. 1976. Language, memory, and thought. Hillsdale, N.J.; New York: L. Erlbaum Associates; Distributed by the Halsted Press Division of Wiley.
Anderson, J. R. 2009. Cognitive psychology and its implications. New York: Worth.
Apostolou, D., Mentzas, G., Klein, B., Abecker, A., & Maass, W. 2008. Interorganizational knowledge exchanges. Intelligent Systems, IEEE, 23(4): 65-74.
Argyris, C., & Schon, D. A. 1978. Organizational learning. MA: Addison-Wesley.
Attwood, D., & Fennell, D. 2001. Cost-effective human factors techniques for process safety, CCPS International Conference and Workshop. Toronto, Canada.
Baddeley, A. 1992. Working memory. Science, 255(5044): 556-559.
Baker, C. L. 1989. English syntax. Cambridge, Mass.: MIT Press.
Baker, D. P., Day, R., & Salas, E. 2006. Teamwork as an essential component of high-reliability organizations. Health Services Research, 41(4p2): 1576-1598.
Barney, J. B. 1986. Strategic factor markets: Expectations, luck, and business strategy. Management Science, 32(10): 1231-1241.
Barr, P. S., Stimpert, J. L., & Huff, A. S. 1992. Cognitive change, strategic action, and organizational renewal. Strategic Management Journal, 13: 15-36.
Baskerville, R. 2008. What design science is not. European Journal of Information Systems, 17(5): 441-443.
Bera, P., & Wand, Y. 2009. A framework to clarify the role of knowledge management systems, Pacific Asia Conference on Information Systems. Hyderabad, India.
Berente, N., Baxter, R., & Lyytinen, K. 2010. Dynamics of inter-organizational knowledge creation and information technology use across object worlds: the case of an innovative construction project. Construction Management and Economics, 28(6): 569-588.
Bharathy, G. K. 2006. Agent based human behavior modeling: a knowledge engineering based systems methodology for integrating social science frameworks for modeling agents with cognition, personality and culture.
Bhatt, G. D. 2002. Management strategies for individual knowledge and organizational knowledge. Journal of Knowledge Management, 6(1): 31-39.
Bigley, G. A., & Roberts, K. H. 2001. The incident command system: High-reliability organizing for complex and volatile task environments. Academy of Management Journal, 44(6): 1281-1299.
Blosch, M. 2001. Pragmatism and Organizational Knowledge Management. Knowledge and Process Management, 8(1): 39-47.
Bonaceto, C., & Burns, K. 2007. A survey of the methods and uses of cognitive engineering. In R. R. Hoffman (Ed.), Expertise Out of Context Proceedings of the Sixth International Conference on Naturalistic Decision Making. Hoboken: Lawrence Erlbaum Associates.
Borgatti, S. 2005. Introduction to grounded theory. Retrieved May, 15: 2010.
Braganza, A., Hackney, R., & Tanudjojo, S. 2009. Organizational knowledge transfer through creation, mobilization and diffusion: a case analysis of InTouch within Schlumberger. Information Systems Journal, 19(5): 499-522.
Bramer, M. A. 2007. Principles of data mining. London: Springer.
Briscoe, G., & Caelli, T. 1996. A compendium of machine learning: Symbolic machine learning. Norwood, NJ: Ablex Pub. Corp.
Brodbeck, P. W. 2002. Complexity theory and organization procedure design. Business Process Management Journal, 8(4): 377-402.
Bryant, A., & Charmaz, K. 2007. The SAGE handbook of grounded theory. Los Angeles; London: SAGE.
Byiers, B. J., Reichle, J., & Symons, F. J. 2012. Single-subject experimental design for evidence-based practice. American Journal of Speech-Language Pathology, 21(4): 397-414.
Cai, Y. 2008. Digital human modeling trends in human algorithms. Berlin; New York: Springer.
Caldwell, B. S., & Garrett, S. K. 2007. Team-based coordination of event detection and task management in time-critical settings. Paper presented at the Proceedings of the Eighth International NDM Conference, Pacific Grove.
Card, S., Moran, T. P., & Newell, A. 1983. The psychology of human-computer interaction. Hillsdale, NJ: Erlbaum Publishing.
Charniak, E. 1997. Statistical Techniques for Natural Language Parsing. AI Magazine, 18(4): 33-43.
Chen, Z., & Cowan, N. 2009. Core verbal working-memory capacity: The limit in words retained without covert articulation. Quarterly Journal of Experimental Psychology, 62(7): 1420-1429.
Chi, M. T. H., Glaser, R., & Farr, M. J. 1988. The nature of expertise. Hillsdale, N.J.: L. Erlbaum Associates.
Chia, R. 2003. From knowledge-creation to the perfecting of action: Tao, Basho and pure experience as the ultimate ground of knowing. Human Relations, 56(8): 953-981.
Chisholm, R. M. 1982. The foundations of knowing. Minneapolis: University of Minnesota Press.
Christopher, M., & Gaudenzi, B. 2009. Exploiting knowledge across networks through reputation management. Industrial Marketing Management, 38(2): 191-197. Church, K. W., & Mercer, R. L. 1993. Introduction to the special issue on computational linguistics using large corpora. Comput. Linguist., 19(1): 1-24. Clark, R. E., & Estes, F. 1996. Cognitive task analysis for training. International Journal of Educational Research, 25(5): 403-417. Clarke, L., & Short, J. F. 1993. Social Organization and Risk: Some Current Controversies. Annual Review of Sociology, 19(1): 375-399. Cleven, A., Gubler, P., & Kai, M. H. 2009. Design Alternatives for the Evaluation of Design Science Research Artifacts, Proceedings of the 4th International Conference on Design Science Research in Information Systems and Technology. Philadelphia, Pennsylvania. Coakes, E. W., Coakes, J. M., & Rosenberg, D. 2008. Co-operative work practices and knowledge sharing issues: a comparison of viewpoints. International Journal of Information Management, 28(1): 12-25. Cohen, D., & Prusak, L. 1996. British petroleum's virtual teamwork program: Ernst & Young Center for Business Innovation. Cohen, E. B. 2006. Information universe : Issues in informing science and information technology. Santa Rosa, CA: Informing Science Press.

217 Cole, R., Purao, S., Rossi, M., & Sein, M. K. 2005. Being proactive: Where action research meets design research. Paper presented at the International Conference on Information Systems. Connell, P. J., & Thompson, C. K. 1986. Flexibility of single-subject experimental . Part III Using flexibility to design or modify experiments. Journal of Speech and Hearing Disorders, 51(3): 214-225. Cook, S. D. N., & Brown, J. S. 1999. Bridging epistemologies: The generative dance between organizational knowledge and organizational knowing. Organization Science, 10(4): 381-400. Cooke, N. J. 1992. The implications of cognitive task analyses for the revision of the Dictionary of Occupational Titles. In W. J. Camara (Ed.), Implications of Cognitive Psychology and Cognitive Task Analysis for the Revision of the Dictionary of Occupational Titles. Washington D.C.: American Psychological Association. Cooper, S., Treuille, A., Barbero, J., Leaver-Fay, A., Tuite, K., Khatib, F., Snyder, A. C., Beenen, M., Salesin, D., Baker, D., Popovi, Z., & #263. 2010. The challenge of designing scientific discovery games, Proceedings of the Fifth International Conference on the Foundations of Digital Games: 40-47. Monterey, California: ACM. Corbett, A. T., & Anderson, J. R. 1994. Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4(4): 253-278. Cranor, L. F. 2008. A framework for reasoning about the human in the loop. Paper presented at the the 1st Conference on Usability, Psychology, and Security, San Francisco, California. Davenport, T. H., & Prusak, L. 1998. Working knowledge : How organizations manage what they know. Boston, MA: Harvard Business School Press. De, S., Dhar, A., Biswas, S., & Garain, U. 2011. On Development and Evaluation of a Chunker for Bangla. Paper presented at the 2011 Second International Conference on Emerging Applications of Information Technology (EAIT). DeBusk, B. C., Cofer, M. C., Shanks, M. W., & Lukens, W. 
F. 1999. Modular health-care information management system utilizing reusable software objects. In I. GE Medical Systems Information Technologies (Ed.), US5995937 A. US: Deroyal Industries, Inc. Dreyfus, H. L., Dreyfus, S. E., & Zadeh, L. A. 1987. Mind over machine: The power of human intuition and expertise in the era of the computer. IEEE Expert, 2(2): 110-111. Drucker, P. 1992. The new society of organizations. Harvard Business Review, 70(5): 95-104. Eisenhardt, K. M. 1989. Building theories from case study research. The Academy of Management Review, 14(4): 532-550. Endsley, M. R., & Rodgers, M. D. 1994. Situation awareness information requirements analysis for en route air traffic control. Human Factors and Ergonomics Society Annual Meeting Proceedings, 38: 71-75. Eraut, M. 2000. Non-formal learning and tacit knowledge in professional work. British Journal of Educational Psychology, 70(1): 113-136. Erickson, L. B. 2013. Hanging with the right crowd: Crowdsourcing as a new business practice for innovation, productivity, knowledge capture, and marketing: THE PENNSYLVANIA STATE UNIVERSITY. Ericsson, K. A., & A., S. H. 1993. Protocol analysis verbal reports as data. Cambridge, MA: MIT Press. Ericsson, K. A., & Simon, H. A. 1980. Verbal reports as data. Psychological Review, 87(3): 215- 251. Flanagan, J. C. 1954. The critical incident technique. Psychological bulletin, 51(4): 327-358. Freund, A. M., & Baltes, P. B. 2000. The orchestration of selection, optimization and compensation: An action-theoretical conceptualization of a theory of developmental

218 regulation. Control of human behavior, mental processes, and consciousness: Essays in honor of the 60th birthday of August Flammer: 35-58. Freund, Y., & Schapire, R. E. 1996. Experiments with a new boosting algorithm. Paper presented at the ICML. Friedman, K. 2003. Theory construction in design research: criteria: approaches, and methods. , 24(6): 507-522. Gao, X., & Sterling, L. 2001. Knowledge-based information agents. In R. Kowalczyk, S. Loke, N. Reed, & G. Williams (Eds.), Advances in Artificial Intelligence, Vol. 2112: 229-238: Springer Berlin / Heidelberg. Gasser, L. 1991. Social conceptions of knowledge and action: DAI foundations and open systems semantics. Artificial Intelligence, 47: 107-138. Gasson, S. 2004. Qualitative field studies. The handbook of information systems research: 79. Gasson, S., & Waters, J. 2013. Using a grounded theory approach to study online collaboration behaviors. European Journal of Information Systems, 22(1): 95-118. Gediga, G., Hamborg, K.-C., & Düntsch, I. 2002. Evaluation of software systems. Encyclopedia of computer science and technology, 45(supplement 30): 127-153. Geffner, H., & Wainer, J. 1998. Modeling action, knowledge and control. Paper presented at the 13th European Conference on Artificial Intelligence, Brighton, UK. Georgeff, M. P., & Lansky, A. L. 1986. Procedural knowledge. Proceedings of the IEEE, 74(10): 1383-1398. Gersick, C. J. G. 1990. Habitual routines in task-performing groups. Organizational Behavior and Human Decision Processes, 47(1): 65-97. Gilbert, M. R., Shegda, K. M., Phifer, G., & Mann, J. 2009. SharePoint 2010 Is Poised for Broader Enterprise Adoption: Gartner. Glaser, B. G. 1978. Theoretical sensitivity: Advances in the methodology of grounded theory: Sociology press Mill Valley, CA. Glaser, B. G. 1998. Doing grounded theory: Issues and discussions: Sociology Press. Gold, A. H., Malhotra, A., & Segars, A. H. 2001. Knowledge management: An organizational capabilities perspective. 
Journal of Management Information Systems, 18(1): 185-214. Goldkuhl, G. 1999. The grounding of usable knowledge: An inquiry in the epistemology of action knowledge, CMTO Research Papers, Vol. 3. Linköping, Sweden. Graesser, A. C., Woll, S. B., Kowalski, D. J., & Smith, D. A. 1980. Memory for typical and atypical actions in scripted activities. Journal of Experimental Psychology: Human Learning and Memory, 6(5): 503-515. Grant, E., & Gregory, M. 1997. Tacit knowledge, the life cycle and international manufacturing transfer. Technology Analysis & Strategic Management, 9(2): 149-162. Grant, R. M. 1996. Toward a knowledge-based theory of the firm. . Strategic Management Journal 17: 109-122. Gregor, S. 2006. The nature of theory in information systems. MIS Quarterly, 30(3): 611-642. Gregor, S., & Jones, D. 2007. The anatomy of a design theory. Journal of the Association for Information Systems, 8(5): 312-335. Ha, D., Minjung, K., Wade, A., Chao, W. O., Ho, K., Kaastra, L., Fisher, B., & Dill, J. 2007. From tasks to tools: A field study in collaborative visual analytics. Paper presented at the IEEE Symposium on Visual Analytics Science and Technology, 2007. VAST 2007. Haavisto, O., & Remes, A. 2010. Data-based skill evaluation of human operators in process industry. Paper presented at the 2010 International Conference on Control Automation and Systems (ICCAS). Hall, E. M., Gott, S. P., & Pokorny, R. A. 1995. A procedural guide to cognitive task analysis: the PARI methodology.: U.S. Air Force.

219 Hall, J. 2014. Health Tech Award winner PolicyStat changes the way organizations capture and disseminate knowledge for better outcomes: http://techpoint.org/policystat/. Hamilton, S., & Chervany, N. L. 1981. Evaluating information system effectiveness - Part I: Comparing evaluation approaches. MIS Quarterly, 5(3): 55-69. Han, K. H., & Park, J. W. 2009. Process-centered knowledge model and enterprise ontology for the development of knowledge management system. Expert Systems with Applications, 36(4): 7441-7447. Hansen, M. T., Nohria, N., & Tierney, T. 1999. What's your strategy for managing knowledge? Harvard Business Review, 77(2): 106-116. Harnesk, D., & Thapa, D. 2013. A framework for classifying design research methods. Design Science at the Intersection of Physical and Virtual Design, Lecture Notes in Computer Science, 7939: 479-485. Hasher, L., & Zacks, R. T. 1984. Automatic processing of fundamental information: The case of frequency of occurrence. American Psychologist, 39(12): 1372-1388. Haunschild, P. R., & Mooweon, R. 2004. The role of volition in organizational learning: the case of automotive product recalls. Management Science, 50(11): 1545-1560. Hawthorne, J., & Stanley, J. 2008. Knowledge and action. The Journal of Philosophy, 102: 571- 590. He, J., Purao, S., Becker, J., & Strobhar, D. 2011. Service extraction from operator procedures in process industries. In H. Jain, A. Sinha, & P. Vitharana (Eds.), Service-Oriented Perspectives in Design Science Research, Vol. 6629: 335-349: Springer Berlin / Heidelberg. Heath, H., & Cowley, S. 2004. Developing a grounded theory approach: a comparison of Glaser and Strauss. International Journal of Nursing Studies, 41(2): 141-150. Helfat, C., & Raubitschek, R. 2000. Product sequencing : Co-evolution of knowledge, capabilities and products. Chichester, ROYAUME-UNI: Wiley. Henderson, J., & Cool, K. 2003. 
Learning to time capacity expansions: An empirical analysis of the worldwide petrochemical industry, 1975-95. Strategic Management Journal, 24(5): 393-413. Henry, R. M., McCray, G. E., Purvis, R. L., & Roberts, T. L. 2007. Exploiting organizational knowledge in developing IS project cost and schedule estimates: An empirical study. Information & Management, 44(6): 598-612. Hevner, A. R. 2007. A three cycle view of design science research. Scandinavian Journal of Information Systems, 19(2): 4. Hevner, A. R., March, S. T., Park, J., & Ram, S. 2004. Design Science in Information Systems Research. MIS Quarterly, 28(1): 75-105. Hevner, A. R. C. S. 2010. Design research in information systems : theory and practice. New York; London: Springer. Hoffman, R. R., Crandall, B., & Shadbolt, N. 1998. Use of the critical decision method to elicit expert knowledge: a case study in the methodology of cognitive task analysis. Human Factors: The Journal of the Human Factors and Ergonomics Society, 40(2): 254-276. Holcomb, T. R., Ireland, R. D., Holmes, R. M., & Hitt, M. A. 2009. Architecture of entrepreneurial learning: exploring the link smong heuristics, knowledge, and action. Entrepreneurship Theory and Practice, 33(1): 167-192. Holsanova, J., Holmberg, N., & Holmqvist, K. 2009. Reading information graphics: The role of spatial contiguity and dual attentional guidance. Applied Cognitive Psychology, 23(9): 1215-1226.

220 Homsma, G. J., Van Dyck, C., De Gilder, D., Koopman, P. L., & Elfring, T. 2009. Learning from error: The influence of error incident characteristics. Journal of Business Research, 62(1): 115-122. Hoon, J., & Derick, S. 1994. Social knowledge as a control system: A proposition and evidence from the Japanese FDI behavior. Journal of International Business Studies, 25(2): 295- 324. Huang, Z., Chen, H., Guo, F., Xu, J. J., Wu, S., & Chen, W.-H. 2006. Expertise visualization: An implementation and study based on cognitive fit theory. Decision Support Systems, 42(3): 1539-1557. Isermann, R. 2006. Fault diagnosis systems an introduction from fault detection to fault tolerance. Berlin; Heidelberg; New York: Springer. Jamieson, G. A. 2002. Empirical evaluation of an industrial application of ecological interface design. Human Factors and Ergonomics Society Annual Meeting Proceedings, 46(536- 540). Jansen, S. 2010. ServiciFi: Service Extraction from Decomposed Software Monoliths in The Financial Domain. http://servicifi.files.wordpress.com/2010/06/serviceextraction.pdf. Jori, A. 2003. Aristotele. Milan: B. Mondadori. Ju, T. L. 2006. Representing organizational memory for computer-aided utilization. Journal of Information Science, 32(5): 420. Kaelbling, L. P., Littman, M. L., & Moore, A. W. 1996. Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4: 237-285. Kaempf, G. L., Klein, G., Thordsen, M. L., & Wolf, S. 1996. Decision making in complex naval command-and-control environments. Human Factors: The Journal of the Human Factors and Ergonomics Society, 38(2): 220-231. Kaplan, B., & Maxwell, J. A. 1994. Qualitative research methods for evaluating computer information systems. In J. G. Anderson, C. E. Aydin, & S. J. Jay (Eds.), Evaluating Health Care Information Systems: Methods and Applications: 45 - 68. Thousand Oaks, CA: Sage. Kim, I. S. 1994. 
Computerized systems for on-line management of failures: a state-of-the-art discussion of alarm systems and diagnostic systems applied in the nuclear industry. & System Safety, 44(3): 279-295. Klein, G. A., & Calderwood, R. 1991. Decision models: Some lessons from the field. Systems, Man and Cybernetics, IEEE Transactions on, 2(5): 1018-1026 Klein, G. A., & Hoffman, R. R. 1993. Seeing the invisible: Perceptual-cognitive aspects of expertise. In M. Rabinowitz (Ed.), Cognitive science foundations of instruction: 203- 226. Hillsdale, NJ, England: Lawrence Erlbaum Associates, Inc. Klein, G. A., & Klinger, D. 1991. Naturalistic decision making. Human Systems IAC GATEWAY, 2(1): 16-19. Kletz, T. 2001. An engineer's view of human error. Rugby [England]: Institution of Chemical Engineers. Klyne, G., & Carroll, J. J. 2006. Resource description framework (RDF): Concepts and abstract syntax. Kobus, D. A., Proctor, S., Bank, T. E., & Holste, S. T. 2000. Decision-making in a dynamic environment: The effects of experience and information uncertainty. Ft. Belvoir: Defense Technical Information Center. Koch, I., Philipp, A. M., & Gade, M. 2006. Chunking in task sequences modulates task inhibition. Psychological Science, 17(4): 346-350. Koepsell, D. R. 1999. Invited symposium on applied ontology in the social sciences. American Journal of Economics and Sociology, 58(2): 217-220.

221 Konik, T., & Laird, J. E. 2002. Hierarchical procedural knowledge learning through observation using inductive logic programming: http://www.site.uottawa.ca/~stan/ilp2002/wippprs/konik_ilp2002.pdf. Kornfeld, W. A., & Hewitt, C. E. 1981. The scientific community metaphor. Systems, Man and Cybernetics, IEEE Transactions on 11(1): 24-33. Krogstie, J., Sindre, G., & Jorgensen, H. 2006. Process models representing knowledge for action: A revised quality framework. European Journal of Information Systems, 15: 91- 102. Kucza, T. 2001. Knowledge management process model. Otamedia Oy, Espoo, Finland: Technical Research Center of Finland (VTT). Kuechler, B., & Vaishnavi, V. 2008. On theory development in design science research: anatomy of a research project. European Journal of Information Systems, 17(5): 489-504. Lam, L. T., & Kirby, S. L. 2002. Is emotional intelligence an advantage? An exploration of the impact of emotional and general intelligence on individual performance. The Journal of Social Psychology, 142(1): 133-143. Lapre, M. A., & Tsikriktsis, N. 2006. Organizational learning curves for customer dissatisfaction: heterogeneity across airlines. Management Science, 52(3): 352-366. Larkin, J. H. 1983. The role of problem representation in physics. In D. Gentner, & A. L. Stevens (Eds.), Mental Models. Hillsdale, N.J.: Erlbaum. Lee, J. S., & Hsu, P. L. 2003. Remote supervisory control of the human-in-the-loop system by using Petri nets and Java. IEEE Transactions on Industrial Electronics, 50: 431-439. Lehrer, K., & Paxson, T., Jr. 1969. Knowledge: Undefeated justified true belief. The Journal of Philosophy, 66(8): 225-237. Lesperance, Y., & Levesque, H. J. 1995. Indexical knowledge and robot action - A logical account. Artificial Intelligence, 73: 69-115. Logan, G. D. 2004. Working memory, task switching, and executive control in the task span procedure. Journal of Experimental Psychology: General, 133(2): 218-236. Madsen, E. S., & Mikkelsen, L. L. 2012. 
Productivity? Domain complexity vs. tacit knowledge, Det Danske Ledelsesakademi. Copenhagen. Madsen, P. M. 2009. These lives will not be lost in vain: Organizational learning from disaster in U.S. coal mining. Organization Science, 20(5): 861-875. Maiti, S., Garain, U., Dhar, A., & De, S. 2013. A novel method for performance evaluation of text chunking. Language Resources and Evaluation: 1-12. Makins, M. 1991. Collins English dictionary: HarperCollins. Malhotra, Y. 2000. Knowledge management and virtual organizations. Hershey, PA: Idea Group Publishing. Manning, C. D., Raghavan, P., & Schütze, H. 2008. Introduction to information retrieval: Cambridge university press Cambridge. Manning, C. D., & Schütze, H. 1999. Foundations of statistical natural language processing: MIT press. Manning, C. D. S. H. 1999. Foundations of statistical natural language processing. Cambridge, Mass.: MIT Press. March, S. T., & Smith, G. F. 1995. Design and natural science research on information technology. Decision Support Systems, 15(4): 251-266. Matos, C. 2008. Service Extraction from Legacy Systems, Graph Transformations, Vol. 5214: 505-507: Springer, Heidelberg. Mayer, R. E., & Sims, V. K. 1994. For whom is a picture worth a thousand words? Extensions of a dual-coding theory of multimedia learning. Journal of Educational Psychology, 86(3): 389-401.

222 McLean, J. 2009. Does your organisation know what it knows? The British Journal of Administrative Management, 32. Menzies, T., Turhan, B., Bener, A., Gay, G., Cukic, B., & Jiang, Y. 2008. Implications of ceiling effects in defect predictors, Proceedings of the 4th international workshop on Predictor models in software engineering: 47-54. Leipzig, Germany: ACM. Millner, P., Cochran, T., & Bullemer, P. 1999. Central control rooms and petrochemical plants: costs and benefits, Human Interfaces in Control Rooms, Cockpits and Command Centres, 1999. International Conference on: 142-147. Bath, UK. Minner, S. 2001. Strategic safety stocks in reverse logistics supply chains. International Journal of Production Economics, 71(3): 417-428. Montoni, M., Miranda, R., Rocha, A., & Travassos, G. 2004. Knowledge acquisition and communities of practice: an approach to convert individual knowledge into multi- organizational knowledge advances in learning software organizations. In G. Melnik, & H. Holz (Eds.), Vol. 3096: 110-121: Springer Berlin / Heidelberg. Moore, R. C. 1977. Reasoning about knowledge and action. Paper presented at the Proceedings of the 5th international joint conference on Artificial intelligence, Cambridge, USA. Moore, R. C. 1985. A formal theory of knowledge and action, Formal Theories of the Commonsense World: 319-358. Muggleton, S., & de Raedt, L. 1994. Inductive logic programming: Theory and methods. The Journal of Logic Programming, 19-20(0): 629-679. Müller-Wienbergen, F., Müller, O., Seidel, S., & Becker, J. 2011. Leaving the Beaten Tracks in Creative Work – A Design Theory for Systems that Support Convergent and Divergent Thinking. Journal of the Association for Information Systems, 12(11): 714-740. Munhall, P. L. 2007. Nursing research : A qualitative perspective. Sudbury, Mass.: Jones and Bartlett. Myers, M. D. 1997. Qualitative research in information systems. MIS Quarterly, 21(2): 241-242. Narayanan, S. S. 1997. 
Knowledge-based action representations for metaphor and aspect. University of California at Berkeley, Oakland, CA. Negrotti, M. 2001. Designing the artificial: An interdisciplinary study. Design Issues, 17(2): 4- 16. Nelson, R. R., & Winter, S. G. 1982. An evolutionary theory of economic change. Cambridge, Mass.: Belknap Press of Harvard University Press. Newell, A. 1982. The knowledge level. Artificial Intelligence, 18: 87-127. Newell, A., & Simon, H. A. 1976. Computer science as empirical inquiry: Symbols and search. Communications of the ACM, 19(3): 113-126. Newman, A. J., Pancheva, R., Ozawa, K., Neville, H. J., & Ullman, M. T. 2001. An Event- Related fMRI Study of Syntactic and Semantic Violations. Journal of Psycholinguistic Research, 30(3): 339-364. Nielsen, J. 1994. Heuristic evaluation. In J. Nielsen, & R. L. Mack (Eds.), Usability Inspection Methods: John Wiley and Sons. Nielsen, J. 1998. Usability engineering. San Diego, CA: Academic Press. Nivolianitou, Z., Konstandinidou, M., & Michalis, C. 2006. Statistical analysis of major accidents in petrochemical industry notified to the major accident reporting system (MARS). Journal of Hazardous Materials, 137(1): 1-7. Noble, D. F., Boehm-Davis, D., & Grosz, C. 1986. Schema-based model of information processing for situation assessment. Arlington, VA: Office of Naval Research. Nonaka, I. 1994. A dynamic theory of organizational knowledge creation. Organization Science, 5(1): 14-37.

223 Nonaka, I. 2005. Knowledge management : Critical perspectives on business and management. London: New York: Routledge. Nonaka, I., Byosiere, P., Borucki, C. C., & Konno, N. 1994. Organizational knowledge creation theory: A first comprehensive test. International Business Review, 3(4): 337-351. Nonaka, I., & Konno, N. 1999. The concept of "Ba": Building a foundation for knowledge creation. In J. W. Cortada, & J. A. Woods (Eds.), Knowledge management yearbook 1999-2000. Boston, MA: Butterworth-Heinemann. Norese, M. 2010. An application of MACRAME to support communication and decisions in a multi-unit project. Group Decision and Negotiation, 20(1): 115-131. Oleson, J. 2007. 7 Years of SharePoint - A History Lesson, Joel Oleson's Blog - SharePoint Land (Microsoft Corporation): MSDN Blogs. Olla, P., & Holm, J. 2006. The role of knowledge management in the space industry: important or superfluous? Journal of Knowledge Management, 10(2): 3-7. Olla, P. H. J. 2006. Knowledge management in the space industry. Bradford, England: Emerald Group Pub. Orlikowski, W. J., & Baroudi, J. J. 1991. Studying information technology in organizations: Research approaches and assumptions. Information Systems Research, 2(1): 1-28. Osman, B. 1998. Classification of knowledge in Islam : A study in Islamic philosophies of science. Cambridge, U.K.: Islamic Texts Society. Paas, F., Renkl, A., & Sweller, J. 2003. Cognitive load theory and : recent developments. Educational Psychologist, 38(1): 1-4. Pang, B., Lee, L., & Vaithyanathan, S. 2002. Thumbs up?: sentiment classification using machine learning techniques. Paper presented at the Proceedings of the ACL-02 conference on Empirical methods in natural language processing-Volume 10. Pearl, J. 1984. Heuristics: Intelligent Search Strategies for Computer Problem Solving: Addison-Wesley. Perrow, C. 1999. Normal accidents : living with high-risk technologies. Princeton, N.J.: Princeton University Press. Pfeffer, J., & Salancik, G. R. 2003. 
The external control of organizations : A resource dependence perspective. Stanford, Calif.: Stanford Business Books. Phillips, J. K., Klein, G. A., & Sieck, W. R. 2004. Expertise in judgment and decision making: a case for training intuitive decision skills. In D. J. Koehler, & N. Harvey (Eds.), Blackwell handbook of judgment and decision making: 295–315. Oxford, UK: Blackwell. Phillips, P. A., & Wright, C. 2009. E-business's impact on organizational flexibility. Journal of Business Research, 62(11): 1071-1080. Puppe, F. 1998. Knowledge reuse among diagnostic problem-solving methods in the shell-kit D3. International Journal of Human-Computer Studies, 49(4): 627-649. Purao, S., Storey, V. C., & Han, T. 2003. Improving analysis pattern reuse in : Augmenting automated processes with supervised learning. Information Systems Research, 14(3): 269-290. Rasmussen, J. 1986. Information processing and human-machine interaction : an approach to cognitive engineering. New York: North-Holland. Reason, J. T. 1990. Human error. Cambridge [England]; New York: Cambridge University Press. Roberts, K. H., Bea, R., & Bartles, D. L. 2001. Must accidents happen? Lessons from high- reliability organizations. The Academy of Management Executive, 15(3): 70-78. Roberts, K. H., & Rousseau, D. M. 1989. Research in nearly failure-free, high-reliability organizations: having the bubble. Engineering Management, IEEE Transactions on, 36(2): 132-139.

224 Robey, D., & Sahay, S. 1996. Transforming work through information technology: a comparative case study of geographic information systems in county government. Information Systems Research, 7(1): 93-110. Rochlin, G. I., La Porte, T. R., & Roberts, K. H. 1987. The self-designing high-reliability organization: Aircraft carrier flight operations at sea. Naval War College Review, 40(4): 76-90. Romer, P. 1995. Beyond the knowledge worker. World Link(January/February): 56-60. Rubenstein-Montano, B., Liebowitz, J., Buchwalter, J., McCaw, D., Newman, B., & Rebeck, K. 2001. A systems thinking framework for knowledge management. Decision Support Systems, 31(1): 5-16. Sahoo, T. 2013. Process plants : shutdown and turnaround management. Boca Raton, FL: Taylor & Francis. Savage, S. L. 2009. The flaw of averages : why we underestimate risk in the face of uncertainty. Hoboken, N.J.: Wiley. Savova, G. K., Masanz, J. J., Ogren, P. V., Zheng, J., Sohn, S., Kipper-Schuler, K. C., & Chute, C. G. 2010. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. Journal of the American Medical Informatics Association, 17(5): 507-513. Schar, S. G., & Krueger, H. 2000. Using new learning technologies with multimedia. Multimedia, IEEE, 7(3): 40-51. Schinko, C., Strobl, M., Ullrich, T., & Fellner, D. 2010. Modeling procedural knowledge: A generative modeler for cultural heritage. Lecture Notes in Computer Science, 6436: 153- 165. Scholtz, J. 2004. Usability evaluation. Schon, D. A. 1987. Educating the reflective practitioner : Toward a new design for teaching and learning in the professions. San Francisco: Jossey-Bass. Schraagen, J. M., Chipman, S. F., & Shalin, V. L. 2000. Cognitive task analysis. Mahwah, N.J.: L. Erlbaum Associates. Schragenheim, E., Cox, J., & Ronen, B. 1994. Process flow industry-scheduling and control using theory of constraints. International Journal of Production Research, 32(8): 1867-1877. 

Appendix A

Notations of Procedure Chunking Heuristics

Set Variables

Sn                    the n-th line of procedure statement
ki                    the i-th keyword of statement Sn; ki belongs to Sn
vi                    the i-th verb in list V
ni                    the i-th noun in list N
                      lightweight ontology of domain concepts
TitleIndicator        list of labels denoting a title
V                     list of verbs or derived phrases denoting predicates
MV                    list of adverbs or prepositional phrases denoting predicate modifiers
N                     list of nouns or derived phrases and abbreviations
MN                    list of adjectives or prepositional phrases denoting subject modifiers or object modifiers
P                     list of the preposition “to” followed by clauses denoting goals of actions
Cond                  list of conjunctions followed by nouns or derived phrases denoting conditions of actions
WaitingList           list that contains any phrase indicating a break in time
LocationList()        lists of units grouped based on their physical distances
OperatorList          list of nouns or derived phrases denoting operators

Indicator Variables

Sn.title              flag to record whether Sn is a part of the title
Sn.meta-info          flag to record whether Sn is a statement that contains meta-information
Sn.maintext           flag to record whether Sn is a statement contained in the main body of the procedure
Sn.Predicate          the predicate of Sn, which denotes the action
Sn.PredicateModifier  the modifier of the predicate of Sn
Sn.Subject            the subject of Sn, which denotes the actor
Sn.SubjectModifier    the modifier of the subject of Sn
Sn.Object             the object of Sn, which denotes the location of the action in some cases
Sn.ObjectModifier     the modifier of the object of Sn, which denotes the location of the action in some cases
Sn.Purpose            the purpose of the action of Sn
Sn.Condition          the requirement that must be satisfied to execute the action of Sn
Sn.TriggerofTiming    flag to record that Sn has a waiting condition
Sn.TriggerofLocation  flag to record that Sn involves a location change
Sn.TriggerofEvent     flag to record that Sn is the event trigger of the procedure
Sn.TriggerofActor     flag to record that Sn involves an actor change

Functions

MetaInfoLocate()      a function to identify the location of meta-information, given the location of the title
MainTextLocate()      a function to identify the location of the main text, given the location of the title
Modify()              a function to identify the object that is modified by the given words or phrases
POST()                a function that calls the part-of-speech tagging process
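The notation above maps naturally onto a small record type. The following is a minimal Python sketch, not the SPA's actual implementation: the class name `Statement`, the snake_case field names, and the substring-matching rule in `flag_timing` are assumptions made for illustration, and the phrases in `WAITING_LIST` are hypothetical examples.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Statement:
    """One procedure line Sn with the indicator variables from the notation."""
    text: str
    # Section flags (Sn.title, Sn.meta-info, Sn.maintext)
    title: bool = False
    meta_info: bool = False
    maintext: bool = False
    # Action components (Sn.Predicate ... Sn.Condition)
    predicate: Optional[str] = None
    predicate_modifier: Optional[str] = None
    subject: Optional[str] = None
    subject_modifier: Optional[str] = None
    obj: Optional[str] = None
    obj_modifier: Optional[str] = None
    purpose: Optional[str] = None
    condition: Optional[str] = None
    # Chunking triggers (Sn.TriggerofTiming ... Sn.TriggerofActor)
    trigger_of_timing: bool = False
    trigger_of_location: bool = False
    trigger_of_event: bool = False
    trigger_of_actor: bool = False

    def fired_triggers(self) -> List[str]:
        """Return the names of the trigger flags that are set."""
        names = ["timing", "location", "event", "actor"]
        flags = [
            self.trigger_of_timing,
            self.trigger_of_location,
            self.trigger_of_event,
            self.trigger_of_actor,
        ]
        return [name for name, flag in zip(names, flags) if flag]


def flag_timing(stmt: Statement, waiting_list: List[str]) -> None:
    """Set the timing trigger if any WaitingList phrase occurs in the text.

    Assumption: a case-insensitive substring match stands in for whatever
    matching rule the tool really applies.
    """
    lowered = stmt.text.lower()
    stmt.trigger_of_timing = any(p.lower() in lowered for p in waiting_list)


# Hypothetical WaitingList entries and a hypothetical procedure line.
WAITING_LIST = ["wait", "until"]
s = Statement(text="Wait until air pressure is established", maintext=True)
flag_timing(s, WAITING_LIST)
print(s.fired_triggers())
```

Keeping the four trigger flags as separate booleans mirrors the notation directly; a set of trigger names would work equally well.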

Appendix B

SPA User Interface

At the start of procedure processing, the SPA lets users record meta-information about the selected procedures in the left bar, such as the refinery, operation unit, and equipment location.

Users are then required to select the format of the procedure before knowledge separation begins.

Based on the user-selected format, the SPA separates declarative knowledge (procedure title and meta-information) from action knowledge (action instructions). By default, all action instructions are processed in the next step. However, the SPA allows users to add or remove any lines from the procedure before further processing.

The SPA provides several modes for the subsequent tagging process. The “default” mode tags action components based on a learning mechanism that draws on all previously processed procedures in the database.

In addition, the SPA has tagging modes that consider only processed procedures from the same company, about the same operation unit, or from the same equipment location.

The SPA tags every action component of each instruction and shows the output to users.

The SPA then reports to users the following information identified from the procedure: the operator roles involved, the equipment involved, the actions that operators will perform, and the conditions for those actions.

Then, the SPA requires users to select heuristics, including time-based chunking (interruption), location-based chunking (equipment), and actor-based chunking (operator), which will be applied in the next phase, action knowledge chunking. The event-based trigger is the default heuristic and is not shown in the list.

Once the chunking heuristics are selected, they are triggered by certain action components tagged in Phase Two. Action knowledge chunks are generated and shown by highlighting chunk boundaries.
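One way to read the boundary generation described above is sketched below in Python. This is an illustration under stated assumptions rather than the SPA's implementation: statements are plain dictionaries with a hypothetical `triggers` set, a boundary is placed immediately before any statement that fires a selected heuristic, and the event-based trigger is always active because it is the default.

```python
from typing import Dict, List, Set


def chunk_statements(statements: List[Dict], selected: Set[str]) -> List[List[Dict]]:
    """Split tagged statements into chunks at statements that fire a heuristic.

    `selected` holds the user-chosen heuristics ("timing", "location",
    "actor"). The "event" trigger is added unconditionally (assumed default).
    """
    active = set(selected) | {"event"}
    chunks: List[List[Dict]] = []
    current: List[Dict] = []
    for stmt in statements:
        fired = active & set(stmt.get("triggers", ()))
        # Close the running chunk before a statement that fires a trigger.
        if fired and current:
            chunks.append(current)
            current = []
        current.append(stmt)
    if current:
        chunks.append(current)
    return chunks


# Hypothetical tagged procedure lines, invented for illustration.
procedure = [
    {"text": "Power returns to the unit", "triggers": {"event"}},
    {"text": "Console operator checks the surge tank level", "triggers": set()},
    {"text": "Field operator starts the air compressor", "triggers": {"actor", "location"}},
    {"text": "Wait until air pressure is established", "triggers": {"timing"}},
    {"text": "Field operator starts the shipping pumps", "triggers": set()},
]

for chunk in chunk_statements(procedure, {"actor"}):
    print([stmt["text"] for stmt in chunk])
```

Selecting more heuristics can only add boundaries under this reading, so users who find the chunks too coarse would enable additional heuristics and then adjust boundaries by hand.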

Users can delete, add, or move these boundaries to revise action knowledge chunks. They can also name each chunk with the tool.

In addition to action knowledge extraction and chunking, we also implemented a function to find commonality between procedures. Commonality is defined as the sharing of the same behaviors and targets of behaviors among multiple procedures. Once users click the “Find Commonalities” button in the top navigation bar, the SPA assesses commonality among all action knowledge chunks in the database and then displays all chunks with common operation action(s) in a list. Users can further restrict the search over these chunks, for example by setting the chunk size or by limiting the search to certain procedures, actors, or actions.

In addition, the SPA can generate a graph showing the relationships among all knowledge chunks with commonalities. In the following figure, each node represents an action knowledge chunk, and each link indicates how similar two chunks are, namely, the number of common actions between them. Users can select a node (highlighted in yellow) to show the content and meta-information of that action knowledge chunk.
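The link weights in such a graph can be illustrated with a short Python sketch. It assumes, hypothetically, that each chunk is reduced to a set of (behavior, target) pairs and that the weight of a link is the number of pairs two chunks share; the chunk names and actions below are invented for illustration and do not come from the study data.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

Action = Tuple[str, str]  # (behavior, target of behavior)


def commonality_links(chunks: Dict[str, Set[Action]]) -> Dict[Tuple[str, str], int]:
    """Return a weight for each pair of chunks that shares at least one action.

    The weight is the count of (behavior, target) pairs common to both chunks.
    Pairs with nothing in common get no link.
    """
    links: Dict[Tuple[str, str], int] = {}
    for a, b in combinations(sorted(chunks), 2):
        shared = chunks[a] & chunks[b]
        if shared:
            links[(a, b)] = len(shared)
    return links


# Hypothetical chunks, each reduced to its set of actions.
chunks = {
    "startup-A": {("start", "air compressor"), ("check", "surge tank level")},
    "startup-B": {("start", "air compressor"), ("open", "bypass valve")},
    "shutdown-C": {("close", "bypass valve")},
}

links = commonality_links(chunks)
print(links)
```

Matching on the full (behavior, target) pair follows the definition of commonality given above; matching on behavior alone would link many more chunks and is a different design choice.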

Appendix C

Recruitment Script

We are researchers in the College of Information Sciences and Technology at the Pennsylvania State University. We are conducting a research study to investigate operators’ expertise and how operators exercise it along with use or non-use of procedures.

We are recruiting individuals to participate in an interview about failure recovery procedures. The objective of this interview is to understand how you use procedures as you respond to a failure at the plant. The interview will explore your usage of a single procedure in a scenario. The interview will last approximately 45 to 60 minutes. Your answers will be recorded by the interviewer by typing them into an online tool.

You may be eligible for this study, at no cost to you, if you have more than 2 years of working experience as a refinery operator. Your participation in this study is voluntary. If you would like to participate in this research study, or have any questions or concerns about it, you can contact Dr. Sandeep Purao by phone at (814) 863-0017 or by email at [email protected].

Appendix D

Interview Protocol

Understanding Expertise in and Use of Codified Knowledge by Operators in Petrochemical Refineries

1. Assigned ID of the interviewee:

2. The objective of this interview is to understand how you use procedures and how you respond to a failure at the plant by using or not using procedures. The interview will take around 60 minutes.

3. Can you talk for a few minutes about your daily work schedules?

4. What are the different tasks and activities you do?

5. We will consider the following scenario: (A printout of this description will be made available to you by the interviewer so that you can refer to this as needed) A thunderstorm has come through the area and resulted in a momentary loss of power to the entire complex. All motor driven equipment (pumps, fans, blowers, compressors) have stopped. The power has now returned.

6. What would be your position in this scenario?

7. What are the names or titles of recovery procedures that you might follow?

8. Where did you learn about those procedures? Where do you find those procedures? Is there a fixed way in which you are expected to carry out those procedures?

9. Let's focus on ONE of the procedures and your work surrounding the instructions contained in that procedure.

10. Please select one of the above procedures that we will focus on for the purpose of our exploration. Which Procedure do you select?

11. First, let's think about how you use that procedure. It is our understanding that through your work, you consult the Procedure as needed instead of reading the entire procedure prior to execution. Is this true?

12. How do you read the instructions in the procedure?

13. Can you elaborate a little bit more on how you do this? What role do these display monitors play?

14. How do you decide the way to read the procedure instructions you selected?

15. Are there any parts of the Procedure that you think are important to read several times? (including meta-info, title, process instructions, caution, warning, notes, etc.) Please tell us what these instructions are. It is OK to refer to step numbers in the Procedure in response to this question.

16. Why do you read these parts of the procedure several times?

17. Are there any parts of the Procedure that you do NOT read? (including meta-info, title, process instructions, caution, warning, notes, etc.) Please tell us what these instructions are. It is OK to refer to step numbers in the Procedure in response to this question.

18. Why do you NOT read these parts of the procedure?

19. Is there anything else about using Procedures that you would like to share with us? (This can be a description of how you use or do not use a procedure for certain things, how you need more or less information in the procedure, or how you may selectively use different parts of the procedure or anything else.)

20. Recall the last time you modified a procedure while executing it. Could you walk me through that incident?

21. We would like to understand how you use the procedure in the context of your work. Please walk us through everything you would do to recover from the failure described in this scenario. Please include how you refer to the written Procedure, and do tell us how you exercise judgment, interpret the instructions, communicate with others, and anything else that makes you an expert. We believe that your experience may make your actions professional and go beyond simply following the instructions in the written procedure. It is OK if your recovery process differs from the procedures. Please feel free to describe this at a normal, conversational pace.

22. The interviewer will tape-record your narrative and create a map of the different actions you take. During this interview, the interviewer will create an action map representing linkages across different actions. Please let the interviewer know when to start listening to you and constructing this picture.

23. It is likely that the entire procedure is too long for this exercise. Let's consider a slice of the scenario we are dealing with that takes around 5 to 10 minutes to describe. Please tell the interviewer how you define this slice, i.e., where it starts and where it ends.

Starts with:

Ends with:

24. At this time, the Interviewer will show you the picture she/he has constructed. Feel free to suggest changes, make corrections, additions and refinements as you see fit. When you are satisfied, please confirm that this map shows all information you consider to be important for the scenario slice.

25. Please help the interviewer identify important actions on the picture that are NOT included in the written procedures. These reflect your expertise that we know continues to be difficult for the rest of us to understand.

26. Will you use multiple procedures to recover from failures in the same scenario discussed earlier? If yes, what are they?

27. Under what conditions will you use multiple procedures instead of a single procedure?

28. How do you manage to keep track of these multiple procedures simultaneously? (For example, how do you track where you are in different procedures? Do you write anything? Do you mark each step after completion to record its execution? How do you do this?)

29. Is there anything else you would like to share with us that might help us to better understand your expertise and use/non-use of one/more procedures for failure recovery?

30. Now that we have talked about how you use procedures in the context of this specific scenario, can you talk a little bit about how you share your knowledge with junior operators?

31. Thank you for your participation in this survey and exercise. If you permit, we will take a photograph including your working station and working background.

We appreciate your time and help. If you have any questions about this project, feel free to contact Dr. Sandeep Purao, Penn State University at [email protected] or Jingwen He, Penn State University at [email protected].

Appendix E

Example of Interview Transcript Fragment

Interviewee: No. 17
Time of interview fragment: 9’19” – 19’16”
Length of interview fragment: 9’57”
R: Researcher
O: Operator

Speaker Transcription

R Let's go into the scenario. Here is the scenario that there is a sun storm. A momentary loss of power in the entire plant, and all motor-driven equipment has stopped. Now the power comes back. Under this scenario, what procedures would you choose to make all equipment work as normal?

O Well, when you lose power, you don't know how long it's going to be down. We're assuming it's going to be a long time hours. So first thing we do we have a procedure in place what we have to do in the plant. We have to secure certain items in the plant, I mean right now. And we will make a phone call to our OS [operator supervisor] or the shift supervisor, and then determining how long the power is off and on. I am saying power can be off now, but it will be back within 30 minutes, we still make that call. Okay. Now, here comes power, it comes back. So that is when you're real busy. There are certain SOPs you got to follow, which certain sequences that you have to start with, start certain equipment and we have SOP for that that we follow.

R Can you mention the name of that?

O I can, I sure can. We get power back, okay. Now, your plant computers got to boot up. The operator as soon we get something showing on the screen, [such as] we got to start this air compressor. The air compressor is the main power [of air]. We have a lot of pneumatic control that we operate with [by power of air]. You can press all [the pneumatic control equipment] on the computer. But if there is no air you can't control anything. So that's [air compressor] we have to establish first. Then you go by area, [check] what's going on at the level on the surge tank. Then we will watch. And the alarms that come up, you find out if it's true or not. In this oil side, I'm just talking oil side [of all above actions]. And then we make sure that the temperature is not cold or not hot. But a lot of things have been going on. And we start the equipment, legs of shipping pumps; make sure that they are going. Anything that is on the bottom of pumps, that's part of equipment. Now that's how we're going. Okay, now, on the water side that is more critical. Since we shift our excess water to farmers. It is a [anonymous] Water District. We go into a waterway and that really controls. So EPA can get involved to send the water. And we get fine, cleaned up waters, after we've taken care of [the control]. So the water side, water we're making, we will make sure that water can bypass. They might give you the right information you want, or this is what we have to do. At odd times we will not go to the farmers with this water, to make sure their filters are properly working right. So we're all meantime bypassing the filters. We got a big, big tank down, and you can see the tank there (the interviewee points the location on the map). At that point, we are really watching. The oil side is pretty much easy to control, but the water side is not. It's really critical. So we have procedures for that. By that time if the power is back, hopefully by that time we will have management here. If ICS is in command, systems are in place, we will make a determination when to go back, and that when we are starting pumps. We will make soft water for the steam plants. Of course they have to have water that's soft and no calcium.

R For all equipment you mentioned about, like the compressors, surge tank, is there any individual procedure for each equipment? Or there is just the overall procedure?

O It's that now I forgot to mention a very critical procedure on that. Now like that we have to start the compressor. But we have this, what is called VRU vapor recovery, and they help control elimination of gases in the atmosphere. So that is the second thing, start the compressors for the air, since we operate the plant. And then [start] the VRU compressors that take the gases off the plant.

R When we talk of procedures, like standard operating procedures, so what exactly do you mean? Like is this objective or this is like a rule?

O Well, I myself, I know the plant so I don't really have to be further to check. I might miss something. But the new operators I can make them check this and check that. So I might miss something too. [to check] Whether we're missing anything, there is a check list. If the computers are down obviously you can't see it on the screen so we have to do a binder the manual. And you have to go to each one [of the checklist]. There is the step by step [of the checklist]. You have to start certain equipment first before you get your other certain equipment. Like I said, the compressors, and the first thing you have to start [is the compressors], and then later [start] the leak recovery, and the wash tanks are pretty much all gravity. There is no pump, you haven’t to solve it, [and] there is a passive system. But the water side is all gravity. Then [if] to get it into the water, you have to get pumps shift it out or up the hill for soft water.

R On the water side, there is a lot I can think of right now you have to follow through. So like, so [for] the rest of questions, can you just speak one of the procedures you will follow, and we will ask the rest of questions. Could you pick one and tell us the name of the procedure?

O Yes. It's called it's an easy one, the WEMCO. The WEMCOs are the first set of equipment that starts cleaning the water. The produced water comes in the field. We separate the water this way [by WEMCO]. The equipment that handles the first stage of cleaning mechanically is WEMCO. It is a bit long vessel, And the official name is called [anonymous]. And what they do with the little bit of Pneumatic chemical [is that] it locates the oil that might be in the water, and oil is skimmed off. So when they get to the last stage of that WEMCO, it is cleaner. There's the procedure on that. When we have the power failure, the level, there has to be a certain level, right? In order for it to work properly, the operator, if he's walking in and in order to walk in through each one, on the radio to tell the operators in here (control room), [field operator ask console to] raise the level or lower the level. If it's real dirty and we can't catch it right now, or there may be something that may affect that whole process, we can divert those WEMCOs to another the big tank down there.

R Were you like talking about procedure now, do you have like code or something to refer that procedure?

O No quotes. Just on code. Just like "make sure that the WEMCO is clean, are they good?" And we have departments that let us know, they measure the oil and the water.

Appendix F

Core Categories and Sub-Categories of Strategy to Apply Action Knowledge

No.    Frequency  Category / Sub-category

1.     156        Procedure driven strategy
1.01   3          comparing SOP and non-SOP doc
1.02   1          comparing task and SOP
1.03   8          recalling equipment feature/ location/ operation from SOP
1.04   7          comparing current and old SOP
1.05   1          recalling non-SOP
1.06   1          avoiding to deviate from SOP
1.07   3          assessing actions in SOP
1.08   1          being aware of updated SOP
1.09   1          checking notes of SOP to get recent change/updates
1.10   2          checking non-SOP
1.11   25         following SOP
1.12   2          following non-SOP
1.13   2          recalling recent changes/updates of SOP
1.14   3          checking SOP before operation
1.15   12         checking SOP during operation
1.16   3          reviewing SOP
1.17   15         checking SOP after operation
1.18   4          classifying SOP
1.19   13         checking SOP before operation
1.20   4          chunking SOP into sections
1.21   5          checking SOP before operation
1.22   3          assessing SOP complexity
1.23   11         locating SOP
1.24   18         locating SOP
1.25   2          recalling content of non-SOP
1.26   4          assessing SOP importance/ usefulness/ applying frequency

2.     221        Domain knowledge driven strategy
2.01   16         recalling logic and dependency of action
2.02   16         deciding action based on prior action
2.03   90         simulating equipment operation/ feature/ structure
2.04   10         recalling mental model of equipment operation/ structure/ feature/ function/ location
2.05   4          inferring logic of equipment operation by metaphor to general phenomenon
2.06   2          deciding action based on dependency relationship in operation
2.07   8          comparing equipment in different locations
2.08   7          verifying equipment status by sampling/ testing
2.09   19         verifying equipment status after action
2.10   1          inferring equipment status by logs
2.11   5          checking/ monitoring equipment status if predict abnormal
2.12   3          checking/ monitoring by sensing
2.13   6          seeking info from equipment to infer equipment status
2.14   34         checking/ monitoring/ maintaining/ making sure equipment on target status, if not, taking correct action

3.     199        Environment resource driven strategy
3.01   1          assessing equipment safety level
3.02   22         assessing length of time for action applying
3.03   4          assessing flexibility of action applying-time
3.04   1          assessing length of time for equipment changing
3.05   3          inferring incident based on sensing
3.06   6          inferring equipment status
3.07   2          inferring equipment status and deciding action for incident
3.08   91         deciding action based on equipment operation/ status/ feature
3.09   12         deciding action based on availability of resource
3.10   5          deciding action based on environment (e.g. weather)
3.11   1          comparing equipment status
3.12   4          comparing equipment operation/ status/ feature/ location
3.13   6          comparing current and old equipment
3.14   4          identifying (equipment) location
3.15   22         deciding timing/ location to apply action
3.16   4          inferring equipment status based on info on screen
3.17   11         inferring equipment status/ structure/ operation based on sensing info

4.     87         Goal driven strategy
4.01   1          assessing effectiveness of action to achieve goal
4.02   7          recalling target equipment status
4.03   1          inferring expectation
4.04   3          assessing flexibility to deviate from SOP to achieve goal
4.05   9          seeking second way to achieve goal other than SOP
4.06   2          assessing un-expectancy situation deviate from goal
4.07   5          deciding action based on task severity/ difficulty
4.08   3          seeking second way to achieve goal other than SOP
4.09   9          deciding action to response to incident/ alarm/ abnormal
4.10   1          assessing SOP requirements
4.11   9          deciding target SOP
4.12   5          recalling target equipment status
4.13   18         deciding action to achieve target equipment operation/ status/ feature/ location
4.14   3          deciding action for safety purpose
4.15   11         deciding action to avoid failure/ abnormal/ certain situation

5.     208        Experience driven strategy
5.01   2          deciding action against to assumption on general situation
5.02   4          simulating multiple approaches to apply action
5.03   1          determining mistakes on SOP based on prior experience
5.04   5          simulating extreme condition/ worst outcome of incident/ abnormal
5.05   6          inferring cause of alarm
5.06   3          deciding action based on assumption on general situation
5.07   7          recalling outcome of action
5.08   25         recalling outcome of prior incident/ un-expectancy/ abnormal
5.09   3          recalling cause of prior incident/ un-expectancy/ abnormal
5.10   7          deciding action based on experience to deviate from SOP
5.11   36         recalling experience of similar task performing/ incident recovering
5.12   18         deciding action based on experience of applying similar task
5.13   3          deciding to avoid certain action
5.14   4          inferring cause of incident
5.15   5          recalling incident frequency/ location
5.16   7          recalling incident severity
5.17   1          recalling incident facticity
5.18   1          recalling incident's possible cause
5.19   22         recalling task/ action complexity/ difficulty
5.20   1          being aware of possible abnormal/ incident
5.21   9          recalling task/ action importance/ frequency
5.22   4          recalling length of time for incident lasting
5.23   17         recalling task workload
5.24   2          recalling alarm facticity
5.25   10         recalling action importance
5.26   4          recalling action frequency
5.27   1          recalling alarm severity

6.     268        Operator status driven strategy
6.01   5          assessing capability of other operator
6.02   3          seeking info from outside people
6.03   7          communicating with other operator to check other's action
6.04   4          coordinating and providing help to other operator
6.05   19         communicating/ making command of action to other operator
6.06   1          communicating with other operator to seek available resource
6.07   8          coordinating with outside people
6.08   4          predicting to get help from other operator
6.09   27         coordinating and updating progress with other operator
6.10   14         coordinating and shadow/ being shadowed with other operator
6.11   1          seeking help if predict/ detect abnormal
6.12   9          seeking info from other operator
6.13   1          accepting other operator's deviation from SOP
6.14   35         predicting location/ action/ assumption/ inference of other operator
6.15   6          informing incident to other operator
6.16   1          comparing info from other operator and SOP
6.17   19         simulating other operator's action
6.18   7          deciding action based on other operator's command/ requirement
6.19   7          waiting and passing task to other operator/ outside people
6.20   1          discussing and making commands to outside member
6.21   1          discussing SOP with other operator
6.22   3          waiting for other operator's command
6.23   1          getting remind from other operator
6.24   34         determining responsibility vested in
6.25   4          informing incident and equipment status to outside people
6.26   10         informing equipment status to other operator
6.27   7          informing action to other operator
6.28   2          communicating and sharing knowledge with other operator
6.29   26         considering self-capability to operate
6.30   1          assessing familiarity to equipment

VITA

Jingwen He

Education

Ph.D. 2015 Information Sciences and Technology, Pennsylvania State University
M.Phil. 2008 Information Systems, City University of Hong Kong
B.Eng. 2006 Management Information Systems, Renmin University of China

Professional/Teaching Experience

Research Assistant 2008-2014 Pennsylvania State University, State College, PA
• Project Lead for Petrochemical Industry Knowledge Management Project (2009 – 2014)
• Field Study on IT Development Projects (2008 – 2009)

Teaching Assistant 2008-2009 Pennsylvania State University, State College, PA
• Provided a stimulating learning environment, employing a broad range of instructional techniques to retain student interest; taught the “Networking Telecommunication” course.

Research Assistant 2006-2008 School of Business, City University of Hong Kong, Hong Kong
• Designed and developed a prototype of an intelligent multi-agent system to enhance the learning process: developed laboratory experiments and a questionnaire; conducted tests; collected and analyzed data; presented research findings at an international conference.

Publications

• He J., Purao S., Backer J., and Strobhar D. (2011) Service Extraction from Operator Procedures in Process Industries. 6th International Conference on Design Science Research in Information Systems and Technology (DESRIST’11), Milwaukee, WI. Nominated for the best student paper award.
• Karunakaran A., He J., Purao S., and Cameron B. (2009) Book Chapter: Growth Trajectories of SMEs and the Sensemaking of IT Risks – A Comparative Case Study. Technology Innovation: Entrepreneurial Successes and Pitfalls. Hershey, PA: Business Science Reference.
• Srivatsan V., Purao S., Jansen J., and He J. (2009) System Developers Define Their Own Information Needs. 15th Americas Conference on Information Systems (AMCIS’09), San Francisco, CA.
• He J., Lai H., and Wang H. (2008) A Cyc-Based Multi-agent System. 41st Annual Hawaii International Conference on System Sciences (HICSS’08), Big Island, HI.
• He J., Lai H., and Wang H. (2009) A Commonsense Knowledge Based Supported Multi-agent Architecture. Expert Systems with Applications, 36(3), 5051 – 5057.
• Lai H., Wang M., He J., and Wang H. (2008) An Agent-based Approach to Process Management in E-Learning Environments. International Journal of Intelligent Information Technologies, 4(4), 18 – 30.