
Order Number 0211129

A task analysis of chemical and biotechnological process synthesis

Gandikota, Murthy Srinivas, Ph.D.

The Ohio State University, 1991


A TASK ANALYSIS OF CHEMICAL AND BIOTECHNOLOGICAL PROCESS SYNTHESIS

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Murthy Srinivas Gandikota, B.Tech, M.S.

*****

The Ohio State University

1991

Dissertation Committee:

Dr. J. F. Davis
Dr. T. Bylander
Dr. S. T. Yang

Approved By

Advisor
Department of Chemical Engineering

Copyright By
Murthy Gandikota
1991

To my parents

ACKNOWLEDGEMENTS

I thank all those who: financed my graduate school expenses even though my car was never insured during this time; in an argument disagreed with me and gave their justification; taught me what I thought I didn’t have to know; inspired me to seek knowledge by scaling dizzy heights and then fall freely to enjoy levitation; showed me affection and love mostly during day time affairs; rejected my advances because usually on those days I ended up working for my dissertation.

I thank: anonymous posters on Usenet who shared their wisdom, free of charge, in public domain, especially the folks in comp.lang.lisp, comp.ai, and soc.culture.indian; the computer operators who usually made them all work, and then restored my files or reconstructed the hard disks when they didn’t; the High Street people who kept their places open for my visits even during unearthly hours (business is business); OSU for providing excellent recreational facilities in Larkins and Jesse Owens Centers (for me, graduate study has been 50% inspiration in my office and 50% perspiration in these places).

If I didn’t thank someone explicitly then please let me know, I will do so, when I write a book for commercial sale.

VITA

August 15, 1964 ...... Born - Visakhapatnam, India

July 1986 ...... B. Tech (Hons.) Department of Chemical Engineering Indian Institute of Technology Kharagpur, India

December, 1988 ...... M.S., Chemical Engineering Department of Chemical Engineering The Ohio State University Columbus, OH

PUBLICATIONS AND PRESENTATIONS

Gandikota, M., “Temporal Constraints on Event-Oriented Diagnosis and Corrective Action Planning of Dynamic Process Systems,” Tech Report, AI Applications Group, Department of Chemical Engineering, The Ohio State University, Columbus, OH, 1991.

“A Knowledge-Based System for BioProcess Synthesis,” by M.S. Gandikota, J.F. Davis, S.T. Yang and J. Marchio, paper invited for publication in ChemTech, 1991.

“A Task-Oriented Framework for Biochemical Process Flowsheet Synthesis,” by M.S. Gandikota, S.T. Yang and J.F. Davis, in proceedings of AIChE Conference, Chicago, November, 1990 and American Chemical Society Conference, Washington, D.C., August, 1990.

“An Integrated Framework for Intelligent Computer Aided Design of Chemical Processes,” by N. Hari Narayanan, M.S. Gandikota and J. Maroldt, in proceedings of 2nd International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, July, 1990.

“An Expert System Framework for the Preliminary Design of Process Flowsheets,” by M.S. Gandikota and J.F. Davis, in proceedings of the International Conference on Knowledge-Based Computer Systems, pg. 88-104, Bombay, India, Dec. 11-13, 1989.

“Building Expert Systems for Selection in Engineering Design,” by M.S. Gandikota and J.F. Davis, presented at the AIChE Annual Meeting, San Francisco, CA, Nov. 5-10, 1989.

“A Knowledge-Based System Framework for Diagnosis in Process Plants,” by S.K. Shum, M.S. Gandikota, et al., in proceedings of the 7th Power Plant Dynamics, Control and Testing Symposium, pg. 22.01, Knoxville, TN, May 15-17, 1989.

“Rule-Based Expert Systems in Chemical Engineering,” by J.F. Davis and M.S. Gandikota, Computer Aids for Chemical Engineering (CACHE) Monograph Series, August, 1988.

“An Intelligent Database for Process Plant Expert Systems,” by R. Bhatnagar, M.S. Gandikota, et al., in proceedings of the ISA Conference, pg. 373-385, Houston, TX, Oct. 16-21, 1988.

FIELDS OF STUDY

Major Field: Chemical Engineering

Studies in the application of artificial intelligence towards process synthesis problem solving in Chemical and Biotechnology domain.

Advisor: Dr. James F. Davis

TABLE OF CONTENTS

DEDICATION

ACKNOWLEDGEMENTS

VITA

LIST OF FIGURES

LIST OF TABLES

CHAPTER

I. INTRODUCTION

1.1. MOTIVATION FOR KNOWLEDGE-BASED SYSTEMS IN PROCESS SYNTHESIS
1.2. LIMITED ROLE OF CURRENT COMPUTER-AIDED DESIGN TOOLS
1.3. AI IN CHEMICAL ENGINEERING
1.4. PRIMARY RESEARCH FOCUS: BIOPROCESS SYNTHESIS
1.5. SECONDARY RESEARCH FOCUS: DEVELOPMENT OF PROCESS SYNTHESIS METHODS
1.6. ORGANIZATION OF THE DISSERTATION

II. A TASK APPROACH TO A KNOWLEDGE BASED SYSTEM FOR SELECTION AND SYNTHESIS OF BIOPROCESSES

2.1. INTRODUCTION
2.2. THE SELECTION PROBLEM
2.3. TASK ANALYSIS OF ROUTINE SELECTION PROBLEM
2.3.1. Preliminary Selection Task
2.3.2. Critical Selection Task
2.4. CRITICAL SELECTION INFERENCE STRATEGY
2.4.1. Activation Phase
2.4.2. Consistency Checking Phase
2.4.2.1. Primary Consistency Check Algorithm
2.4.2.2. Secondary Consistency Check Algorithm
2.5. AN EXAMPLE OF CRITICAL SELECTION
2.6. FAILURE HANDLING
2.6.1. Preliminary Selection Failure Handling Algorithm
2.6.2. Critical Selection Failure Handling Algorithm
2.7. SELECTRIX: A KNOWLEDGE BASED SYSTEM SHELL FOR SELECTION
2.8. SUMMARY

III. APPLICATION OF SELECTRIX TO BIOPROCESS SYNTHESIS

3.1. KNOWLEDGE NECESSARY FOR BIOPROCESS SYNTHESIS
3.2. KNOWLEDGE ABOUT SEQUENCING OF PROCESS STEPS
3.3. KNOWLEDGE OF UNIT OPERATIONS
3.4. KNOWLEDGE OF INTERACTIONS AMONG UNIT OPERATIONS
3.5. REFINEMENT OF A FLOWSHEET
3.6. KNOWLEDGE-BASED SYSTEM PROTOTYPING AND TESTING
3.7. CASE STUDY
3.8. COMPLEXITY OF BIOPROCESS SYNTHESIS PROBLEM SOLVING
3.9. IMPROVEMENTS TO THE KNOWLEDGE-BASED SYSTEM

IV. LEARNING PROCESS SYNTHESIS CONSTRAINTS AND GENERALIZING THEM INTO HEURISTICS

4.1. REPRESENTATION OF PROCESS SYNTHESIS CONSTRAINTS
4.2. THE CONSTRAINT LEARNING ALGORITHM
4.2.1. Generating Selection Constraints
4.2.2. Generating Interaction Constraints
4.2.3. Generating Synthesis Constraints
4.2.4. Generating Refinement Constraints
4.3. LEARNING HEURISTICS FROM CONSTRAINTS
4.3.1. Generalizing Selection Constraints for a Product
4.3.2. Generalizing Synthesis and Refinement Constraints
4.3.3. Generalizing Interaction Constraints
4.4. VALIDATION OF HEURISTICS
4.5. APPLICATIONS OF THE INDUCTIVE LEARNING ALGORITHM
4.5.1. Can unit operation be used for process step to make product ?
4.5.2. What is the most frequently used unit operation for process to make product ?
4.5.3. What is the arithmetic relationship between variable and variable ?
4.5.4. Is there a precondition for processing feed from process to process <p2> ?
4.6. SUMMARY

V. LEARNING CLASSIFICATION HIERARCHIES

5.1. INTRODUCTION
5.2. AXIOM #1: EVIDENCE-CLASS TABLE
5.3. AXIOM #1.1
5.4. AXIOM #1.2
5.5. AXIOM #2: EVIDENCE SET OF A CLASS
5.6. AXIOM #3: POSITIVE AND NEGATIVE EVIDENCE-SETS
5.7. AXIOM #4
5.8. CONJECTURE #1: SUBCLASS RELATIONSHIP
5.9. CONJECTURE #2
5.10. CONJECTURE #3
5.11. AXIOM #5: ESTABLISH-REFINE TABLE
5.12. AXIOM #6
5.13. CONJECTURE #4
5.14. CONJECTURE #5
5.15. CONJECTURE #6
5.16. CONJECTURE #7
5.17. CONJECTURE #8
5.18. AXIOM #7
5.19. AXIOM #8: DEGENERATED PREDICATES
5.20. AXIOM #9: DOMAIN THEORY PREDICATES
5.21. AXIOM #10
5.22. AXIOM #11
5.23. CONJECTURE #9: EVIDENCE PARTITION FOR INTERMEDIATE CLASS
5.24. THE HCI ALGORITHM FOR THE CONSTRUCTION OF CLASSIFICATION HIERARCHIES
5.24.1. Algorithm: HCl
5.25. HCI ILLUSTRATION FOR A REAL-WORLD PROBLEM
5.25.1. Case Study
5.26. CONCLUSIONS

VI. CASE-BASED REASONING APPROACH TO SYNTHESIS OF PROCESS FLOWSHEETS

6.1. INTRODUCTION
6.2. CASE-BASED REASONING VS. TASK-BASED APPROACH
6.3. CONTROL STRUCTURE OF CASE-BASED REASONER FOR PROCESS SYNTHESIS (CBR-PROCSYN)
6.3.1. Constraints on Selection of Processes
6.3.2. Constraints on Synthesis of Processes
6.3.3. Constraints on Input/Output Substances and Compositions
6.3.4. Constraints on Process Conditions
6.3.5. Algorithm: CBR-ProcSyn
6.4. RETRIEVAL AND SELECTION OF PROCESS CASES
6.5. MODIFICATION OF PROCESS CASES
6.5.1. Addition of Process Features
6.5.2. Deletion of Process Features
6.5.3. Modification of Process Features
6.6. SYNTHESIS OF CASES
6.7. EXTENSIONS TO CASE-BASED REASONER FOR PROCESS SYNTHESIS
6.8. CASE-BASED REASONER FOR AMMONIA PROCESS SYNTHESIS
6.8.1. Ammonia Process Synthesis Case Study #1
6.8.2. Ammonia Process Synthesis Case Study #2

VII. AN INTEGRATED FRAMEWORK FOR INTELLIGENT COMPUTER AIDED SYNTHESIS OF CHEMICAL AND BIOCHEMICAL PROCESSES

7.1. INTRODUCTION
7.2. THE INTEGRATED PROCESS SYNTHESIS FRAMEWORK
7.3. TASK DECOMPOSITION OF PROCESS SYNTHESIS
7.4. ANALYSIS OF PROCESS SYNTHESIS TASKS
7.4.1. Skeletal Design
7.4.2. Design Refinement
7.4.2.1. Simulated Discovery
7.4.2.2. Functional and Structural Refinement (FSR)
7.4.2.3. Flowsheet Evaluation (FSE)
7.5. METHODS FOR PROCESS IDENTIFICATION AND SYNTHESIS TASKS
7.5.1. Constraint-Based Reasoning
7.6. METHODS FOR SIMULATED DISCOVERY TASK
7.6.1. Critiquing
7.6.2. Qualitative Simulation
7.6.3. Quantitative Simulation
7.7. METHODS FOR FSR TASK
7.7.1. Heuristic Reasoning
7.7.2. Case-Based Reasoning
7.8. CONTROL STRUCTURE FOR PPD
7.8.1. Respecification
7.8.2. The Simulated-Discovery-FSR Loop

VIII. LITERATURE REVIEW

8.1. TAXONOMY OF CHEMICAL PROCESS SYNTHESIS METHODOLOGIES
8.2. CASE-BASED REASONING (CBR)
8.2.1. Scheduling
8.2.1.1. SUPERMOM
8.2.1.2. TRUCKER
8.2.2. Diagnosis
8.2.2.1. CASEY
8.2.2.2. PROTOS
8.2.3. Design
8.2.3.1. CYCLOPS
8.2.3.2. KRITIK
8.2.4. CLAVIER
8.2.5. Planning
8.2.5.1. Battle Planner
8.2.5.2. CHEF
8.3. LEARNING
8.3.1. LEAP
8.3.2. ID3
8.3.3. PRIDE
8.3.4. Meta-AIMS
8.3.5. STRUCT
8.3.6. KEDS
8.3.7. BRIDGER & BOSS
8.4. GENETIC ALGORITHMS

IX. CONCLUSIONS AND RECOMMENDATIONS

9.1. CONCLUSIONS
9.2. RECOMMENDATIONS

BIBLIOGRAPHY

APPENDICES

A. KNOWLEDGE BASE FOR BIOPROCESS SYNTHESIS

B. TRACE OF HCl

C. GENETIC ALGORITHM FOR MULTICOMPONENT SEPARATION

D. CBR KNOWLEDGE BASE

LIST OF FIGURES

Figure 1. 3-object consistency check rules

Figure 2. 2-object consistency check rules

Figure 3. OCN Example

Figure 4. Sequencing of process steps for a recombinant protein process

Figure 5. Bioprocess for an intracellular protein

Figure 6. Hierarchical classification of bio-process steps and their unit operations

Figure 7. Final refinement of flowsheet #1 in Figure 11 using refinement constraints in Figure 12

Figure 8. Design problem statement

Figure 9. Results of preliminary selection for rDNA protein synthesis

Figure 10. Object-constraint networks (OCN) for selection of alternative unit operations

Figure 11. Alternative flowsheets for case study

Figure 12. Refinement constraints for rDNA flowsheet #1 in Figure 11

Figure 13. HC-Transformations #1, 2, 3

Figure 14. HC-Transformations #4, 5

Figure 15. HCl Illustration #1: Input

Figure 17. Classification hierarchy for FCC (KE’s version)

Figure 18. Input to HCl for the FCC problem

Figure 19. Hierarchy generated by HCl for Case #1

Figure 20. Classification of cases

Figure 21. Scoring procedure for selecting cases

Figure 22. Ammonia process with hydrogen recovery

Figure 23. Ammonia process retrofitted with coal gasification and hydrogen recovery

Figure 24. Process synthesis task analysis

Figure 25. Methods for process synthesis tasks

Figure 26. Control structure of CBR-ProcSyn

Figure 27. Control structure of PPD framework

Figure 28. FSR Example

LIST OF TABLES

Table 1. Synthesis constraints on bioprocesses

Table 2. Some examples of refinement constraints used in bioprocess synthesis

Table 3. Evidence-Class Table

CHAPTER I

INTRODUCTION

Process synthesis, fundamentally, is a heuristic search problem where the search space is all possible process interconnections or flowsheets. The heuristic search should find a particular flowsheet called a ’feasible flowsheet’ for a given product. A feasible flowsheet contains a subset of available unit operations for product synthesis, and satisfies all the design problem constraints. The feasible flowsheet also encompasses a particular structural sequence of unit operations. By altering the sequence or specifying different unit operations in a feasible flowsheet, while still conforming with the design problem constraints, another feasible flowsheet can be generated.

Mathematically, when there are p processes and u unit operations available to carry out each process, $u^p$ is the number of possible flowsheets corresponding to a particular way the processes are structurally connected. If there are s structural interconnections, then $s\,u^p$ is the total number of possible flowsheets, and the number of feasible flowsheets satisfying the design problem constraints is less than or equal to $s\,u^p$. Because of the combinatorial complexity in synthesizing flowsheets, heuristic search is often employed to make the synthesis problem tractable.

1.1. MOTIVATION FOR KNOWLEDGE-BASED SYSTEMS IN PROCESS SYNTHESIS

This dissertation is about knowledge-based systems for process synthesis or synthesis during preliminary design. The goal of the knowledge-based system is to automatically create process flowsheets to transform a given raw material into a desired product. The task of such process synthesis can be regarded as a search problem that involves generating and testing alternatives. It has been hypothesized that only 1% of the flowsheets generated will ever get implemented (Douglas, 1985); consequently, for every process that is designed, optimized, and simulated, about 99 processes are rejected conceptually by qualitative testing.
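The generate-and-test view of process synthesis described above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the process steps, unit operations, structures, and the single qualitative constraint are all hypothetical names chosen for illustration. The sketch assumes that each of p process steps can be carried out by any of u unit operations and that there are s structural arrangements, so that at most s * u**p candidates exist.

```python
from itertools import product


def count_flowsheets(p: int, u: int, s: int) -> int:
    """Upper bound on candidates: s structures, u choices for each of p steps."""
    return s * u ** p


def generate_and_test(unit_ops, structures, constraints):
    """Enumerate every candidate flowsheet and keep those passing all constraints."""
    feasible = []
    for structure in structures:
        # One unit operation is chosen per process step (Cartesian product).
        for choice in product(*unit_ops.values()):
            flowsheet = {
                "structure": structure,
                "units": dict(zip(unit_ops, choice)),
            }
            if all(check(flowsheet) for check in constraints):
                feasible.append(flowsheet)
    return feasible


# Hypothetical example data (not taken from the dissertation).
UNIT_OPS = {
    "cell_disruption": ["homogenizer", "bead_mill"],
    "clarification": ["centrifuge", "microfiltration"],
    "purification": ["ion_exchange", "gel_filtration"],
}
STRUCTURES = ["series", "series_with_recycle"]


def no_bead_mill_with_microfiltration(flowsheet) -> bool:
    """Hypothetical qualitative constraint rejecting one pairing of unit operations."""
    units = flowsheet["units"]
    return not (
        units["cell_disruption"] == "bead_mill"
        and units["clarification"] == "microfiltration"
    )


CONSTRAINTS = [no_bead_mill_with_microfiltration]

total = count_flowsheets(p=len(UNIT_OPS), u=2, s=len(STRUCTURES))
feasible = generate_and_test(UNIT_OPS, STRUCTURES, CONSTRAINTS)
print(f"{len(feasible)} feasible of {total} candidate flowsheets")
```

Exhaustive enumeration is only workable for toy problems like this one. Because the candidate count grows exponentially with the number of process steps, a practical system would apply the constraints while candidates are being built rather than after, which is the role heuristic search plays in the text.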

The primary motivation for a knowledge-based system to do process synthesis stems from the observation that human designers qualitatively propose several possible flowsheets for a product. The designers may then apply quantitative simulation and optimization techniques to determine the best process. The quantitative simulation and optimization techniques per se are not a focus of this work. This work attempts to characterize the qualitative reasoning as heuristic search. The goal of the heuristic search is to find feasible processes to make a given product by applying heuristic criteria which are stated as heuristics in the literature.

This dissertation introduces at least two novel Artificial Intelligence (AI) methods for process synthesis problem solving: constraint- and case-based reasoning approaches.

These methods are demonstrated in knowledge-based systems that assist human designers in systematically generating process flowsheets. In constraint-based reasoning, the heuristic criteria are represented as constraints. The constraints are then systematically applied to select a process by testing alternatives and to synthesize feasible flowsheets. Under case-based reasoning, previously synthesized process flowsheets are used for selection and synthesis of processes. The dissertation also focuses on the learning of process synthesis knowledge for use by a knowledge-based system. There are two distinct types of knowledge used in the aforementioned process synthesis methods: constraints and cases. Additionally, there are classification hierarchies of processes, not mentioned above. Inductive methods are developed so that a knowledge-based system can learn these knowledge representations on-line.

Quantitative knowledge, such as mathematical equations and models, also comprises a significant portion of the synthesis knowledge. The conceptual framework presented in this dissertation argues for the use of quantitative techniques along with qualitative methods. In this dissertation, however, the integration of the qualitative and quantitative techniques is not implemented, for several reasons: (i) there is a greater need to represent process synthesis knowledge in a form comprehensible to designers, who are already familiar with the quantitative techniques in their area; (ii) it should be easy to integrate quantitative techniques into the knowledge-based/expert system software, using “foreign-function” interfaces and “pipes” already built into programming languages [ex: (Steele, 1986), CLOS (Keene, 1989)] and operating systems (ex: ) respectively; and (iii) it is more advantageous to provide “hooks” in the knowledge-based environment than to offer a smorgasbord of quantitative techniques that constantly evolve in parallel to AI methods. This level of integration can therefore happen as and when required, for instance, when a designer decides to generate quantitative data by simulation to verify a result from qualitative reasoning.

1.2. LIMITED ROLE OF CURRENT COMPUTER-AIDED DESIGN TOOLS

The spectrum of computer-aided tools for chemical engineers ranges from programs like Flowtran, Aspen, and ACSL (Advanced Continuous Simulation Language), through ChemShare databases and IMSL routines, to in-house Fortran programs. A significant part of chemical engineering design education as well as practice involves process simulation (and simulation of controllers for designing control loops). The traditional simulation programs for design require the designer to specify a particular sequence of unit operations, that is, a flowsheet in the form of exactly known processes and the connections between them. If a process is not available in the library of models accessible to the simulation program, it cannot be simulated. Also, the simulation tool cannot help the designer create a new model by modifying, inheriting from, or combining models existing in the library.

Traditional simulation programs are limited by the nature of computing within the quantitative simulation methods. After a flowsheet has been represented in a simulation program, the actual simulation is done on the process variables and stream properties. The output of a simulation is usually a report containing the operating conditions of various streams and process units, e.g. steady state flowrates, temperature and pressure. The computation underlying the simulation largely involves solving mathematical models and differential equations that describe the input-output behaviors of various process units, which is incomplete for many analytical purposes.

A simulation report cannot verify a design in case of failures, for instance, when a designer is faced with a very tall distillation column or an extremely wide absorber. Nor can the simulation models identify failure situations by analyzing the report (how tall is tall, or how wide is wide) and be of assistance to the designer during redesign. The simulation program does not help in determining whether there is an alternative process to circumvent the failure. Following a simulation, a designer may have to abandon a process flowsheet after failing to redesign it. Similarly, the simulation program does not support tweaking the processes, such as proposing alternate synthesis strategies like recycle loops or multiple reactors for increasing product yields.

Thus the traditional computer-aided design methods in chemical engineering are, basically, not equipped to solve problems that arise before the simulation (how can a model that does not exist in the simulation library be developed?) and after the simulation (what are the design failures?), because such problems require reasoning at a conceptual level or using qualitative knowledge. So we often see process designers going back and forth between a conceptual proposal and its modification, giving rise to a redesign-simulation loop outside the computer-oriented environment, which is arguably inefficient.

Finally, the simulation programs offer no explanations or justifications for their results. The program’s behavior is opaque to users who would like to analyze in detail an unexpected simulation result, say one contributing to a design failure and requiring redesign.

Using a knowledge-based system along with a traditional simulation tool will alleviate most of these problems. A knowledge-based system that analyzes and generates flowsheets can help the designer handle the redesign-simulation loop efficiently when a particular flowsheet presents insoluble simulation or design problems. Thus, in the context of computer-aided design tools, the knowledge-based system will serve as a front end to the simulation tool. A combination of computer-oriented knowledge-based and traditional simulation tools can enable the designer not only to generate and test a large number of process flowsheets but also to do so in a reasonable amount of time.

1.3. AI IN CHEMICAL ENGINEERING

AI gives the necessary perspective to address those issues which have no formal basis within, or are simply outside the scope of, chemical engineering mathematical modelling. AI technologies like knowledge-based systems and qualitative simulation give computers the ability to search where straightforward mathematical solutions are nonexistent, using various forms of knowledge such as heuristics, constraints, design cases, and qualitative models that constitute formal text-book as well as informal experiential chemical process synthesis domain knowledge.

The reason for applying AI in chemical engineering is that the mathematical solutions to chemical engineering problems are sometimes too complex and the assumptions made are obfuscatory. Also, mathematical complexity occludes our understanding of physical concepts. The culprit here is not mathematics per se, but the occlusion of physical concepts that all professionals in a domain (chemical engineering) share by mathematical principles that only some will understand. Any computer-oriented design system developer, whether AI or non-AI, has to face the challenge of interpreting this theoretical (mathematical) complexity to justify or explain the system’s behavior and results, especially if a professional user’s decision making depends upon it.

AI meets the challenge of solving complex problems as well as explaining the problem solving for our understanding. In other terms, knowledge-based systems can help chemical engineers to be more productive at work or more efficient in making decisions; using an intelligent process synthesis system, for instance, a designer will be able to generate alternative flowsheets systematically for a given raw material and product specification.

It is possible that AI will eventually attract the same criticism levelled against mathematical methods. Descriptions of programs and their internals tend to skirt the knowledge-level issues involved in chemical engineering problem solving. AI methods, however, are more intuitive, and the details of their implementation are hidden from view. So if the presentation is confined to the knowledge level, there is little likelihood that the method will be dismissed as too complex. On the other hand, discussions inevitably spill over into programming representations; while this cannot be stopped, it can be contained at a level acceptable to all. And this level is currently very low, because chemical engineers are not expected to follow AI or computers eagerly anyway.

For the advancement of the field, knowledge-based modelling, like mathematical modelling, should be nurtured for its own merits: (i) it is intuitive to begin with, (ii) it takes into account human subjectivity (better if experience-based), (iii) it provides multiple representations for various kinds of knowledge, (iv) it integrates diverse problem solving strategies, (v) it provides confidence measures for decisions arrived at by using multiple knowledge representations, and (vi) it provides knowledge-level explanations for problem solving.

1.4. PRIMARY RESEARCH FOCUS: BIOPROCESS SYNTHESIS

Recently, the application of AI in bioprocess design has received increasing attention. Given a product of interest, there are four basic steps involved in bioprocess design:

(i) The first step is to find suitable biological routes or metabolic pathways for the production of a desired product. This is to search for an existing biochemical pathway or to establish a new pathway through genetic engineering of the existing organisms. Computer-aided metabolic pathway synthesis (Seressiotis and Bailey, 1988; Mavrovouniotis et al., 1990) can be used to help the establishment of a new pathway.

(ii) The second step is to generate a flowsheet structure for the process (this is referred to as process synthesis in this paper). Process synthesis is a task to realize a process structure in the form of process steps and unit operations for large scale production. The problem solving here is mostly heuristic. Many heuristics have been reported in the literature for selecting process steps and unit operations and synthesizing them into a flowsheet structure for protein purification (Asenjo et al., 1989; Wheelwright, 1989).

(iii) The third step is process design and simulation. Here the task is to design the various equipment necessary to carry out the unit operations in the flowsheet structure created from the previous process synthesis step. The process is then simulated to generate reports on equipment sizes, optimal operating conditions, energy utilization, various cost data, etc. Depending on whether mathematical models are available, one may choose alternative methods for design simulation. Petrides et al. (1989) used an expert system to design and simulate the production of porcine growth hormone, and Guthke et al. (1990a and 1990b) used fuzzy modeling for recombinant alpha interferon fermentation. Other applications of AI for bioprocess simulation include ethanol fermentation (Filev et al., 1985), stationary and mobile phase selection, and the mode of separation for chromatographic systems (Peichang and Hongxin, 1988) and optimization of chromatographic methods (Schoenmakers et al., 1990).

(iv) The last step is process development. An industrial process is finally developed based on the process design and simulation results. Tasks here include meeting FDA regulations, environmental laws, etc.

The choice of a knowledge-based approach for the bioprocess synthesis problem is motivated by several reasons. New bioprocesses are usually developed based on pilot plant and/or simulation studies. The underlying notion is that if a particular arrangement of bioprocesses can be shown to produce the desired product based on either experimentation or simulation, then it can be scaled up for industrial scale production. While the feasibility of a bioprocess is shown by experimentation, the computer simulation of the bioprocess forms the basis for equipment scale up. Thus, simulation and testing form the core of the justification for bioprocess scale up.

1.5. SECONDARY RESEARCH FOCUS: DEVELOPMENT OF PROCESS SYNTHESIS METHODS

Besides bioprocess synthesis problem solving, the research described in the dissertation makes the following novel contributions:

• An Object-Constraint Network (OCN) approach for handling the selection and synthesis subproblems of process synthesis

• An inductive learning algorithm for generating process synthesis constraints and heuristics from a flowsheet input

• An algorithm to generate hierarchical classification trees from classes and evidence sets

The dissertation also describes research done in applying the following AI methods to process synthesis problem solving:

• Task analysis to decompose the synthesis problem and characterize its search space

• Constraint-based reasoning for handling qualitative synthesis constraints

• Case-based reasoning for reasoning with previously designed processes and flowsheets

• Genetic algorithms for multicomponent separation sequence synthesis using non-distillation separation processes

1.6. ORGANIZATION OF THE DISSERTATION

The overall organization of this dissertation is as follows:

In Chapter 2, a decomposition of bioprocess synthesis into two tasks, selection and synthesis, is presented. The methods to solve the two tasks are also described: Hierarchical Classification and Critiquing for selection, and a constraint-based reasoning method called the Object-Constraint Network (OCN) approach for synthesis.

In Chapter 3, bioprocess synthesis is studied in detail. A case study is included to demonstrate the task-based and constraint-based approaches for bioprocess selection and synthesis. The case study involves creating alternate process flowsheets for a rDNA protein product.

Chapter 4 describes a learning algorithm to automatically acquire constraints used by the selection and synthesis tasks described in Chapter 2. The learning algorithm needs input in the form of flowsheets typically available in the process synthesis literature. A procedure has also been described to formulate process synthesis heuristics by generalizing the selection and synthesis constraints created by the learning algorithm.

Chapter 5 presents several conjectures about how to create classification hierarchies for the selection task described in Chapter 2. Using these conjectures as a basis, a learning algorithm called HC1, which automatically generates classification hierarchies, is described. Using the concepts that need to be classified and their evidence as input, the learning algorithm creates classification hierarchies.

Chapter 6 discusses case-based reasoning as an alternative to the task-based approach for synthesis of process flowsheets. It is proposed that by providing information on actual flowsheets to a computer program, new process flowsheets can be synthesized automatically. In contrast to the task-based approach, this method does not assume the availability of any heuristics. A learning component that enables a case-based reasoner to automatically update its knowledge base, using its own problem-solving experience, is also described. A case study illustrating the case-based reasoning approach for synthesis of an ammonia process is included in this chapter. The case study describes the generation of feasible flowsheets for making ammonia from coal using case-based reasoning.

Chapter 7 presents a task framework for chemical process synthesis. The task-oriented approach decomposes the process synthesis problem into three activities: (1) Skeletal Design, (2) Design Refinement and (3) Design Evaluation. Each of these is a complex activity that can be further decomposed into a number of tasks. The activity of skeletal design consists of: (1.1) Analysis & Decomposition, (1.2) Device Selection and (1.3) Structure Synthesis. The goal of skeletal design is to generate a rough flowsheet that consists of all the unit operations and their interconnections. Design Refinement comprises three tasks: (2.1) Device Refinement, (2.2) Performance Analysis and (2.3) Device Modification. The goal of the design refinement task is to add structural details to the rough flowsheet generated by skeletal design so that the design satisfies the input constraints. The third activity, Design Evaluation, is decomposed into three tasks: (3.1) Design Verification, (3.2) Qualitative Analysis and (3.3) Quantitative Optimization. The goal of design evaluation is to ensure the flowsheet is feasible and optimal.

Chapter 8 presents a literature review of all the major focuses in this dissertation, such as chemical and biotechnological process synthesis, task analysis, constraint-based reasoning, case-based reasoning and computer-oriented learning.

Finally, Chapter 9 summarizes the major contributions from the research and presents recommendations for further exploration.

CHAPTER II

A TASK APPROACH TO A KNOWLEDGE BASED SYSTEM FOR

SELECTION AND SYNTHESIS OF BIOPROCESSES

2.1. INTRODUCTION

Bioprocess design involves specification of an optimal interconnection of processing systems for conversion of raw materials into desired products. At each step of the design process, alternatives are generated and the best alternative is selected subject to the goals, specifications and constraints of the design. For example, to separate a mixture into pure components, several different separation operations such as distillation, absorption, drying, filtration, settling and membrane separation are available as alternatives. In turn, each of these separation operations can be further subdivided into more specifically defined operations. For example, ultrafiltration, microfiltration and reverse osmosis are specific kinds of membrane separations.

The knowledge for selection can be either qualitative or quantitative. Quantitative knowledge (e.g., protein isoelectric property data, purification efficiencies, costs, etc.), in the form of numerical data and mathematical equations, is better expressed as an algorithm for implementation by conventional programming means. Qualitative knowledge, which comprises the expert strategies for conducting the search and testing constraints, is better represented and put to use with a knowledge-based system approach. Toward this end, a generic task framework and programming shell for selection problem solving are presented in the following sections. Case studies to illustrate the approach are drawn from the biotechnology domain.

The rest of the chapter is organized as follows. First, the selection problem mentioned earlier is defined formally. Next, a task analysis of the selection problem is presented. The task methodology is demonstrated using an example. The chapter is concluded with remarks on selection failures and their handling methods.

The results of the task analysis described in this chapter are then applied to the bioprocess synthesis problem presented in Chapter 3. This chapter also establishes the necessary background for Chapters 4 and 5, which deal with the issues of computer-oriented learning of the selection knowledge used in task analysis.

2.2. THE SELECTION PROBLEM

A formal statement of the selection problem is:

Given a set of objects O, specifications S, and object attributes A, the objective of selection is to determine a subset of objects that best meets the set of constraints C.

Constraints on the selection arise mainly from the specifications of the problem. For example, the constraints for the selection of a separation bioprocess, say centrifugation, may concern the nature of the material being separated, such as an enzyme, antibiotic, bacterium or virus, or stream properties like pH, temperature, flowrate and so on. In addition to constraints from problem specifications, there are interaction constraints that arise from relationships between objects: the selection of one object may affect the selection of other objects. During problem solving, these interaction constraints act as local constraints. Local constraints affect relatively small sets of objects, in contrast to global specification constraints, which are applicable to almost all the objects in the search space.

The selection of bioprocess operations has been characterized as a complex heuristic search problem (Asenjo et al., 1989) that is intractable for conventional numerical computation

(Siletti, 1990). Although the characteristics of the search space and the problem-solving strategies for selection problems are essentially the same for a given application, there are too many combinations of goals, attributes and constraints to use weak methods like table-look-up or generate-and-test. So a computer program for selection is a useful tool for bioprocess designers.

2.3. TASK ANALYSIS OF ROUTINE SELECTION PROBLEM

A routine selection problem is one in which the sets O, S and C are fully specified. The proposed selection theory decomposes the selection activity into two tasks: preliminary selection and critical selection. In the following sections these two tasks are described in detail. The applicability of these tasks for routine selection has been previously demonstrated in several other knowledge-based systems: for the selection of process equipment (Gandikota, 1988), the selection of bioprocesses to process waste streams (Dayal, 1990) and the selection of separation processes for chemicals (Barnicki and Fair, 1990).

2.3.1. Preliminary Selection Task

The goal of preliminary selection is to identify a set of plausible objects that satisfy global constraints. They are only plausible objects because the preliminary selection task does not

consider local constraints of the selection problem. The aim of preliminary selection is to

prune as much of the search space as possible. The scope of preliminary selection, therefore,

is stated as:

Given a set of objects O, find a subset O1 that satisfies the set of global constraints G.

The task of preliminary selection involves hierarchical classification which has been

described in greater detail elsewhere (Chandrasekaran, 1986; Sik & Ramesh, 1989).

A well-organized hierarchical classification tree can minimize the computational

complexity of the search by pruning the branches of the tree without having to refine all the

higher level classes completely. Of course, this efficiency is realized only if the input data do

not cause the case where all objects have to be selected. In practice, such extreme cases are

rarely encountered. Generally speaking, a significant number of objects can be pruned, thus making the hierarchical classification strategy efficient.

From the preliminary selection point of view, classification hierarchies can be based on the following relationships:

(a) Function-Subfunction, e.g.: Heat generation is a subfunction of distillation column condenser.

(b) System-Subsystem, e.g.: Reboiler is a subsystem of distillation.

(c) Type of, e.g.: Spray drying is a type of liquid-solid separation.

(d) Part of, e.g.: Impeller is a part of pump.

For automatically generating classification trees, a program called HC1 was developed during this research. The details of HC1 are presented in Chapter 5.
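The pruning behaviour of hierarchical classification described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the SELECTRIX or HC1 implementation: the `ClassNode` structure, the `preliminary_select` function, the example hierarchy, and the numeric thresholds are all hypothetical names and values chosen for the illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Spec = Dict[str, object]


@dataclass
class ClassNode:
    """One class in a classification hierarchy, with its global-constraint test."""
    name: str
    establish: Callable[[Spec], bool]
    children: List["ClassNode"] = field(default_factory=list)


def preliminary_select(node: ClassNode, spec: Spec,
                       visited: Optional[List[str]] = None) -> List[str]:
    """Return the leaf classes established under `node`; rejected subtrees are skipped."""
    if visited is not None:
        visited.append(node.name)
    if not node.establish(spec):
        return []  # prune: none of this node's descendants are examined
    if not node.children:
        return [node.name]
    selected: List[str] = []
    for child in node.children:
        selected.extend(preliminary_select(child, spec, visited))
    return selected


# Hypothetical hierarchy and thresholds, for illustration only.
membrane = ClassNode(
    "membrane separation",
    lambda s: s["phase"] == "liquid",
    [
        ClassNode("microfiltration", lambda s: s["particle_size_um"] >= 0.1),
        ClassNode("ultrafiltration", lambda s: 0.001 <= s["particle_size_um"] < 0.1),
        ClassNode("reverse osmosis", lambda s: s["particle_size_um"] < 0.001),
    ],
)
distillation = ClassNode(
    "distillation",
    lambda s: not s["heat_sensitive"],
    [ClassNode("vacuum distillation", lambda s: True)],
)
root = ClassNode("separation", lambda s: True, [membrane, distillation])

spec = {"phase": "liquid", "particle_size_um": 0.5, "heat_sensitive": True}
visited: List[str] = []
plausible = preliminary_select(root, spec, visited)
```

In this sketch the distillation class is rejected for a heat-sensitive product, so its subclass is never examined; that skipped refinement is the saving the text attributes to a well-organized hierarchy.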

2.3.2. Critical Selection Task

The result of preliminary selection is a subset O1 of the initial set of objects O, which represents objects that are highly likely to be selected finally; but the determination of which is the most suitable object has not been made during preliminary selection. This has been postponed until the critical selection task. So the task of critical selection is the determination of the most suitable object by ranking the preliminary selection objects in O1 using known relationships among the objects, or relational constraints. Formally, the objective of the critical selection task is:

Given a set of objects O1 from the preliminary selection task, find a subset O2 that satisfies the set of relational constraints R.

The knowledge for ranking the objects is expressed as the following relational constraints.

(a) Negation Constraints: These are of the type: “If A is selected then X is rejected.”

(b) Elimination Constraints: These are of the type: “If B is rejected then X is rejected.”

(c) Addition Constraints: These are of the type: “If C is selected then X is selected.”

(d) Reinforcement Constraints: These are of the type: “If D is rejected then X is selected.”

2.4. CRITICAL SELECTION INFERENCE STRATEGY

The critical selection task is carried out by representing the various objects and their relational constraints as a directional-arc graph called an Object-Constraint Network (OCN). A set of nodes for the selection objects, and directional arcs corresponding to the various relational constraints among the objects, form an OCN. The direction of an arc in an OCN specifies the direction, from object to object, in which the constraint is applied (see Figure 1).

Critical selection problem solving using OCNs is composed of the following steps.
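As a rough illustration of the data structure just described, an OCN can be held as a set of objects plus a list of directed, typed arcs. The sketch below is an assumption about one convenient representation, not the representation used in SELECTRIX; the names `Arc`, `OCN`, `add_arc` and `incoming` are hypothetical. The three example arcs are constraints c1 to c3 from the example in Section 2.5.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, NamedTuple, Set


class Arc(NamedTuple):
    name: str    # constraint label, e.g. "c1"
    kind: str    # "negation", "elimination", "addition" or "reinforcement"
    source: str  # object the constraint is applied from
    target: str  # object the constraint is applied to


@dataclass
class OCN:
    """Object-Constraint Network: nodes are objects, arcs are relational constraints."""
    objects: Set[str] = field(default_factory=set)
    arcs: List[Arc] = field(default_factory=list)

    def add_arc(self, name: str, kind: str, source: str, target: str) -> None:
        self.objects.update((source, target))
        self.arcs.append(Arc(name, kind, source, target))

    def incoming(self) -> Dict[str, List[Arc]]:
        """Group the arcs by the object they are directed at."""
        by_target: Dict[str, List[Arc]] = defaultdict(list)
        for arc in self.arcs:
            by_target[arc.target].append(arc)
        return dict(by_target)


ocn = OCN()
ocn.add_arc("c1", "negation", "A", "B")
ocn.add_arc("c2", "addition", "B", "C")
ocn.add_arc("c3", "negation", "D", "C")

# Objects with two or more arcs directed at them are where arc conflicts can arise.
converging = {obj: [a.name for a in arcs]
              for obj, arcs in ocn.incoming().items() if len(arcs) >= 2}
```

Grouping arcs by target is a design choice made here because the consistency checks that follow look at arcs converging on the same object.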

2.4.1. Activation Phase

The OCN arcs are labelled active depending on whether their from- or to-objects are selected during preliminary selection. In terms of the constraint definitions, negation and addition arcs are activated if their from-objects are selected; elimination and reinforcement arcs are activated if their to-objects are rejected.

2.4.2. Consistency Checking Phase

After activating the OCN, the arcs are checked for consistency. Two types of arc-consistency checks are necessary to resolve any conflicts among constraint arcs and guarantee that a solution can be achieved after constraint propagation.

3OBJECT Consistency Check: This check is made to resolve any conflicts when three objects are connected such that arcs from two of the objects are directed at the third object, as shown in Figure 1. The rules for the 3OBJECT consistency check algorithm are also shown in Figure 1.

For example, read the Negation versus Reinforcement entry in the table as: If a1 is a Negation arc and a2 is a Reinforcement arc, then remove object X from the OCN.

a1 \ a2         Negation                Elimination             Addition                Reinforcement
Negation        Remove Object X         Remove Object X         Remove Object X         Remove Object X
Elimination     Remove Object X         Do Nothing              Do Nothing              Do Nothing
Addition        Remove Object X         Do Nothing              Do Nothing              Remove one of the arcs
Reinforcement   Remove Object X         Do Nothing              Remove one of the arcs  Do Nothing

("Remove Object X" means remove object X from the OCN.)

Figure 1: 3OBJECT Consistency Check Rules
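The rule table in Figure 1 lends itself to a direct lookup. The following sketch transcribes the table into a Python dictionary keyed by the pair of arc types; the names `RULES_3OBJECT` and `three_object_action` are hypothetical, and the action strings are only labels (applying an action to an actual network is left out).

```python
REMOVE_X = "remove object X from OCN"
REMOVE_ARC = "remove one of the arcs"
NOTHING = "do nothing"

KINDS = ("negation", "elimination", "addition", "reinforcement")

# Rows are a1, columns are a2, in the order of KINDS (transcribed from Figure 1).
_ROWS = {
    "negation":      (REMOVE_X, REMOVE_X, REMOVE_X, REMOVE_X),
    "elimination":   (REMOVE_X, NOTHING, NOTHING, NOTHING),
    "addition":      (REMOVE_X, NOTHING, NOTHING, REMOVE_ARC),
    "reinforcement": (REMOVE_X, NOTHING, REMOVE_ARC, NOTHING),
}

RULES_3OBJECT = {
    (a1, a2): action
    for a1, row in _ROWS.items()
    for a2, action in zip(KINDS, row)
}


def three_object_action(a1: str, a2: str) -> str:
    """Look up the action for two arcs of kinds a1 and a2 directed at the same object X."""
    return RULES_3OBJECT[(a1, a2)]


example = three_object_action("negation", "reinforcement")
```

The table is symmetric in a1 and a2, which is what one would expect when both arcs point at the same object and neither has a privileged role.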

2OBJECT Consistency Check: This check is made to remove any conflicts when two objects are connected such that an arc from each of the objects is directed at the other. The rules for 2OBJECT consistency checks are shown in Figure 2.

The second step of ranking implements the above consistency checks by applying the following algorithms.

2.4.2.1. Primary Consistency Check Algorithm

Input: OCN

Output: Consistent OCN

1. Apply 3OBJECT consistency check on OCN

2. Are any changes made to OCN? If yes, then go to 1 else continue

3. Apply 2OBJECT consistency check on OCN

4. Are any changes made to OCN? If yes, then go to 3 else continue

5. Are any changes made to OCN during this pass? If yes, then go to 1 else exit.

The worst-case time complexity of this algorithm on an input consisting of n nodes and k connections per node is k*n. In the best case, where n-1 objects are rejected by the negation constraints from a single object, the time complexity is just k.
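The control flow of steps 1 to 5 is a fixed-point iteration: each check is repeated until it changes nothing, and the whole pass is repeated until neither check changes anything. A minimal Python sketch of that loop is given below. It assumes, as a convention introduced only for this illustration, that each check function mutates the network in place and returns True when it changed something; the two stub checks and the counter-based "network" are placeholders, not real consistency checks.

```python
from typing import Callable, Dict


def primary_consistency_check(
    ocn: Dict[str, int],
    check_3object: Callable[[Dict[str, int]], bool],
    check_2object: Callable[[Dict[str, int]], bool],
) -> Dict[str, int]:
    """Repeat both checks until a full pass leaves the OCN unchanged."""
    while True:
        changed_this_pass = False
        while check_3object(ocn):   # steps 1-2
            changed_this_pass = True
        while check_2object(ocn):   # steps 3-4
            changed_this_pass = True
        if not changed_this_pass:   # step 5
            return ocn


# Stub checks: each call resolves one pending conflict, if any remain.
def stub_3object(net: Dict[str, int]) -> bool:
    if net["three_object_conflicts"] > 0:
        net["three_object_conflicts"] -= 1
        return True
    return False


def stub_2object(net: Dict[str, int]) -> bool:
    if net["two_object_conflicts"] > 0:
        net["two_object_conflicts"] -= 1
        return True
    return False


network = {"three_object_conflicts": 2, "two_object_conflicts": 1}
consistent = primary_consistency_check(network, stub_3object, stub_2object)
```

The loop terminates as long as every change strictly reduces the network (an object or arc is removed), which is the case for all actions listed in Figures 1 and 2.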

After the primary consistency check algorithm returns a consistent OCN, some simple

network consistency checks are done to arrive at the critical selection outputs. The secondary

network consistency check algorithm is as follows.

For example, read the Negation versus Reinforcement entry in the table as: If a2 is a Negation arc and a1 is a Reinforcement arc, then remove object X from the OCN.

a2 \ a1         Negation                Elimination             Addition                Reinforcement
Negation        Remove one of the arcs  Remove Object Y         Remove Object X         Remove Object X
Elimination     Remove Object X         Remove one of the arcs  Remove Elimination Arc  Remove Object X
Addition        Remove Object Y         Remove Elimination Arc  Do Nothing              Remove one of the arcs
Reinforcement   Remove Object Y         Remove Object Y         Remove one of the arcs  Remove one of the arcs

("Remove Object X" and "Remove Object Y" mean remove that object from the OCN.)

Figure 2: 2OBJECT Consistency Check Rules

2.4.2.2. Secondary Consistency Check Algorithm

1. If an object has no arc then the outcome of critical selection is the same as the outcome of preliminary selection

2. There cannot be self-referential arcs (i.e., if an arc begins at and ends at the same object then generate parse error)

3. If there is no network (i.e. in LISP, if network=nil) then report critical selection failure

4. If all the arcs are elimination arcs then report critical selection failure

5. All the objects at the end of negation arcs from selected objects are rejected

6. All the objects at the end of elimination arcs from rejected objects are rejected

7. All the objects at the end of addition arcs from selected objects are selected

8. All the objects at the end of active reinforcement arcs from rejected objects are selected

The worst-case time complexity of the secondary consistency check algorithm is equal to n + e + a + r + k, where n, e, a and r are the numbers of remaining negation, elimination, addition and reinforcement arcs, respectively, in the OCN, and k is a constant. In the best case the time complexity is just k, which occurs when all the OCN arcs have been processed by the primary consistency check algorithm.
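The eight rules of the secondary consistency check can be sketched as a single pass over the remaining arcs, which is consistent with the linear cost stated above. The sketch below is one possible reading, not the dissertation's implementation: arcs are plain `(kind, from_object, to_object)` tuples, arcs are visited once in list order (so earlier outcomes can influence later arcs), and the exception name `CriticalSelectionFailure` is hypothetical.

```python
from typing import Dict, List, Tuple

Arc = Tuple[str, str, str]  # (kind, from_object, to_object)


class CriticalSelectionFailure(Exception):
    """Raised when rules 3 or 4 report a critical selection failure."""


# kind -> (status required of the from-object, status given to the to-object)
EFFECT = {
    "negation": ("selected", "rejected"),       # rule 5
    "elimination": ("rejected", "rejected"),    # rule 6
    "addition": ("selected", "selected"),       # rule 7
    "reinforcement": ("rejected", "selected"),  # rule 8
}


def secondary_consistency_check(status: Dict[str, str],
                                arcs: List[Arc]) -> Dict[str, str]:
    """Return the critical selection outcome for each object."""
    if not arcs:                                            # rule 3
        raise CriticalSelectionFailure("no network")
    if any(src == dst for _, src, dst in arcs):             # rule 2
        raise ValueError("self-referential arc")
    if all(kind == "elimination" for kind, _, _ in arcs):   # rule 4
        raise CriticalSelectionFailure("all arcs are elimination arcs")
    result = dict(status)  # rule 1: objects without arcs keep their preliminary outcome
    for kind, src, dst in arcs:                             # rules 5-8
        required, imposed = EFFECT[kind]
        if result.get(src) == required:
            result[dst] = imposed
    return result


# Hypothetical objects P to T, for illustration only.
preliminary = {"P": "selected", "Q": "rejected", "R": "selected",
               "S": "selected", "T": "selected"}
remaining_arcs = [("negation", "P", "R"), ("elimination", "Q", "S")]
outcome = secondary_consistency_check(preliminary, remaining_arcs)
```

Here R is rejected by the negation arc from the selected object P, S is rejected by the elimination arc from the rejected object Q, and T, which has no arcs, keeps its preliminary outcome.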

Note that the basis for network consistency check rule 4 is that, of the four types of relational constraints, only elimination constraints can cause zero selections at the end of ranking. Such a situation is called selection failure. Selection failures are handled separately by a failure handler that is invoked at the end of critical selection. The failure handling task will be described in more detail in section x.

2.5. AN EXAMPLE OF CRITICAL SELECTION

An example to illustrate the consistency checking algorithms will be presented now.

Consider an OCN shown in Figure 3(a) where the interaction constraints among the preliminary selection objects A, B, C, D, E and F are:

c1: negation constraint from A to B
c2: addition constraint from B to C
c3: negation constraint from D to C
c4: elimination constraint from D to E
c5: negation constraint from E to F
c6: reinforcement constraint from A to F
c7: reinforcement constraint from C to F
c8: addition constraint from A to D
c9: elimination constraint from C to FI
c10: elimination constraint from F to A

We first apply the 3OBJECT consistency check (Figure 1) on the OCN in Figure 3(a) with the following results:

(a) Object B is rejected because c1 is a negation constraint and c7 is a reinforcement constraint directed toward B.

(b) Object C is rejected because c2 is an addition constraint and c3 is a negation constraint directed toward C.

(c) Object F is rejected because c5 is a negation constraint and c6 is a reinforcement constraint directed toward F.

The state of the OCN after the 3OBJECT consistency check is shown in Figure 3(b), where the grey circles represent rejected objects.

[Figure 3 legend: A, B, C, D, E, F: preliminary selection objects; c1, c3, c5: negation constraints; c2, c8: addition constraints; c4, c9, c10: elimination constraints; c6, c7, c11: reinforcement constraints. Panels: (a) initial configuration of OCN; (b) OCN configuration after the 3-object consistency check (c1 & c7 reject B, c2 & c3 reject C, c5 & c6 reject F); (c) OCN configuration after the 2-object consistency check (c4 & c11 reject E); (d) result of the secondary consistency check algorithm (Rule #6: c10 rejects A).]

Figure 3: OCN configuration at the end of critical selection resulting in the selection of D and rejection of A, B, C, E and F.

Next we apply the 2OBJECT consistency check (Figure 2) on the OCN in Figure 3(b) with the following results:

Object E is rejected because c4 is an elimination constraint and c11 is a reinforcement constraint (refer to the 2OBJECT consistency check rules).

The state of the OCN after applying the 2OBJECT consistency check is shown in Figure 3(c), where the objects rejected so far, B, C, E and F, are shown in grey circles. The remaining objects for critical selection are A and D. At this stage the secondary consistency checking algorithm is invoked on the OCN in Figure 3(c). Rule #6 of the secondary consistency checking algorithm rejects object A because c10 is an elimination constraint. The resulting OCN is shown in Figure 3(d). The only remaining object is D, which is selected as there are no further interaction constraints pointing to it from non-rejected objects.

2.6. FAILURE HANDLING

If no single object satisfies the constraints at the end of preliminary selection and critical selection, i.e. if all the object specialists are rejected, then an impasse is reached. A failure is an impasse that may be reversed by appropriate failure handling knowledge. For a preliminary selection failure, the failure handling strategy for a rejected object uses the following information:

1. The status of each global constraint, satisfied or unsatisfied, which flags whether the constraint has been met. A constraint is flagged by comparing it with the expected nominal value of that constraint for a successful selection. If a default value is not available, the global constraint is flagged as unsatisfied to indicate that it could be the cause of the failure. However, this conclusion is not final until the pattern matching test has been taken into account.

2. The patterns in the specialist's pattern matching test, which establish the unsatisfied constraints conclusively. The constraints in the condition elements of the rules in the pattern match test that are suspected of causing the failure as per (1) are verified against their true values to confirm that they are indeed the cause of failure.

The preliminary selection failure handling strategy is to use the above two kinds of information to propose constraint relaxation.

2.6.1. Preliminary Selection Failure Handling Algorithm

The preliminary selection failure handling strategy involves execution of the following steps:

Do the following until all the rules have been tried:

1. Determine the set of rows R in the pattern match test with confidence values greater than zero

2. Determine the most specific row r in R

3. Determine the set of unsatisfied patterns P in r

4. For each unsatisfied pattern p in P do

(a) Ask the user to give a revised specification, if possible.

(b) Check if all the conditions in P are satisfied

5. If P is not matched then select the next general r in R and go to 2. If P is matched then

change the status of the object from rejected to selected and exit.

The worst-case-time-complexity of this algorithm is given by R*P. At the end of

preliminary selection failure handling, if the status of none of the rejected objects has

changed then the problem solving is said to have reached an unresolved impasse.
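The failure handling loop above can be sketched as follows. Several details are assumptions made only for this illustration and are not stated in the text: a row of the pattern match test is modelled as a mapping from attribute names to predicates plus a confidence value, "most specific" is taken to mean "has the most patterns", and the user is modelled by a callback `ask_user(attribute, old_value)` that returns a revised value or None when no revision is possible.

```python
from typing import Callable, Dict, List, NamedTuple, Optional

Spec = Dict[str, object]
Pattern = Callable[[object], bool]


class Row(NamedTuple):
    patterns: Dict[str, Pattern]  # attribute name -> test on its value
    confidence: int


def handle_preliminary_failure(
    rows: List[Row],
    spec: Spec,
    ask_user: Callable[[str, Optional[object]], Optional[object]],
) -> bool:
    """Return True if a relaxed specification lets a row match (rejected -> selected)."""
    candidates = [r for r in rows if r.confidence > 0]              # step 1
    candidates.sort(key=lambda r: len(r.patterns), reverse=True)    # step 2: most specific first
    for row in candidates:
        unsatisfied = [attr for attr, test in row.patterns.items()
                       if attr not in spec or not test(spec[attr])]  # step 3
        for attr in unsatisfied:                                     # step 4(a)
            revised = ask_user(attr, spec.get(attr))
            if revised is not None:
                spec[attr] = revised
        if all(attr in spec and test(spec[attr])                     # step 4(b)
               for attr, test in row.patterns.items()):
            return True                                              # step 5: object re-selected
    return False  # no row could be matched: unresolved impasse


# Hypothetical pattern match test for one rejected object.
rows = [
    Row({"pH": lambda v: 5 <= v <= 8, "temperature_C": lambda v: v <= 40}, 3),
    Row({"pH": lambda v: 5 <= v <= 8}, 1),
    Row({"flowrate": lambda v: v > 0}, 0),
]
spec = {"pH": 7, "temperature_C": 60}
recovered = handle_preliminary_failure(
    rows, spec, lambda attr, old: 35 if attr == "temperature_C" else None)
```

With R candidate rows and at most P patterns per row, the loop asks at most R*P questions, in line with the complexity quoted above.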

An unresolved impasse is a deadlock state attained when the design specifications are over-constraining and prevent a successful selection. Unresolved impasses should not be construed as lack of knowledge.

An unresolved impasse means the constraint-based reasoning cannot solve the particular case the user has presented.

2.6.2. Critical Selection Failure Handling Algorithm

Critical selection failure is caused mainly by the elimination constraints. The critical selection failure handling strategy involves the following steps on the OCN:

With each active elimination arc in the OCN do the following:

1. Ask the user if the elimination constraint can be suspended. If it can be suspended then exit from failure and carry out critical selection. Else continue.

2. Ask the user if the preliminary selection has to be re-done. If the user wants to redo preliminary selection then exit from failure else report an unresolved impasse.

The worst-case time complexity of this algorithm is at most E, the number of elimination arcs.
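A compact sketch of this strategy is shown below. It follows one reading of the two steps, namely that the user is asked about each active elimination arc in turn and is asked about redoing preliminary selection only if no arc can be suspended; the text does not settle this ordering. The `Outcome` enum and the two callbacks standing in for the user are hypothetical.

```python
from enum import Enum
from typing import Callable, Iterable, Optional, Tuple


class Outcome(Enum):
    RETRY_CRITICAL = "suspend an elimination arc and repeat critical selection"
    REDO_PRELIMINARY = "redo preliminary selection"
    UNRESOLVED_IMPASSE = "unresolved impasse"


def handle_critical_failure(
    active_elimination_arcs: Iterable[str],
    can_suspend: Callable[[str], bool],
    redo_preliminary: Callable[[], bool],
) -> Tuple[Outcome, Optional[str]]:
    """Return the outcome and, if an arc was suspended, the name of that arc."""
    for arc in active_elimination_arcs:   # step 1, at most E questions
        if can_suspend(arc):
            return Outcome.RETRY_CRITICAL, arc
    if redo_preliminary():                # step 2
        return Outcome.REDO_PRELIMINARY, None
    return Outcome.UNRESOLVED_IMPASSE, None


# The user is simulated here: only the arc named "c10" may be suspended.
result = handle_critical_failure(["c4", "c10"], lambda arc: arc == "c10", lambda: False)
```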

2.7. SELECTRIX: A KNOWLEDGE BASED SYSTEM SHELL FOR SELECTION

SELECTRIX is a knowledge-based system shell developed for implementing routine selection problems using the task analysis. As a shell it embodies the domain-independent representation and problem-solving strategies associated with the preliminary and critical selection tasks and their integration. SELECTRIX, therefore, facilitates the encoding of relevant knowledge about the tasks for a specific problem as an object space, specifications and constraints. To build a knowledge-based system for selection, the user need only declare the domain-specific object space and constraints from the input specifications and use the embedded task-oriented methods to solve selection problems. SELECTRIX is implemented in the Common Lisp Object System (CLOS; Keene, 1989).

2.8. SUMMARY

In this chapter, a task analysis of selection problems has been presented based on the research in the domains of chemical process engineering and biotechnology. A routine selection problem was decomposed into two subproblems: preliminary and critical selection. Hierarchical classification and constraint-based reasoning have been shown to be efficient methods for preliminary and critical selection problem solving, respectively. A knowledge-based system framework for computer-oriented selection problems, called SELECTRIX, has been developed. SELECTRIX is a knowledge-based system shell that facilitates knowledge acquisition, representation and implementation of routine selection problems using generic data structures to capture the object space and constraints. The inference for selection problem solving in SELECTRIX is accomplished by the embedded generic task methods.

In the next chapter, the application of the task analysis incorporated in SELECTRIX is described for bioprocess synthesis problem solving.

CHAPTER III

APPLICATION OF SELECTRIX TO BIOPROCESS SYNTHESIS

An application of SELECTRIX to bioprocess synthesis is presented in this chapter. The chapter describes in detail the knowledge necessary for selection and synthesis of bioprocesses, which is used to develop a knowledge-based system for bioprocess synthesis. The ability of the knowledge-based system built on SELECTRIX to synthesize novel bioprocesses is demonstrated.

3.1. KNOWLEDGE NECESSARY FOR BIOPROCESS SYNTHESIS

The necessary knowledge for bioprocess synthesis is generally unorganized and scattered over many textbooks, patents, and journal articles. A human expert designer accumulates such knowledge over years of experience, which distinguishes an expert from a novice. The availability of a human expert does not necessarily entail a knowledge-based system, for various reasons. Most of all, the human expert should be able to articulate the design knowledge in a form suitable for knowledge-based system implementation. A knowledge engineer has to determine which knowledge can be directly represented and which cannot be. That is, the knowledge engineer has to distinguish what is computationally feasible from what is infeasible because of a lack of knowledge.

Besides the issue of what is computationally feasible and infeasible, there are technical limitations as to how far one can succeed in synthesizing not only correct but optimal flowsheets for any given product from any given raw material in the biotechnology domain itself. Some of the limitations which arise from the lack of necessary knowledge are discussed in the following:

• Reaction Pathways: To generate a product Z from a raw material A, there should be a feasible biological reaction that transforms A to Z, or a sequence of reaction pathways that perhaps transforms A to B to C to D...to Z. To answer the question of whether Z can be generated from A, one has to either know a priori that it is possible or impossible, or be able to search through the known set of reaction pathways after taking into account stoichiometric constraints, restrictions on intermediate metabolites, etc. The latter problem, which involves synthesis of biochemical pathways by searching through biochemical reactions and applying constraints, has led to the development of interesting algorithms using AI methodologies (Mavrovouniotis et al., 1990).

• Separation Processes: Some bioproducts have to be made at a very high purity because they have pharmaceutical, biomedical or food applications. If there is no separation method that can separate the desired product from a certain contaminant, human designers have to either develop a separation method or settle for a contaminated product (so long as it does not harm anyone). As a result, until a separation method is available the constraint on product purity cannot be met in the real world. This lack of information will be reflected directly in any knowledge-based system. Until a separation process is developed and made available to the system, it cannot recommend a separation process to meet a certain product purity constraint. The problem of proposing a novel separation method is a scientific endeavor that is not a part of the process synthesis problem solving that the knowledge-based system is designed to deal with.

• Physical and Chemical Data: In the selection of processes it is important to consider physical and chemical data of bioproducts and bioreactions. The knowledge-based system cannot generate the necessary data if the data are not available, unless there is a model for generating such data. For instance, the system may be instructed to use the viscosity data of substance B for substance A, or provided with a method of reasoning by analogy to derive the viscosity of A from the known viscosity of B. That means the system can only derive the unknown data by inference if the human expert provides a basis for such an inference.

In addition to the basic knowledge discussed above, knowledge about process step sequence, functions and types of various unit operations used in bioprocessing, and knowledge from prior experience are needed for building the system for bioprocess synthesis (see Appendix

A for the knowledge-base for bioprocess synthesis).

3.2. KNOWLEDGE ABOUT SEQUENCING OF PROCESS STEPS

At the heart of bioprocess synthesis problem solving is the knowledge about how to connect various process steps. In the development of a continuous process, constraints about how to sequence process steps are specified to ensure appropriate inputs to downstream processes from upstream processes. There are several heuristics in the literature which describe how to sequence processes so that the resulting flowsheet meets the constraints on continuous processing as well as optimality. Figure 4 shows a sequencing scheme for processes making recombinant proteins.

The sequencing scheme in Figure 4 starts with Sterilization, followed by Fermentation and Cell Separation. After Cell Separation, depending on whether the protein is intracellular or extracellular, an appropriate path is chosen. In an extracellular path, two steps involving Concentration and Purification will result in the product. Intracellular product recovery is complicated because the product is located within the cells and may also be in a soluble or insoluble form. In this case, after Cell Separation, Cell Disruption and Product Recovery are required. If the product is soluble, then the next steps are the same as for an extracellular product after Cell Separation, whereas for an insoluble protein product the next steps after Product Recovery are: Solubilization, Refolding, Concentration and Purification.

[Figure 4 flowchart: Sterilization -> Fermentation -> Cell Separation. Extra-Cellular branch: Concentration -> Purification. Intra-Cellular branch: Cell Disruption -> Product Recovery; Soluble: Concentration -> Purification; Insoluble: Solubilization -> Refolding -> Concentration -> Purification.]

Figure 4: Sequencing of process steps for a recombinant protein process

It should be noted that the scheme described above for sequencing process operations is applicable to a wide variety of bioprocesses. Most importantly, it provides a clear algorithmic basis for the knowledge-based system to solve the sequencing problem. One way to express the sequencing algorithm is to represent the sequencing or synthesis constraints as rules. Table 1 shows the rules applied to the flowsheet shown in Figure 5 for an intracellular protein process.

3.3. KNOWLEDGE OF UNIT OPERATIONS

After sequencing the bioprocess flowsheet, one needs to determine appropriate unit operations to carry out the process steps. In Figure 6, knowledge about the necessary unit operations is represented as a classification hierarchy. At the first level of the classification hierarchy (read from left to right) the process steps referred to in the previous section are represented. The second and third levels in the hierarchy contain the appropriate unit operations to carry out the process steps. The task of determining the unit operations applicable for a given process step can be stated as a selection problem (Gandikota et al., 1991). For a Cell Separation process step, Centrifugation, Microfiltration, Ordinary Filtration, and Sedimentation are the available unit operations. The applicability of these Cell Separation unit operations depends on the actual product to be made from the process.

Figure 5: Bioprocess for an intracellular protein (boxes: Sterilization, Fermentation, Cell Separation, Cell Disruption, Product Recovery, Solubilization, Refolding, Concentration, Purification)

Figure 6: Hierarchical Classification of Bio-Process Steps and their Unit Operations (root node: Bio Operations)

Process Steps          Unit Operations
Media Sterilization    Thermal, Membrane, Chemical
Fermentation           Mechanical Agitation, Air Lift, Immobilized Cell
Cell Separation        Centrifugation, Microfiltration, Filtration, Sedimentation
Cell Disruption        Homogenization, Ball Milling, Enzyme Treatment, Chemical Treatment
Product Recovery       Extraction, Precipitation, Adsorption, Distillation, Absorption
Concentration          Reverse Osmosis, Evaporation, Ultrafiltration
Purification           Chromatography (Ion-Exchange, Gel Permeation, Affinity, HPLC, Immuno-Affinity), Electro Dialysis, Crystallization
Polishing              Drying, Spray Drying

Table 1. Synthesis constraints on bioprocesses

If the process is recombinant protein Fermentation
    Then the first step is Sterilization of the medium
    The output of Sterilization is the input of Fermentation
    The output of Fermentation is the input to Cell_Separation

If the product is extracellular
    Then the output of Cell_Separation is the input to Concentration

If the product is intracellular
    Then the output of Cell_Separation is the input to Cell_Disruption
    The output of Cell_Disruption is the input to Product_Recovery

If the product is intracellular and soluble
    Then the output of Product_Recovery is the input to Concentration

If the protein is intracellular and insoluble
    Then the output of Product_Recovery is the input of Solubilization
    The output of Solubilization is the input of Refolding
    The output of Refolding is the input of Concentration

The output of Concentration is the input of Purification
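The synthesis constraints of Table 1 can be read as a small sequencing procedure. The following is a minimal sketch in Python, not the Common Lisp implementation described in this chapter; the function name `sequence_steps` and its two arguments are assumptions introduced only for illustration.

```python
def sequence_steps(location, soluble=True):
    """Order the process steps for a recombinant protein process.

    `location` is "extracellular" or "intracellular"; `soluble` only
    matters for intracellular products. Both names are hypothetical.
    """
    if location not in ("extracellular", "intracellular"):
        raise ValueError("location must be 'extracellular' or 'intracellular'")

    # Sterilization -> Fermentation -> Cell_Separation applies to every case.
    steps = ["Sterilization", "Fermentation", "Cell_Separation"]

    if location == "intracellular":
        # The cells must be broken open and the product recovered.
        steps += ["Cell_Disruption", "Product_Recovery"]
        if not soluble:
            # Insoluble product: solubilize and refold before concentrating.
            steps += ["Solubilization", "Refolding"]

    # Concentration always feeds Purification.
    steps += ["Concentration", "Purification"]
    return steps
```

Calling `sequence_steps("intracellular", soluble=False)` yields the nine-step sequence described for an intracellular insoluble protein.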

For each unit operation, selection rules are represented in the classification hierarchy. Most

of these selection rules are taken from the literature (Asenjo, et al., 1989), while the rest are

developed anew. See Appendix x for the various selection rules.

The selection rules for membrane filtration and immobilized cell fermentation are:

• If large capacity is required & primary recovery is involved & the product is intracellular

Then membrane filtration is selected with a confidence value=8

• If a high cell density is required & the product is a primary metabolite & the product is extracellular & large production capacity is required & a continuous operation is desired

Then immobilized cell bioreactor is selected with a confidence value=9

The confidence values in the rules are a measure of the certainty with which the operation can

be selected given the preconditions in the rule. The confidence value has a 0-10 range; 0 meaning ‘absolutely rejected’ and 10 meaning 'absolutely selected’.
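One way to picture such rules is as a set of preconditions paired with a confidence value. The sketch below is an illustration in Python under stated assumptions: the class `SelectionRule`, the function `fire`, and the fact strings are hypothetical names, and the treatment of rules whose preconditions are not all met (they simply do not fire) is an assumption, since the text does not specify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionRule:
    operation: str
    preconditions: frozenset
    confidence: int  # 0 = absolutely rejected, 10 = absolutely selected


# The two rules quoted in the text, encoded with illustrative fact names.
RULES = [
    SelectionRule(
        "membrane filtration",
        frozenset({"large capacity", "primary recovery", "intracellular product"}),
        8,
    ),
    SelectionRule(
        "immobilized cell bioreactor",
        frozenset({"high cell density", "primary metabolite",
                   "extracellular product", "large capacity",
                   "continuous operation"}),
        9,
    ),
]


def fire(rules, facts):
    """Return {operation: confidence} for rules whose preconditions all hold."""
    return {r.operation: r.confidence for r in rules if r.preconditions <= facts}
```

For the facts `{"large capacity", "primary recovery", "intracellular product"}`, only the membrane filtration rule fires, with confidence 8.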

The rules are then tested by an algorithm called establish-refine. If a unit operation is

rejected then all its subtypes are also rejected. The computational advantage of the

establish-refine procedure can be seen, for example, if for a given product Chromatography

is rejected for the Purification then none of its five sub-types need to be considered further.

Thus the establish-refine procedure offers a great deal of computational efficiency by

pruning portions of search space involved in the selection problem.
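The pruning effect can be illustrated with a short recursive sketch in Python. It assumes the Purification branch of the hierarchy is stored as nested dictionaries and that the outcome of the selection rules is available as a predicate `established`; both are assumptions made for this example, not details taken from the implementation.

```python
# Purification branch of the classification hierarchy as nested dicts
# (an assumed encoding; an empty dict marks a tip node).
PURIFICATION = {
    "Chromatography": {
        "Ion-Exchange": {},
        "Gel Permeation": {},
        "Affinity": {},
        "HPLC": {},
        "Immuno-Affinity": {},
    },
    "Electro Dialysis": {},
    "Crystallization": {},
}


def establish_refine(name, children, established, visited):
    """Return the surviving tip nodes under `name`.

    `established(name)` stands in for the selection rules; `visited`
    records every node that was actually tested.
    """
    visited.append(name)
    if not established(name):
        return []  # rejecting a node rejects all of its subtypes untested
    if not children:
        return [name]
    selected = []
    for child, grandchildren in children.items():
        selected += establish_refine(child, grandchildren, established, visited)
    return selected
```

If Chromatography is rejected, its five subtypes never appear in `visited`, which is the saving described above.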

After applying establish-refine procedure the selected objects are tested for interaction and

refinement constraints which are described in the following sections. The solution algorithm

for solving these constraints involves creation of an Object-Constraint Network (OCN) and

then propagating the constraints.

3.4. KNOWLEDGE OF INTERACTIONS AMONG UNIT OPERATIONS

So far, the synthesis problem and selection problems have been described in a manner that selection follows synthesis sequentially. However, this is not always the case. In practice, a designer may postpone a synthesis decision until a selection can be made first. If a selection cannot be made for some reason then the synthesis cannot proceed. A selection failure may occur if there are interactions that preclude one unit operation from being selected in preference to another. Examples of these interaction constraints are as follows:

A negation constraint on the sterilization operations is: “If thermal or membrane sterilization is selected then chemical sterilization is rejected.” This is because chemical sterilization contaminates the final product and causes problems in downstream processing, and generally is not used in industry.

An example of an elimination constraint on the Cell Separation operations is: “If centrifugation is rejected because of low particle density then sedimentation is also rejected. ” This is because sedimentation requires higher particle density than centrifugation does.

An addition constraint for Product Recovery is: “If precipitation is selected then filtration for precipitate recovery is selected.” This is because filtration is usually used to separate the precipitate from the liquid.

An example of reinforcement constraint for Fermentation is: “If airlift bioreactor is rejected then fermentation with mechanical agitation is selected.” This is because there are basically only two types of fermentors providing adequate mixing. If one is rejected, the other one must be selected.

These interaction constraints make the system more effective in selection. They help to eliminate the less desirable choices and prune the potential alternatives until only feasible ones remain. The addition constraints are also used in flowsheet refinement discussed in the following section.
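The four interaction types differ only in what triggers them and what they conclude. The sketch below, in Python, encodes that reading and propagates the constraints until nothing changes. It is an illustration only: the table `EFFECT`, the function `propagate`, and the choice to update only undecided operations are assumptions, and it is not the Object-Constraint Network algorithm itself.

```python
# (trigger status of the first operation, resulting status of the second)
EFFECT = {
    "negation": ("selected", "rejected"),
    "elimination": ("rejected", "rejected"),
    "addition": ("selected", "selected"),
    "reinforcement": ("rejected", "selected"),
}

# The four example constraints from the text, as (type, first, second).
CONSTRAINTS = [
    ("negation", "thermal sterilization", "chemical sterilization"),
    ("elimination", "centrifugation", "sedimentation"),
    ("addition", "precipitation", "filtration"),
    ("reinforcement", "airlift bioreactor", "mechanical agitation"),
]


def propagate(status, constraints):
    """Apply constraints repeatedly until no undecided operation changes."""
    status = dict(status)  # do not mutate the caller's dictionary
    changed = True
    while changed:
        changed = False
        for kind, first, second in constraints:
            trigger, result = EFFECT[kind]
            # Assumption: only operations without a decision are updated.
            if status.get(first) == trigger and status.get(second) is None:
                status[second] = result
                changed = True
    return status
```

Starting from a rejected airlift bioreactor, for example, propagation marks mechanical agitation as selected.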

3.5. REFINEMENT OF A FLOWSHEET

Refining the flowsheet is necessary if the solution of the unit operation selection problem causes interactions in the flowsheet structure. The refinement constraints arise both from the selection of unit operations and their sequencing and as such an interaction is at times unavoidable. However, a significant computational advantage is gained if the refinement constraints can be postponed until after the synthesis and selection problems are solved successfully.

Table 2 lists some of the examples of refinement constraints. The refinement constraints add operations to process input streams, output streams or between process steps. These operations are usually different from the unit operations shown in Figure 6 and make improvements to the process steps.

3.6. KNOWLEDGE-BASED SYSTEM PROTOTYPING AND TESTING

The bioprocess design problem solving described so far has been implemented in Common Lisp on a VAXstation 3100. To test the knowledge-based system, a design problem reported in the literature (Petrides et al., 1989) has been chosen. The goals for testing are as follows:

• To carry out synthesis and selection using the approach described for an rDNA protein product, to generate the same flowsheet reported in the literature, and to test whether enough correct knowledge has been given to the system. The case study is described in the next section.

• To find whether there are other feasible flowsheets for the same rDNA product. Intuitively, there should be more than one solution for the process synthesis problem.

The knowledge-based system generated 12 flowsheets for the recombinant protein studied. Figure 7 shows the flowsheet generated by the system that is very close to the one reported by Petrides et al. (1989). We successfully accomplished both goals, as the system generated 12 potentially feasible flowsheets for making the same rDNA product. However, additional simulation and optimization studies are required to confirm the feasibility of these flowsheets.

Table 2. Some examples of refinement constraints used in bioprocess synthesis

If a large-volume fermenter is required for the process, then an inoculum fermentor should be added upstream with the output of the inoculum fermenter flowing into the input of the production fermenter.

If the sterilization and fermentation have to be operated in batch or semi-batch mode and downstream operations are operated in continuous mode, then a holding tank is required at the interface between (semi) batch and continuous operations.

If refolding is selected, then diafiltration for salt removal is selected.

If chromatography is selected, then ultrafiltration to concentrate the feed is selected.

For efficient ultrafiltration any aggregate protein in the feed stream should be removed by filtration first.
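Two of the refinement constraints in Table 2 (the inoculum fermentor and the ultrafiltration feed concentration) can be sketched as insertions into a flowsheet. The Python below assumes a flowsheet is a simple ordered list of operation names, which is a simplification made for this example; the functions `insert_before` and `refine` are hypothetical.

```python
def insert_before(flowsheet, target, new_operation):
    """Return a copy of the flowsheet with new_operation just upstream of target."""
    i = flowsheet.index(target)
    return flowsheet[:i] + [new_operation] + flowsheet[i:]


def refine(flowsheet, large_volume_fermenter):
    """Apply two Table 2 refinement constraints to a linear flowsheet."""
    refined = list(flowsheet)
    # Large-volume fermenter: add an inoculum fermentor upstream of it.
    if large_volume_fermenter and "Fermentation" in refined:
        refined = insert_before(refined, "Fermentation", "Inoculum Fermentor")
    # Chromatography: concentrate its feed by ultrafiltration first.
    if "Chromatography" in refined:
        refined = insert_before(refined, "Chromatography", "Ultrafiltration")
    return refined
```

The input list is left unchanged, so the unrefined flowsheet remains available for comparison.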

3.7. CASE STUDY

To test the knowledge-based system, a design problem reported in the literature (Petrides et al., 1989) has been presented to SELECTRIX (see Figure 8 for problem statement). To solve the design problem, SELECTRIX will first choose the necessary processing steps for an intracellular insoluble product from the process constraint diagram (Figure 4).

The preliminary selection of unit operations for each of these process steps is done next by applying the establish-refine strategy on the classification hierarchy shown in Figure 6. The result of preliminary selection, at the end of establish-refine problem solving, is shown in Figure 9. The boxes in Figure 9 represent the processes selected using the process constraint diagram (Figure 4). The entries below the boxes are the selected unit operations for the process steps, which are also the tip nodes in the hierarchy of Figure 6. The preliminary selection gives rise to 24 potential process flowsheets out of a total of 17,254 possible flowsheets. The total possible number of process flowsheets is calculated by considering all the combinations of unit operations (tip nodes in Figure 6) connected with one another in a design requiring all the process steps (Figure 4).

Figure 7: Final refinement of flowsheet # 1 in Figure 11 using refinement constraints in Figure 12 (labels recoverable from the diagram: Feed, Thermal Sterilization, Mech. Agitation Fermentor, Homogenization, Centrifugation, Centrifugation, Solubilization Tank, Refolding Tank, Ultrafiltration, Ion-Exchange Chromatography, Product)

Design Problem

A plant that can supply 30,000,000 pigs per year with pGH has to be designed. Since one dose is required per pig-life and the quantity of one dose is 200 mg, this corresponds to a production of purified pGH of 6,000 kg/yr. Assuming a recovery yield of 25%, the amount of pGH that must be produced by the fermentor is 24,000 kg/yr.

Design Basis

Market Supply: 30,000,000 pigs/yr
Number of doses per pig-life: 1
Amount of pGH per dose: 200 mg
Production of purified pGH: 6,000 kg/yr
Recovery Yield: 0.25
Production of pGH by the fermentor: 24,000 kg/yr

(pGH is a porcine growth hormone which is manufactured from E. coli modified by recombinant DNA techniques)

Figure 8: Design problem statement

During critical selection SELECTRIX creates OCNs for each process step to evaluate the relational constraints between the objects shown in Figure 10. For example, a reinforcement type relational constraint is set up between the nodes representing thermal sterilization and membrane sterilization for the OCN corresponding to the sterilization process step. The knowledge used in setting up the reinforcement constraint is stated as: If thermal sterilization is not possible then use membrane sterilization. This is interpreted as: If thermal sterilization is rejected then membrane sterilization is selected, which is a reinforcement type relationship between the two objects. These OCNs are processed using the consistency check algorithms.

At the end of critical selection several unit operations selected by preliminary selection task are rejected. For example, air lift fermentation is rejected because of the negation type constraint that says: for a medium scale fermentation mechanical agitation is preferable to air lift fermentation.

Finally, about 12 alternatives remain for the pGH protein processes. Three of these processes are shown in Figure 11. The flowsheet shown in Figure 7 is a refinement of flowsheet #1 in Figure 11(a) after applying the refinement constraints shown in Figure 12. It is to be noted that the refined flowsheet in Figure 7 is the same as the one proposed by Petrides et al. (1989) based on pilot plant studies.

For discovery purposes a novel flowsheet is all that is necessary. The discovered flowsheet can be further refined by adding synthesis constraints on processing considerations.

Figure 9: Results of Preliminary Selection for rDNA Protein Synthesis

Process Step        Selected Unit Operations
Sterilization       Thermal Sterilization, Membrane Sterilization
Fermentation        Mechanical Agitation, Air Lift
Cell Separation     Centrifugation
Cell Disruption     Homogenization
Product Recovery    Centrifugation
Solubilization
Refolding
Concentration       Ultrafiltration
Purification        Ion-Exchange Chromatography, Gel Permeation Chromatography, Affinity Chromatography

(At this stage, the system has finished hierarchical classification and has ruled out about 17,232 process flowsheets. There are still 24 process flowsheets to be considered.)

Figure 10: Object-Constraint Networks (OCN) for Selection of Alternative Unit Operations

Sterilization OCN (nodes: Thermal Sterilization, Membrane Sterilization)
Reinforcement: If thermal sterilization is rejected then membrane sterilization is selected.
Negation: If thermal or membrane sterilization is selected then chemical sterilization is rejected.

Cell Separation OCN (nodes: Centrifugation, Microfiltration)
Reinforcement: If centrifugation is rejected then microfiltration is selected.
Elimination: If centrifugation or microfiltration is rejected then sedimentation is rejected.

Fermentation OCN (nodes: Mechanical Agitation, Airlift)
Reinforcement: For a medium scale fermentation, if airlift fermentation is rejected then mechanical agitation is selected.
Negation: If mechanical agitation fermentation is selected then airlift fermentation is rejected.

Figure 11: Alternative flowsheets for case study (three flowsheets; legible labels include Membrane Sterilization, Mech. Agitation Fermentor, Centrifugation, Homogenization, Solubilization Tank, Refolding Tank, Ultrafiltration, Product)

Refinement Constraints for rDNA Process Flowsheet

• Inoculum Fermentor Constraint: for large production fermentor
• Holding Tank Constraint: for upstream-downstream decoupling
• Refolding-Solubilization Constraint: salt removal for efficient refolding
• Ultrafiltration-Refolding Constraint: aggregate protein removal for efficient ultrafiltration
• Ion-Exchange-Gel Permeation Constraint: for achieving desired purification

Figure 12: Refinement Constraints for rDNA Flowsheet#1 in Figure 11

Determining whether a unit operation needs extra processes, say refining a fermentor by adding feed pre-heaters and a pH control system, requires information about specific process conditions (temperature of feed, pH of feed, fermentation kinetics, etc.), in addition to selection and synthesis knowledge of the unit operations. Since SELECTRIX is not equipped with process models, it cannot generate such information. Using data from pilot plant testing and process simulation, the flowsheets generated by SELECTRIX can be further improved.

3.8. COMPLEXITY OF BIOPROCESS SYNTHESIS PROBLEM SOLVING

If for a given product P all four of the Cell Separation unit operations are applicable, then there will be at least four different process flowsheets for making that product. Similarly, for Product Recovery at least five different unit operations are available. If for product P all five unit operations are applicable then we now have 4 (Cell Separation unit operations) x 5 (Product Recovery unit operations) = 20 different flowsheets. From Figures 4 and 6, if P is an extracellular protein then there are 432 different flowsheet alternatives, and if P is an intracellular and soluble protein then there are 15,120 different flowsheets. So, in general, if there are n process steps in making P and there are m unit operations available for each process step, then there will be m^n different process flowsheets for making the same product.
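The counting argument is a product over the process steps of the number of applicable unit operations. A small Python sketch follows; the function name `count_flowsheets` is an assumption introduced for illustration.

```python
from math import prod


def count_flowsheets(choices_per_step):
    """Number of flowsheets when each step offers the given number of choices."""
    return prod(choices_per_step)


# 4 Cell Separation choices x 5 Product Recovery choices, as in the text.
cell_separation_and_recovery = count_flowsheets([4, 5])

# With m choices at each of n steps the product reduces to m ** n.
m, n = 3, 4
uniform_case = count_flowsheets([m] * n)
```

Here `cell_separation_and_recovery` is 20, and `uniform_case` equals `m ** n`.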

It is illustrative to compare the above complexity analysis with that of two other analyses reported in the literature. The complexity of search in BioSep (Siletti, 1989), a knowledge-based system for protein process synthesis, in terms of the total number of potential designs to be explored by the system is given as $\prod_i q_i$, where $q_i$ is the number of unit operation choices for the $i$-th processing step. In the worst case the number of alternative designs explored by BioSep is 108. Basically, the knowledge-based approach leads to a lower complexity than any of the others. Asenjo et al. (1989) have stated that: “The overall downstream process synthesis problem in biotechnology does not have a strict combinatorial

nature. Only the high resolution purification stage within the purification subprocess, where

more than one high resolution purification steps and several alternatives in different order

combinations can be used, can lend itself to be solved as a combinatorial problem.”

Apparently, the magnitude of a combinatorial problem can be drastically reduced if more knowledge is available.

3.9. IMPROVEMENTS TO THE KNOWLEDGE-BASED SYSTEM

Besides integrating simulation capabilities into the system, there are other ways the knowledge base of the system can be improved significantly. This involves fine tuning of synthesis, selection and refinement constraints over and above the ones already represented

in the knowledge base. This is similar to stating that the expert gets better with experience

and gains more knowledge as a result.

Currently, the system does not have any knowledge about bioproducts. It has only knowledge about process steps and unit operations. For instance, there are several different alternative flowsheets available in the literature for the manufacture of ethanol. These processes differ from each other in both process layout and in starting materials. This demonstrates how diverse process synthesis is in practice.

A currently running industrial process has really evolved over decades of fine tuning by designers and plant engineers and thus, it seems impossible to instruct a system how to evolve a preliminary flowsheet so that it satisfies industrial standards. Instead of trying to add more constraints to the existing knowledge base, it seems to us, the better approach is to provide flowsheets known for particular products to the system. These flowsheets can be

used as references in developing new flowsheets. Then the system can be equipped with

adaptation strategies that make incremental changes to an existing flowsheet just the way process plants are retrofitted periodically for taking advantage of current technologies for energy and raw material savings.

For the selection and synthesis problems described so far, Chapter 4 describes inductive learning of constraints; Chapter 5 presents a program for creating classification hierarchies; and Chapter 6 describes a case-based reasoning system as an alternative to the task-based framework.

CHAPTER IV

LEARNING PROCESS SYNTHESIS CONSTRAINTS AND GENERALIZING THEM INTO HEURISTICS

The selection and synthesis tasks described in Chapter 2 involve application of constraints.

How can one explain the various selection and synthesis constraints? What is their source?

How are they formulated? Similar questions can be raised about the heuristics for selection and synthesis.

The answers to these questions may lie in the way designers have learnt their process synthesis concepts. For instance, a designer might visit a penicillin production plant and notice that the fermentation products are first filtered, cooled and then purified in the downstream purification processes. The designer might state these observations as: “To carry out the downstream purification process efficiently first remove any oversized particles in the fermentation product. Also, if the fermentation process is generating heat, the product needs to be cooled before purifying.” Note that, ‘oversized particles’ and

‘fermentation process is generating heat’ are observable but are not usually represented in a flowsheet. The question then is, how can a computer program without sensory perception learn about constraints in the design of a flowsheet?

One way to generate constraints algorithmically is to use default logic, where it is assumed that the input is correct. For example, when a program is presented with a flowsheet

containing crystallization for purifying penicillin product, it proposes the following selection constraint:

“If the product is penicillin then Crystallization is selected for Purification step.”

The program can also annotate the constraint with the appropriate reference such as:

Reference: “Penicillin purification process of Gist-Brocades (From Hersbach, 1988).”

The program generates a constraint on synthesis after noting that before crystallization (purification), there are carbon treatment (purification) and carbon filtration (separation) operations as:

“If the product is penicillin & the purification step is crystallization then the input to crystallization is processed first in carbon treatment and next in carbon filtration”

In a refinement constraint the program proposes a process step between two steps such as:

“If the product is penicillin & the purification step is crystallization & the first concentration step is filtration & the second concentration step is drying then the output of first concentration step (filtration) is washed before feeding to the second concentration step (drying)”

Reference: “Penicillin purification process of Gist-Brocades (From Hersbach, 1988).”

In the rest of this chapter an inductive approach for learning process synthesis constraints is presented. This method resembles the exemplar-based inductive approach to machine learning. However, the method differs by using only positive exemplars for learning. This is a valid assumption because the input is a flowsheet that works in the real world.

4.1. REPRESENTATION OF PROCESS SYNTHESIS CONSTRAINTS

There are four basic types of constraints in the process synthesis problem solving described in Chapter 2:

• Selection Constraints

If product is P & process step is Q

Then unit operation R is selected to carry out Q

• Synthesis Constraints

If product is P & process step is Q

Then process step R is at the input/output of process step Q

• Interaction Constraints

If product is P & unit operation Q is selected/rejected

Then unit operation R is selected/rejected

• Refinement Constraints

If product is P & process steps Q and R are adjacent

Then process step S is created between Q and R

Here a unit operation is a subtype of process step. So in the above rules these terms are interchangeable for either generality (if ‘process step’ is used) or specificity (if ‘unit operation’ is used). Also in a refinement constraint, when process steps Q and R are adjacent, at least one output of Q is connected with at least one input of R. And when process step S is created between process steps Q and R, the output of Q is connected with S and the input of R is connected with S.

An alternative representation for each of these constraints, without the ‘if-then’ semantics, is preferable so that a pattern recognition algorithm can operate on these representations easily. So the redundant ‘if-then’ semantics are eliminated. In this alternative representation, the prefix “p” (for process steps) and the prefix “u” (for unit operations) will be used.

The following n-tuple, containing a list of the type of constraint, product (P), process step (Q) and unit operation (R), is equivalent to the rule form of a selection constraint:

<selection, P, p-Q, u-R>

In the following n-tuple an interaction constraint between process-steps/unit-operations Q and R for a product P has been specified for the four different interactions already defined.

<{addition, negation, reinforcement, elimination}, P, p/u-Q, p/u-R>

For example, <negation, P, u-Q, u-R> means: if unit operation Q is selected then unit operation R is rejected.

The n-tuple specifying a synthesis constraint for a product P and adjacent process-steps/unit-operations Q and R is:

<synthesis, P, p/u-Q, p/u-R>

Note that in the above representation the connectivity is implicit in the definition of adjacent processes—the output of Q is connected with the input of R. Thus the n-tuples provide an efficient way of encoding appropriate information used for formulating various constraints.
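As an illustration of how such n-tuples can stand in for if-then rules, the Python sketch below stores a constraint as a four-field tuple and expands it back into rule text. The field layout (type, product, first item, second item) is modeled on the interaction n-tuple given above; the names `Constraint` and `to_rule` and the exact rule wording are assumptions.

```python
from typing import NamedTuple


class Constraint(NamedTuple):
    kind: str     # e.g. "selection", "synthesis", "negation"
    product: str
    first: str    # "p-..." for a process step, "u-..." for a unit operation
    second: str


def to_rule(c):
    """Expand an n-tuple into the if-then wording it replaces."""
    if c.kind == "selection":
        return (f"If product is {c.product} & process step is {c.first} "
                f"then unit operation {c.second} is selected")
    if c.kind == "negation":
        return (f"If product is {c.product} & {c.first} is selected "
                f"then {c.second} is rejected")
    if c.kind == "synthesis":
        return (f"If product is {c.product} "
                f"then the output of {c.first} is the input of {c.second}")
    raise ValueError(f"no rule template for constraint type {c.kind!r}")
```

Because a `NamedTuple` is still a plain tuple, such constraints can be compared and stored in sets, which suits pattern matching over a knowledge base.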

Sometimes it may be necessary to include additional information in the constraints like substance, composition, process conditions, temperature, pressure, etc. For example, two process steps are connected only when a particular substance has to flow between them. In such cases we just extend the n-tuple as:

A substance can exist in four different states: gas (g), liquid (l), solid (s) and ionic (i) (there could be some exceptions, such as plasma and elementary particles, that do not fall into this four-level substance classification). Also, a substance can be pure or a mixture. For a mixture, the various components will have compositions expressed in some units like mole fraction, weight percent, mole percent, etc. So in general, a substance can be represented as another n-tuple containing state and composition information.

For example, to represent a mixture A of 30% by weight cells (s), 69% by weight water (l) and 1% by weight carbon-dioxide (g):

Or a pure substance, say oxygen (g):

Similarly, the temperature and pressure information of a unit operation R is represented as:

Some remarks are necessary regarding the appropriate substance and process condition

information to be represented in constraints.

• The substances represented are only those transformed by the unit operations or process steps. Sometimes unit operations may involve substances that are not part of the raw-material or product specification of the synthesis problem. For instance, an absorber may transform the concentration of carbon-dioxide in an input gas mixture using mono-ethanol-amine as a solvent. In such a case, the process synthesis problem solving does not involve synthesizing mono-ethanol-amine unless it has been stated as a separate goal. Similarly, synthesis of catalysts for reactions, chemicals used in extraction, precipitation, etc. are not part of the process synthesis problem solving.

• The state of a substance becomes important if the unit operations concerned can operate only on an input whose state is a combination of gas, solid, liquid or ionic. Certain unit operations transform substances from one state to multiple states, such as an expansion valve whose output typically contains the same substance in both gas and liquid states. In such processes, each output is restricted to a single state. So the representation of an expansion valve will have two outputs: one in the liquid state and the other in the gas state.

• In addition to chemical substances, unit operations may utilize utilities like steam, water, electricity, etc. that are assumed to be provided with the unit operations. The processes being created need not generate utilities.

• These constraints are only meant to give some idea about the considerations that go into proposing process flowsheets. So the aforementioned examples are only some of the obvious ones. Also, the proposed representation for constraints should be interpreted flexibly rather than rigidly, so as to accommodate any novel constraints that may occur in a particular synthesis problem. For instance, in a particular process if the color of substance A has to be ‘red,’ then we may represent it analogous to the composition as:

With this background, an algorithm to learn the selection, synthesis, refinement and interaction constraints is described in the following section. For the sake of simplicity in presentation, only the unit operations and their connections are considered. One can extend the algorithm easily to incorporate constraints on substances, compositions and process conditions by providing the appropriate information in the flowsheet, beside processes and their connections.

4.2. THE CONSTRAINT LEARNING ALGORITHM

The input to the constraint learning algorithm is a flowsheet. The output from the learning algorithm is a set of constraints in the form of n-tuples. The algorithm conducts a search over the entire flowsheet and generates constraints automatically. While the search is exhaustive, constraint generation is heuristic. The algorithm can be shown to be tractable. In general, if a flowsheet has n operations, then the different types of constraints generated from the flowsheet are as follows:

Maximum number of selection constraints generated = n

Maximum number of interaction constraints generated = k * n (where k is the average number of alternative unit operations for each process step)

Maximum number of synthesis constraints generated = n-1

Maximum number of refinement constraints generated = n-2

The methods involved in generating the various constraints will be described now.

4.2.1. Generating Selection Constraints

If unit operation P is present in the flowsheet then create a selection constraint by looking up the corresponding process step Q in the hierarchy in Figure 6 as:

If unit operation P is present in the flowsheet but there is no process step corresponding to P, then ask the user for the process step of P.

4.2.2. Generating Interaction Constraints

If unit operation P for process step Q is present in the flowsheet then do the following:

1. Find all the available unit operations S (from the classification hierarchy) other than P for process step Q

2. For each unit operation T in S do the following:

2.(a) If T is present in the flowsheet then create an addition constraint as:

And remove any constraints of the form or

from the knowledge base.

2.(b) If T is not present in the flowsheet and there is no constraint in the knowledge base as:

or then create a negation constraint as:

4.2.3. Generating Synthesis Constraints

If unit operation P, carrying out process step Q, is adjacent to unit operation R, carrying out process step S, such that the output of Q is flowing into the input of S, then create a synthesis constraint as:

If either of the process steps are unknown then create a synthesis constraint as any one of the following: when p-Q & p-S are unknown: when p-S is unknown: when p-Q is unknown:

4.2.4. Generating Refinement Constraints

If unit operation P, carrying out process step Q, is present between unit operations R (process step S) and T (process step U) then create a refinement constraint as:

If one or more process steps are unknown then create one of the following constraints:

unknown p-R:

unknown p-S:

unknown p-T:

unknown p-R and p-S:

unknown p-R and p-T:

unknown p-S and p-T:

unknown p-S, p-R and p-T:
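The four generation methods of Section 4.2 can be sketched together for the simplest case of a linear flowsheet. The Python below is an illustration under several assumptions: the flowsheet is an ordered list of unit operations, every unit operation has a known process step, the knowledge-base checks and removals of step 2 and the unknown-step variants are omitted, and the tuple layouts (including the order upstream, downstream, inserted for refinement) are hypothetical. The `HIERARCHY` shown is only a small excerpt in the spirit of Figure 6.

```python
# Excerpt of a classification hierarchy: process step -> unit operations.
HIERARCHY = {
    "Cell Separation": ["Centrifugation", "Microfiltration", "Filtration", "Sedimentation"],
    "Cell Disruption": ["Homogenization", "Ball Milling"],
    "Concentration": ["Ultrafiltration", "Evaporation"],
}
STEP_OF = {unit: step for step, units in HIERARCHY.items() for unit in units}


def learn_constraints(product, flowsheet):
    """Generate constraint tuples from one linear flowsheet (a sketch)."""
    constraints = []
    for unit in flowsheet:
        step = STEP_OF[unit]
        # Selection: this unit operation carries out its process step.
        constraints.append(("selection", product, f"p-{step}", f"u-{unit}"))
        # Interaction: compare with the alternatives for the same step.
        for other in HIERARCHY[step]:
            if other != unit:
                kind = "addition" if other in flowsheet else "negation"
                constraints.append((kind, product, f"u-{unit}", f"u-{other}"))
    # Synthesis: one constraint per adjacent pair (at most n-1).
    for a, b in zip(flowsheet, flowsheet[1:]):
        constraints.append(("synthesis", product, f"u-{a}", f"u-{b}"))
    # Refinement: one constraint per interior operation (at most n-2).
    for a, b, c in zip(flowsheet, flowsheet[1:], flowsheet[2:]):
        constraints.append(("refinement", product, f"u-{a}", f"u-{c}", f"u-{b}"))
    return constraints
```

For a flowsheet with n operations this produces n selection, n-1 synthesis and n-2 refinement tuples, matching the upper bounds stated at the start of Section 4.2.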

4.3. LEARNING HEURISTICS FROM CONSTRAINTS

Process design heuristics are based on the generalization of many observations (King, 1974, p. 17). The generalization proceeds by inductive logic which is ‘reasoning to a conclusion

about all members of a class from examination of only a few members of the class.’ So the generalization involves going from the particular to the general. The constraint generation methods described earlier, are based on a particular flowsheet only. Generalization methods to generate heuristics from selection, interaction, synthesis and refinement constraints are described now.

4.3.1. Generalizing Selection Constraints for a Product

Suppose there are n different flowsheets to make a product P in the knowledge base. Also, suppose there are m < n selection-constraint n-tuples containing process step R and unit operation S as:

flowsheet#1: <selection, product-P, p-R, u-S>

flowsheet#2: <selection, product-P, p-R, u-S>

flowsheet#m: <selection, product-P, p-R, u-S>

Then we can replace all these n-tuples with one n-tuple that says ‘In m out of n flowsheets unit operation S can be selected for process step R,’ as:

«selection, product-P, p-R, u-S, m:n» where « » represent a heuristic and m:n means m out of n flowsheets.

For explanatory purposes, it is better to retain the constraints as well as the heuristics in the knowledge base; and a pointer from a heuristic to the constraints used to generate the heuristic serves as justification.

We can extend this reasoning to a class of products as well. Suppose we have several selection heuristics, for different protein products P1, P2, P3, etc., as:

«selection, P1, p-R, u-S, m1:n1»

«selection, P2, p-R, u-S, m2:n2»

«selection, P3, p-R, u-S, m3:n3»

«selection, PX, p-R, u-S, mx:nx»

We can replace these X n-tuples with one n-tuple as:

«selection, protein, p-R, u-S, mi:ni» where mi:ni = (m1/n1 + m2/n2 + m3/n3 ... + mx/nx)/X.
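The two generalization steps above can be sketched in Python as follows. This is only an illustrative sketch: the tuple layout and the function names `generalize_for_product` and `generalize_for_class` are assumptions made here, not part of the program described in this chapter.

```python
from collections import Counter
from fractions import Fraction


def generalize_for_product(constraints, n_flowsheets):
    """Collapse per-flowsheet selection constraints into m:n heuristics.

    constraints: iterable of (flowsheet_id, product, step, unit) tuples,
    a hypothetical encoding of the selection-constraint n-tuples.
    Returns {(product, step, unit): (m, n)}.
    """
    # Count each (product, step, unit) at most once per flowsheet.
    seen = {(f, p, s, u) for f, p, s, u in constraints}
    counts = Counter((p, s, u) for _, p, s, u in seen)
    return {key: (m, n_flowsheets) for key, m in counts.items()}


def generalize_for_class(ratios):
    """Average the ratios of X products: (m1/n1 + ... + mX/nX) / X."""
    fractions = [Fraction(m, n) for m, n in ratios]
    return sum(fractions) / len(fractions)
```

Exact fractions are used so that the class-level rating is not affected by floating-point rounding.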

4.3.2. Generalizing Synthesis and Refinement Constraints

The procedure for generalizing the synthesis and refinement constraints, for a product P and for a class of products, is exactly the same as for selection constraints.

4.3.3. Generalizing Interaction Constraints

The learning algorithm generates two types of interaction constraints: addition and negation.

The generalization of these two types of constraints produces heuristics of the form:

«addition, product-p1, u-Q, u-R, m1»

«negation, product-p1, u-S, u-T, m2»

when there are m1 addition constraints and m2 negation constraints.

By adapting the cross-over technique of genetic algorithms (see Appendix C), the reinforcement and elimination heuristics can be generated from the addition and negation heuristics. The reinforcement heuristics are created by:

(i) reversing the negation heuristics, e.g.: «negation, product-p1, u-S, u-T, m2» ==> «reinforcement, product-p1, u-T, u-S, m2»

(ii) crossing-over addition and negation heuristics, e.g.: «addition, product-p1, u-X, u-Y, k1» cross-over-with «negation, product-p1, u-X, u-Z, k2» ==> «reinforcement, u-Z, u-Y, (k1 + k2)», if there is no elimination heuristic between u-Z and u-Y in the knowledge base.

By crossing-over negation heuristics of the form:

«negation, product-p1, u-S, u-X, k1» cross-over-with «negation, product-p1, u-S, u-Y, k2»,

an elimination heuristic can be created as:

«elimination, product-p1, u-X, u-Y, (k1 + k2)»,

if there is no negation or reinforcement heuristic between u-X and u-Y in the knowledge base.

The elimination constraints occur when two unit operations are rejected; so, for the elimination constraint to be created, there should be no negation or reinforcement constraints relating the two unit operations in the knowledge-base, such as «negation, product-p1, u-X, u-Y, kn».
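The cross-over rules described above can be sketched as follows. The 5-tuple encoding `(kind, product, unit1, unit2, count)` and all function names are hypothetical. Two further assumptions are made for the sketch: "between" two unit operations is read as order-insensitive, and the product is carried into the reinforcement heuristic of rule (ii).

```python
def related(kb, kinds, a, b):
    """True if kb holds a heuristic of one of `kinds` linking units a and b."""
    return any(h[0] in kinds and {h[2], h[3]} == {a, b} for h in kb)


def reverse_negation(neg):
    """Rule (i): reverse a negation heuristic into a reinforcement."""
    _, product, s, t, m = neg
    return ("reinforcement", product, t, s, m)


def cross_addition_negation(add, neg, kb):
    """Rule (ii): addition(X, Y) crossed with negation(X, Z)."""
    _, product, x1, y, k1 = add
    _, product2, x2, z, k2 = neg
    if product != product2 or x1 != x2:
        return None  # the two heuristics do not share u-X
    if related(kb, {"elimination"}, z, y):
        return None  # blocked by an existing elimination heuristic
    return ("reinforcement", product, z, y, k1 + k2)


def cross_negations(neg1, neg2, kb):
    """negation(S, X) crossed with negation(S, Y) gives elimination(X, Y)."""
    _, product, s1, x, k1 = neg1
    _, product2, s2, y, k2 = neg2
    if product != product2 or s1 != s2 or x == y:
        return None
    if related(kb, {"negation", "reinforcement"}, x, y):
        return None  # blocked by an existing negation or reinforcement
    return ("elimination", product, x, y, k1 + k2)
```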

The generalization across different products is possible in the same way as for selection constraints. For generalization of addition heuristics between u-Q and u-R, suppose that for X different protein products P1, P2, P3, ..., PX, the addition heuristics are as follows:

«addition, P1, u-Q, u-R, y1:n»

«addition, P2, u-Q, u-R, y2:n»

«addition, P3, u-Q, u-R, y3:n»

«addition, PX, u-Q, u-R, yx:n»

The addition heuristic to replace all of the above is:

«addition, protein, u-Q, u-R, z:n» where z:n = (y1/n + y2/n + y3/n ... + yx/n)/X

The negation, reinforcement and elimination heuristics are generalized similarly.

4.4. VALIDATION OF HEURISTICS

The learning algorithm partially validates heuristics by the numerical rating assigned to each heuristic based on the frequency of occurrence in the flowsheets. This is only a partial validation because the frequency of occurrence measure does not take into account subjective judgments involved in specifying heuristics. For instance, consider the following n-tuples generated algorithmically:

«selection, pGH, fermentation, mechanical-agitation-fermentor, 3:5»

«selection, pGH, fermentation, air-lift-fermentor, 2:5»

These heuristics imply that a mechanical agitation fermenter was selected, in preference to an air lift fermenter, for carrying out the fermentation process step to produce the pGH product. However, the flowsheets used as input to the program may be from different time periods. If flowsheets containing the mechanical agitation fermenter are, say, a decade older than those containing the air lift fermenter, then it is more likely that air lift fermentation is preferable under current economic conditions. Thus, there could be criteria extraneous to the process synthesis problem solving described here, such as cost and maintainability, that the program did not consider in generating the heuristics.

4.5. APPLICATIONS OF THE INDUCTIVE LEARNING ALGORITHM

Process designers can derive significant advantages from the inductive learning program. It is made possible by a set of queries which the designers can ask via the program interface. The program interface interprets each query and applies the necessary algorithm to answer the query. The syntax of these queries and their processing is described here.

4.5.1. Can unit operation be used for process step to make product ?

A specific query of the above type is, for example, “Can unit operation crystallization be used for process step purification to make product penicillin?” To answer the query, the algorithm searches in the constraint knowledge-base for the necessary inductive evidence to relate the three concepts: u-crystallization, p-purification and product-penicillin. If there is enough evidence to relate the three concepts as:

«selection, penicillin, purification, crystallization, 3:5»,

then the query will be answered positively.

4.5.2. What is the most frequently used unit operation for process to make product ?

This query is a slight variation of the query in Section 4.5.1: the unit operation is not provided in the query.

The program is required to find all the unit operations used for process step to make product , and rank them according to their frequency of occurrence in the constraint knowledge-base. For example, if the program is asked, “What is the most frequently used unit operation for process to make product ?,” then it may generate the following constraints ranked by their frequency of occurrence in the constraint knowledge-base.

«selection, ethanol, separation, centrifugation, 5:10»

«selection, ethanol, separation, precipitation, 2:10»

«selection, ethanol, separation, filtration, 2:10»

«selection, ethanol, separation, membrane, 1:10»

The answer to the query then will be: ‘centrifugation.’
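A minimal sketch of how the two queries above might be answered from a list of selection heuristics. The 5-tuple encoding `(product, step, unit, m, n)`, the function names, and the 0.5 threshold standing in for "enough evidence" are all assumptions of this sketch; the text does not specify how much evidence is enough.

```python
def can_use(heuristics, product, step, unit, threshold=0.5):
    """Answer 'Can unit operation U be used for process step S to make P?'.

    heuristics: list of (product, step, unit, m, n) selection heuristics.
    threshold: hypothetical minimum m/n ratio counted as enough evidence.
    """
    for p, s, u, m, n in heuristics:
        if (p, s, u) == (product, step, unit) and m / n >= threshold:
            return True
    return False


def most_frequent_unit(heuristics, product, step):
    """Return the unit operation with the highest m/n for (product, step)."""
    candidates = [h for h in heuristics if h[0] == product and h[1] == step]
    if not candidates:
        return None
    best = max(candidates, key=lambda h: h[3] / h[4])
    return best[2]


heuristics = [
    ("ethanol", "separation", "centrifugation", 5, 10),
    ("ethanol", "separation", "precipitation", 2, 10),
    ("ethanol", "separation", "filtration", 2, 10),
    ("ethanol", "separation", "membrane", 1, 10),
    ("penicillin", "purification", "crystallization", 3, 5),
]
```

With the heuristics listed in the text, `most_frequent_unit(heuristics, "ethanol", "separation")` selects centrifugation.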

The following queries differ from the above in that answering them involves discovering arithmetic relationships among the variables mentioned in the queries.

4.5.3. What is the arithmetic relationship between variable and variable ?

The program is expected to answer the query using one of the following arithmetic relationships:

v1 is less than (<) v2

v1 is greater than (>) v2

v1 is less than or equal to (<=) v2

v1 is greater than or equal to (>=) v2

v1 is equal to (=) v2

Suppose numerical constraints for the inlet temperature of the hot fluid (TH) and the outlet temperature of the cold fluid (TC) of heat exchangers are available in the flowsheets as:

After processing the above data, the program formulates a valid arithmetic relationship between the two variables (by simply finding the largest TH and the largest TC to conclude that the largest TH is larger than the largest TC) as:

TH > TC

Suppose an extra constraint is presented as: , the program revises the previously formulated arithmetic relationship as:

TH >= TC

Evidently, the computation here is very simple. However, answering the query may require processing large numbers of data. Sometimes the query may be open-ended, by asking the program to find a previously unknown relationship between two variables.
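One way such a relationship could be induced is sketched below. Note that this sketch compares the two variables observation by observation, which is a substitute for, not a reproduction of, the largest-value comparison mentioned in the text. The function name and any temperature values passed to it are hypothetical.

```python
def infer_relation(pairs):
    """Induce an arithmetic relationship between v1 and v2.

    pairs: list of (v1, v2) observations, e.g. (TH, TC) per heat exchanger.
    Returns one of '<', '>', '<=', '>=', '=' or None if no single
    relationship holds for every observation.
    """
    lt = any(a < b for a, b in pairs)
    gt = any(a > b for a, b in pairs)
    eq = any(a == b for a, b in pairs)
    if lt and gt:
        return None  # observations contradict each other
    if gt:
        return ">=" if eq else ">"
    if lt:
        return "<=" if eq else "<"
    return "=" if eq else None
```

Adding an observation in which the two values are equal relaxes a previously induced `>` to `>=`, which mirrors the revision step described above.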

4.5.4. Is there a precondition for processing feed from process to process ?

In the introduction of this chapter an example of a heuristic is given as: “If the fermentation process produces heat then the product should be cooled before purification step.” This heuristic can be generated by asking the program: “Is there a temperature precondition for processing feed from fermentation to purification?” Suppose the constraint knowledge-base has the following constraints relating feed temperature (TF) to purification from fermentation:

After doing induction on the temperature data, the program may conclude that: “If the feed temperature to purification from fermentation is greater than 180 degrees F, then the feed has to be preprocessed before purification in a cooler.”

4.6. SUMMARY

An algorithm to automatically acquire the various constraints for selection, synthesis, interaction and refinement of process flowsheets is described. Although the algorithm is not fully implemented, by its functional specification it will be able to generalize the constraints it has acquired in the constraint knowledge base into useful heuristics. The generalization process is based on induction. To enable interaction of human designers with the program, a query-based interface is also described but not yet implemented. Using the queries, designers can conveniently ask about the constraints and heuristics acquired by the program. The constraint representation and the query-based interface, illustrated by examples, are flexible enough for alteration and extension.

CHAPTER V

LEARNING CLASSIFICATION HIERARCHIES

5.1. INTRODUCTION

In Chapter 2, hierarchical classification was used for the selection task of process synthesis problem solving. The various selection categories—process steps and unit operations— have been arranged in a classification hierarchy where the most general categories are at the top or leftmost and the most specific categories are at the bottom or rightmost. The primary advantage of hierarchical classification was shown to be the pruning ability using the establish-refine algorithm. That is, if a general category cannot be selected then all of its children need not be considered for selection any further.

The various classes in a classification hierarchy have the following basic relationships between them: “type-subtype,” “system-subsystem,” “function-subfunction,”

“part-subpart,” etc. The classification relationships are due to domain specific knowledge such as— “A vacuum pump is a type of reciprocating pump,” “The coolant pump is a subsystem of the reactor,” “A malfunction in the reactor suggests a malfunction in the coolant pump,” “A disk drive is a part of a personal computer,” etc.

Domain theory plays a major role in the representation of concepts as a classification hierarchy. Hence, generation of classification hierarchies automatically using a program is so far feasible only with the help of domain-specific models. For instance, in their Functional Representation System, Chandrasekaran and Sembugamoorthy (1986) have utilized a ‘deep model’ to represent a device and used the deep model as a basis for creating a malfunction hierarchy of the device faults. The deep model of a given device can be viewed as domain theory about the structure, function and behavior of that device. The classification hierarchy compiled from such a deep model can be viewed as an emergent hierarchicalization of “function-subfunction,” “part-subpart” and “system-subsystem” relationships represented implicitly in the deep model.

While the deep model approach pioneers the application of domain theory to learning hierarchical classification, it also encompasses greater complexity than is usually required for generating classification hierarchies in practice. Human experts can seldom describe an adequate deep model to automate the construction of classification hierarchies. A strategy that works best for both the expert and the knowledge engineer is to first identify all the classes and then organize them in a classification hierarchy by trial-and-error. The role of the human expert is crucial here because of the domain specificity of the relationships among classes in the classification hierarchy. The resultant classification hierarchy may not be optimal, in terms of computational efficiency, although it is still, unquestionably, a better representation compared with a knowledge-base of unorganized rules.

Sometimes even a human expert cannot decide on the best classification hierarchy. This happens when the concepts cannot be categorized into any of the previously stated classification relationships such as “type-subtype,” “system-subsystem,” etc. because the necessary domain theory is lacking. The best course of action for a knowledge engineer in such a case is to adopt the least-computational-complexity approach. This leads to classification hierarchies with the least number of interactions among classes. The knowledge engineer’s strategy here complements domain theory, thus paving the way for the construction of acceptable classification hierarchies.

For situations where domain theory is lacking, some theorems without proof, alternatively called conjectures, are presented in this chapter for generating a classification tree. Based on these conjectures, a program called HC1 has been developed for generating classification hierarchies. The input to the program is composed of, necessarily, (i) a set of concepts that need to be classified, (ii) a set of all the necessary and sufficient evidences for ruling out each concept, and, optionally, (iii) a set of domain-specific relationships among the concepts. The output from the program is a hierarchical classification of the concepts with the property:

‘if a concept at a higher level is ruled out then all of its children at lower levels are ruled out.’

In the following sections, the axioms and conjectures are first presented. Next, an algorithm and a program for generating classification hierarchies are described. Finally, the various concepts are illustrated with a classification example drawn from the process domain.

5.2. AXIOM #1: EVIDENCE-CLASS TABLE

An evidence-class table contains entries of classes and the evidence for each of the classes.

For example, consider the following:

Table 3. Evidence-Class Table

Class   Evidence
        e1    e2    e3    e4    e5    e6
c1      T     T     T     T     T     T
c2      T     n/a   F     n/a   n/a   T
...

In Table 3, e1, e2, e3, e4, e5, e6, ... are evidence to be considered for establishing classes c1, c2, ...

The values of evidence are given by T = true, F = false, n/a = not applicable. Note that, except for n/a, the rest of the values are also used by the structured pattern matcher. The value n/a is used when it is required to explicitly show in the evidence-class table that a particular evidence, say ek, has no influence in establishing a class, say cj.

5.3. AXIOM#1.1

Every class in the evidence-class table has an evidence set.

5.4. AXIOM#1.2

An evidence set can be a null set.

5.5. AXIOM#2: EVIDENCE SET OF A CLASS

The evidence set of a class is composed of those evidences whose values in the evidence-class table are not equal to “n/a” and are positive.

For example, the evidence set of c1 in the above table is:

E(c1) = {e1=T, e2=T, e3=T, e4=T, e5=T, e6=T} or simply, E(c1) = {e1, e2, e3, e4, e5, e6}

Whereas the evidence set of c2 is:

E(c2) = {e1=T, e3=F, e6=T} (note that e2, e4, e5 are not members)

5.6. AXIOM#3: POSITIVE AND NEGATIVE EVIDENCE-SETS

A positive evidence set has elements each of which has a value equal to T. Similarly, each element of a negative evidence set has a value equal to F. Given an evidence set E(c2) = {e1=T, e3=F, e6=T}, it can be expressed as a positive evidence set Ep(c2) = {e1, e3', e6}, where e3' = ¬e3; the equivalent negative evidence set is En(c2) = {e1', e3, e6'}, where e1' = ¬e1 and e6' = ¬e6.
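The definitions in Axioms #2 and #3 can be illustrated with a short sketch. The dictionary encoding of the evidence-class table and the function names are assumptions made for illustration; a trailing prime (') marks a negated evidence, as in the text.

```python
# Evidence-class table for the classes c1 and c2 of Table 3.
TABLE = {
    "c1": {"e1": "T", "e2": "T", "e3": "T", "e4": "T", "e5": "T", "e6": "T"},
    "c2": {"e1": "T", "e2": "n/a", "e3": "F", "e4": "n/a", "e5": "n/a", "e6": "T"},
}


def evidence_set(row):
    """Keep only the evidences whose value is not 'n/a'."""
    return {e: v for e, v in row.items() if v != "n/a"}


def positive_set(ev):
    """Rewrite every element so that its value is T (e3=F becomes e3')."""
    return {e if v == "T" else e + "'" for e, v in ev.items()}


def negative_set(ev):
    """Rewrite every element so that its value is F (e1=T becomes e1')."""
    return {e if v == "F" else e + "'" for e, v in ev.items()}
```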

5.7. AXIOM#4

“Class proximity” is defined for two classes based on the number of common elements in their evidence sets:

class-proximity(A, B) = number of elements in E(A) ∩ E(B)

If class-proximity(A,B) is greater than class-proximity(A, C) then B is more likely to be the parent of A than C. Class proximity is a numerical value.
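A minimal sketch of the class proximity measure and its use in choosing a likely parent. The function names and the example evidence sets used to exercise it are hypothetical.

```python
def class_proximity(ev_a, ev_b):
    """Number of elements common to the two evidence sets."""
    return len(set(ev_a) & set(ev_b))


def most_likely_parent(child, candidates, evidence):
    """Pick the candidate with the highest class proximity to `child`.

    evidence: dict mapping class name -> evidence set.
    Ties are resolved in favour of the first candidate listed,
    which is an arbitrary choice made for this sketch.
    """
    return max(candidates,
               key=lambda c: class_proximity(evidence[child], evidence[c]))
```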

5.8. CONJECTURE#1: SUBCLASS RELATIONSHIP

If the evidence set of class X, E(X), is a superset of the evidence set of class Y, E(Y), then Y is a subclass of X.

5.9. CONJECTURE#2

Given that Y is a subclass of X and Z is a subclass of X, one of the following situations can occur, where --> stands for “a subclass of”:

If Y is also a subclass of Z then the hierarchy is

Y --> Z --> X

If Z is also a subclass of Y then the hierarchy is

Z --> Y --> X

If E(Z) = E(Y) then

{Y, Z} --> X and Y <--> Z, where <--> means “a sibling of”
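Conjectures #1 and #2 can be sketched as below, taking the superset test exactly as Conjecture #1 states it. The function names are hypothetical. Two choices are assumptions of the sketch: the equality case is tested first (since equal evidence sets satisfy both subclass tests), and two unrelated subclasses are simply reported as children of X, a case the text does not cover.

```python
def is_subclass(ev, y, x):
    """Conjecture #1: Y is a subclass of X if E(X) is a superset of E(Y)."""
    return ev[x] >= ev[y]


def arrange(ev, x, y, z):
    """Conjecture #2: order two subclasses Y and Z of X."""
    if not (is_subclass(ev, y, x) and is_subclass(ev, z, x)):
        raise ValueError("both Y and Z must be subclasses of X")
    if ev[y] == ev[z]:
        return f"{{{y}, {z}}} --> {x}; {y} <--> {z}"
    if is_subclass(ev, y, z):
        return f"{y} --> {z} --> {x}"
    if is_subclass(ev, z, y):
        return f"{z} --> {y} --> {x}"
    # Not covered by the conjecture: assumed here to be two children of X.
    return f"{{{y}, {z}}} --> {x}"
```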

5.10. CONJECTURE#3

Suppose a hierarchy is defined as follows:

{Y, Z, A} --> X,

{B, C} --> Y,

{D, E, F} --> Z,

{G, H} --> A,

Y <--> Z <--> A,

B <--> C,

D <--> E <--> F,

G <--> H

Say we have a new class I; where does I belong in the above hierarchy?

One of the following situations can occur:

(i) I can be a subclass of a tip node

To decide this, we use the “class proximity” criterion, which says that the tip node T whose evidence set E(T) is closest to E(I), based on the maximum number of common elements, is the superclass of I.

(ii) I can be a sibling of any node already existing in the hierarchy

To decide this, E(I) should be equal to E(T) for some node T.

Singularity: If E(I) = E(X) where X is the top node, then I does not belong in the hierarchy with X as the root.

(iii) I can be a tangled subclass of two nodes

This can happen when there are two nodes, T1 and T2, that have equal “class proximity” and I is a subclass of T1 as well as T2.

5.11. AXIOM #5: ESTABLISH-REFINE TABLE

An establish-refine table is composed of the following:

(i) explicit establish evidence patterns (ii) implicit establish evidence patterns (iii) explicit reject evidence patterns (iv) implicit reject evidence patterns

An example:

If e1=T & e2=F & e3=n/a then establish

else if e1=F & e2=n/a & e3=F then reject

else reject

The explicit establish pattern here is (e1=T e2=F e3=n/a); the explicit reject pattern is (e1=F e2=n/a e3=F). The implicit reject pattern is NOT [e1=T e2=F e3=n/a] AND NOT [e1=F e2=n/a e3=F].

Suppose instead of reject in the “else” part, there was establish. Then the implicit establish pattern would be: NOT [e1=T e2=F e3=n/a] AND NOT [e1=F e2=n/a e3=F]
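The example table can be evaluated with a few lines of code. This is a sketch under stated assumptions: the rule encoding, the function names, and the reading of n/a as "this evidence is ignored when matching" are choices made here for illustration.

```python
def matches(pattern, observed):
    """True if every evidence in the pattern (other than n/a) agrees."""
    return all(v == "n/a" or observed.get(e) == v for e, v in pattern.items())


def decide(rules, default, observed):
    """Return (action, 'explicit' or 'implicit') for the observed evidence."""
    for pattern, action in rules:
        if matches(pattern, observed):
            return action, "explicit"
    # No explicit pattern matched: the 'else' branch applies implicitly.
    return default, "implicit"


# The example establish-refine table from the text.
RULES = [
    ({"e1": "T", "e2": "F", "e3": "n/a"}, "establish"),
    ({"e1": "F", "e2": "n/a", "e3": "F"}, "reject"),
]
```

Any observation that matches neither explicit pattern falls through to the default action, which corresponds to the implicit pattern described above.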

5.12. AXIOM#6

The specificity of patterns is based on the number of common evidences.

Example:

If p1 = {e1 e2 e3} and p2 = {e1 e2 e3 e4 e5} then p1 is more specific than p2, or p2 is more general than p1.

If p1 = {e1 e2 e3}, p2 = {e1 e2 e3 e4 e5} and p3 = {e1 e2 e3 e4} then p1 is more specific than p3, or p3 is more general than p1; p3 is more specific than p2, or p2 is more general than p3.

5.13. CONJECTURE#4

Y is a subclass of X if:

(i) the reject patterns of X are “more specific” than the reject patterns of Y.

(ii) the establish patterns of X are “more general” than the establish patterns of Y.

5.14. CONJECTURE#5

If the reject patterns of both X and Y are explicit then Y is a subclass of X if the conditions in conjecture#4 are true.

5.15. CONJECTURE#6

If the reject patterns of X are explicit and the reject patterns of Y are implicit then Y is a subclass of X if, in addition to the conditions in conjecture#4, the following conditions are true:

(i) None of the reject patterns of Y is an establish pattern of X

(ii) None of the establish patterns of Y is a reject pattern of X

5.16. CONJECTURE#7

If the reject patterns of both X and Y are implicit then Y is a subclass of X if the conditions in conjectures#4 & 6 are valid.

5.17. CONJECTURE#8

If the establish-reject patterns of X and Y are implicit as well as explicit then Y is a subclass of X if, in addition to the conditions in conjectures#4 & 6, the following are true:

(i) None of the explicit establish patterns of Y is an explicit reject pattern of X

(ii) None of the implicit reject patterns of Y is an explicit establish pattern of X

5.18. AXIOM#7

Implicit-reject-pattern = [explicit-establish-patterns] U [explicit-reject-patterns]

5.19. AXIOM#8: DEGENERATED PREDICATES

(i) If X is structurally connected to Y as per domain theory then structurally-connected(x,y) is true

(ii) If X is causally connected to Y as per domain theory then causally-connected(x,y) is true

(iii) If X is manifested due to some behavior in Y then behavior-link(x,y) is true

(iv) If X is a part of Y as per domain theory then part-of(x,y) is true

(v) If X is a type of Y as per domain theory then type-of(x,y) is true

5.20. AXIOM#9: DOMAIN THEORY PREDICATES

Using Horn clauses to represent:

(i) X is a substructure of Y

substructure-of(x,y) :- structurally-connected(x,y) ∧ part-of(x,y)

(ii) X is a subfunction of Y

subfunction-of(x,y) :- causally-connected(x,y) ∧ behavior-link(x,y)

(iii) X is a subsystem of Y

subsystem-of(x,y) :- structurally-connected(x,y) ∧ causally-connected(x,y) ∧ part-of(x,y)

Note that “Degenerated Predicates” on the RHS of “Domain theory predicates” in axiom#8 are not restricted to structurally-connected, causally-connected, behavior-link and part-of; there could be more such predicates. However, these predicates cannot be expanded any further. And all of these predicates are assumed to be given. That is why they are called “degenerated predicates.”

5.21. AXIOM#10

Horn clause for “X is a subclass of Y”:

subclass(x,y) :- substructure-of(x,y) ∨ subsystem-of(x,y) ∨ subfunction-of(x,y) ∨ type-of(x,y) ∨ part-of(x,y)
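The predicates of Axioms #8 to #10 can be mimicked with a small fact base. In this sketch the facts are stored as `(predicate, x, y)` triples; that encoding, the function names, and the example facts are assumptions made for illustration only.

```python
def holds(facts, predicate, x, y):
    """A degenerated predicate is true only if it is given as a fact."""
    return (predicate, x, y) in facts


def substructure_of(facts, x, y):
    return (holds(facts, "structurally-connected", x, y)
            and holds(facts, "part-of", x, y))


def subfunction_of(facts, x, y):
    return (holds(facts, "causally-connected", x, y)
            and holds(facts, "behavior-link", x, y))


def subsystem_of(facts, x, y):
    return (holds(facts, "structurally-connected", x, y)
            and holds(facts, "causally-connected", x, y)
            and holds(facts, "part-of", x, y))


def subclass(facts, x, y):
    """Axiom #10: any one of the five relationships is sufficient."""
    return (substructure_of(facts, x, y) or subsystem_of(facts, x, y)
            or subfunction_of(facts, x, y)
            or holds(facts, "type-of", x, y)
            or holds(facts, "part-of", x, y))
```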

5.22. AXIOM#11

X is an “intermediate” class if the evidence set of X is derived from its subclasses. In other words, X is not an independent domain concept, but is derived by the intersection of evidence sets of other domain concepts.

5.23. CONJECTURE#9: EVIDENCE PARTITION FOR INTERMEDIATE CLASS

Suppose the set S = {a, b, c, ...} is such that each member in S has an association with X given in the form of domain theory predicates defined in axioms#8, 9, 10 and 11. X is a superclass of a, b, c, ... if:

(i) X does not have its own evidence set, i.e. X is an intermediate class. E(x) can be derived as:

E(x) = E(a) ∩ E(b) ∩ E(c) ...

(ii) X has its own evidence set E(x) and

EC = E(x) ∩ E(a) ∩ E(b) ∩ E(c) ... exists.

Singularity: If E(x) and EC are null then X cannot be a superclass of members of S. In such a case, if a partition (or subset) Si of S that gives non-null E(x) or EC can be found then X will be a superclass of members of Si.

Note that it is combinatorially explosive to find a feasible Si because the total number of partitions of S with n elements is: nCn + nC(n-1) + nC(n-2) + ... + nC3 + nC2 + nC1 = 2^n - 1.

However, for small n, say between 1 and 5, the total number of partitions is less than 32, which is manageable.
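The search for a feasible subset Si can be sketched as an exhaustive enumeration, which is practical only for small n as noted above. The function names are hypothetical, and trying the largest subsets first is a choice made for this sketch rather than something the text prescribes.

```python
from itertools import combinations


def nonempty_subsets(members):
    """Yield all 2^n - 1 non-empty subsets, largest first."""
    for size in range(len(members), 0, -1):
        yield from combinations(members, size)


def largest_feasible_subset(evidence, members, ev_x=None):
    """Find a subset Si whose evidence sets have a non-null intersection.

    evidence: dict mapping member -> evidence set.
    ev_x: X's own evidence set, or None if X is an intermediate class.
    Returns (Si, common evidence), or (None, set()) if nothing is feasible.
    """
    for subset in nonempty_subsets(sorted(members)):
        common = set.intersection(*(set(evidence[m]) for m in subset))
        if ev_x is not None:
            common &= set(ev_x)
        if common:
            return set(subset), common
    return None, set()
```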

5.24. THE HC1 ALGORITHM FOR THE CONSTRUCTION OF CLASSIFICATION HIERARCHIES

The algorithm for the automatic generation of a classification hierarchy is as follows.

5.24.1. Algorithm: HC1

Input: A set of concepts S and their evidence sets¹

Output: Classification hierarchy

For each concept C in S do the following:

a. Try to determine the superclass of C by applying domain theoretic relationships.

b. If C has no superclass based on domain theory then try to determine a superclass by using the ‘class specificity’ and ‘class proximity’ criteria.

c. If C still has no superclass then assign ‘root’ as the parent of C.

d. If Sc is C’s superclass, P is Sc’s parent, I is the intersection of the evidence-sets of Sc and C, and I^ is the intersection of I and P’s evidence-set, then do one of the following transformations:

(i) if P is root, then create an anchor class, as a child of root, for C and Sc, with I as the evidence-set, using HC-transformation#1 shown in Figure 13.

(ii) if P is an anchor class and I is the same as P’s evidence-set, then make C the sibling of Sc, with P as the parent, by HC-transformation#2 shown in Figure 13.

(iii) if P is an anchor class, and P’s evidence-set is a subset of I, then create an anchor class as a subclass of P with I as the evidence-set, for C and Sc, using HC-transformation#3 shown in Figure 13.

(iv) if P is an anchor class, and I is a subset of P’s evidence-set, then create an anchor class with I as the evidence-set, as a subclass of P’s parent P^, using HC-transformation#4, shown in Figure 14.

(v) if P is an anchor class, there is no subset relationship between I and P, and I^ is not null, then create an anchor class with I^ as the evidence-set, as a subclass of P’s parent P^, using HC-transformation#5, shown in Figure 14.

(vi) if P is an anchor class, there is no subset relationship between I and P, and I^ is null, then place C as a child of root.

Note that the topmost class of the classification hierarchy is called ‘root,’ and its evidence set could be null, but is usually the intersection of the evidence-sets of all the classes in S, if the intersection exists. It is used to initialize the classification hierarchy data structure.

1. Evidence set means a positive evidence set as defined in Axiom#3, from now on. The negative evidence set can also be used, without additional changes to the algorithm.
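The case analysis in step d can be sketched as a single function that reports which transformation applies and which evidence set a new anchor class would receive. This is not the HC1 program itself: the function name and return convention are hypothetical, and the sketch assumes that P is either the root or an anchor class, since the text does not say what happens otherwise.

```python
def choose_transformation(ev_c, ev_sc, ev_p, p_is_root=False):
    """Select the step-d case for concept C, superclass Sc and Sc's parent P.

    Returns (case, anchor_evidence); anchor_evidence is None when no
    new anchor class is created.
    """
    i = set(ev_c) & set(ev_sc)        # I: evidence common to C and Sc
    ev_p = set(ev_p)
    if p_is_root:
        return "HC-transformation#1", i
    if i == ev_p:
        return "HC-transformation#2", None   # C becomes a sibling of Sc
    if ev_p < i:                      # P's evidence is a proper subset of I
        return "HC-transformation#3", i
    if i < ev_p:                      # I is a proper subset of P's evidence
        return "HC-transformation#4", i
    i_hat = i & ev_p                  # the second intersection, I with P
    if i_hat:
        return "HC-transformation#5", i_hat
    return "child-of-root", None      # case (vi)
```

The equality case is tested before the two subset cases so that cases (ii), (iii) and (iv) remain mutually exclusive.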

To illustrate HC1, the concepts shown in Figure 15 are used as input. The resultant hierarchy generated by HC1 is shown in Figure 16. HC1 created some anchor classes: A1, A2, A3,

Figure 13. HC-Transformations#1,2,3

Figure 14. HC-Transformations#4,5

(defun init-evidence-table ()
  (let ()
    (setf root 'R)
    ;; declaration of classes
    (setf classes '(a b c d e f g h i j k l m n o p))
    ;; Evidence-Sets of Classes
    (setf (get 'a 'evidence-set) '(x1 x4 x5 x6))
    (setf (get 'b 'evidence-set) '(x1 x6 x7))
    (setf (get 'c 'evidence-set) '(x1 x2 x3 x4))
    (setf (get 'd 'evidence-set) '(x4 x8))
    (setf (get 'e 'evidence-set) '(x5 x9))
    (setf (get 'f 'evidence-set) '(x6 x10))
    (setf (get 'g 'evidence-set) '(x7 x10 x11))
    (setf (get 'h 'evidence-set) '(x2 x3 x12))
    (setf (get 'i 'evidence-set) '(x4 x7 x13))
    (setf (get 'j 'evidence-set) '(x10 x14))
    (setf (get 'k 'evidence-set) '(x11 x15))
    (setf (get 'l 'evidence-set) '(x16 x17))
    (setf (get 'm 'evidence-set) '(x15 x1))
    (setf (get 'n 'evidence-set) '(x15 x18 x19 x20))
    (setf (get 'o 'evidence-set) '(x10 x21 x22 x23))
    (setf (get 'p 'evidence-set) '(x10 x24 x25 x26 x27))))

Figure 15. HC1 Illustration#1: Input

Figure 16. Hierarchy created by HC1 for input shown in Figure 15

A4, A5, A6, A7, A8, and A9. The evidence-sets of these classes are: A1=(x10), A2=(x1), A3=(x10), A4=(x4), A5=(x1 x6), A6=(x2 x3), A7=(x15), A8=(x10), A9=(x10).

5.25. HC1 ILLUSTRATION FOR A REAL-WORLD PROBLEM

For illustration on a real-world problem, a process diagnosis problem is chosen. A classification hierarchy for diagnosing faults with a Fluid Catalytic Cracking (FCC) unit in a petroleum refinery was previously developed (see Figure 17) (Ramesh, 1989). The various evidences used for establishing the faults are shown in Appendix B.

5.25.1. Case Study

Figure 18 shows the input for this case study. The input contains a set of classes and their evidence sets (for a complete explanation of evidence sets see Appendix C). The output of HC1 is shown in Figure 19. The complete trace of this case study is also shown in Appendix C.

The hierarchy created has fewer levels of depth than the original hierarchy. Overall, it is not a disappointing performance by HC1. This outcome is expected, because we have not given HC1 all of the domain relationships. On the other hand, the hierarchy created by HC1 provides new grounds to criticize the KE’s hierarchy:

• Some classes in the hierarchy have no common evidences with their parents. Perhaps they are there in the hierarchy because of semantic or contextual links created by the knowledge engineer.

• Some classes in one branch of the hierarchy have more common evidences with a class in another branch of the hierarchy than with their direct parents. It would have been computationally efficient if these classes had extra “subclass” links and the establish-refine algorithm applied on the hierarchy could handle these “subclass” links (some recent versions of HYPER and MATCHER may have this feature).

Figure 17. Classification hierarchy for FCC (KE’s version)

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Input data for case-study#1
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun init-evidence-table ()
  (let ()
    (setf root 'x)
    (setf (get 'x 'evidence-set) '(e11))
    (setf classes '(wear motor surface-condensers aflow-meter vent-open
                    regen-grid rtemp-controller rtemp-setpoint hydraulic-system
                    regen-cat-slide-valve instrumentation dirt ambient-conditions
                    intake-plugging torchoil-flow blind-leak))
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;; following are evidence sets of classes
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    (setf (get 'instrumentation 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e40 e41 e42))
    (setf (get 'hydraulic-system 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e16 e17 e18))
    (setf (get 'regen-cat-slide-valve 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e19 e20))
    (setf (get 'rtemp-setpoint 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e21))
    (setf (get 'rtemp-controller 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e22))
    (setf (get 'regen-grid 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e27 e28 e29))
    (setf (get 'vent-open 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e30))
    (setf (get 'aflow-meter 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e31))
    (setf (get 'motor 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e33 e34 e35))
    (setf (get 'wear 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e34 e33 e38 e39))
    (setf (get 'surface-condensers 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e34 e36 e37))
    (setf (get 'dirt 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e43))
    (setf (get 'ambient-conditions 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e44))
    (setf (get 'intake-plugging 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e45))
    (setf (get 'torchoil-flow 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e28 e46 e47 e48))
    (setf (get 'blind-leak 'evidence-set)
          '(e1 e2 e3 e4 e5 e6 e7 e8 e28 e46 e47 e49))))

Figure 18. Input to HC1 for the FCC problem

Figure 19. Hierarchy generated by HC1 for Case#1

• The semantic or contextual links do not take full advantage of the efficiency offered by the establish-refine algorithm while searching classification hierarchies. Probably, the KE wanted to group-and-reject classes rather than reject classes because a group of classes lack common evidences to be classified together.

5.26. CONCLUSIONS

The HCl program described in this chapter offers an easy way to handle the complexity in creating classification hierarchies using evidence sets. The HCl program replaces the “art” of generating classification hierarchies with a set of clear guidelines based on the axioms and conjectures presented in this chapter. In comparison to deep-model approach, where a classification hierarchy is compiled from a deep model using knowledge specified implicitly within the deep model, HCl has a far simpler representation and consequently, has far lesser computational complexity. As illustrated for a diagnosis problem, HCl can assist a knowledge engineer during the conceptualization of a classification hierarchy, especially when domain theory to decide on the best classification hierarchy is lacking or insufficient.

The practical utility of HCl lies in problems like diagnosis and selection that involve a hierarchical search space. Also, HCl can function as a useful tool during knowledge acquisition for developing classification hierarchies automatically. The various criteria in HCl speed the knowledge acquisition process and make the necessary domain theory for creating classification hierarchies explicit without the help of a deep model.

CHAPTER VI

CASE-BASED REASONING APPROACH TO SYNTHESIS OF PROCESS FLOWSHEETS

6.1. INTRODUCTION

In this chapter a case-based reasoning (CBR) approach to process synthesis is described. CBR makes use of real design cases that are usually available in the synthesis domain. Case-based reasoning has also been used for prediction of protein structures (Zhang, et al., 1989), diagnosis (Redmont, 1989) and design (Navinchandra, 1988; Goel, 1989). The choice of CBR for process synthesis is motivated by several reasons.

• New processes are frequently developed by modifying existing processes. Upgrading process plants to take advantage of the latest technology also involves transformations on an existing process. These transformations can be substitution of an old process with a new process, addition of a new process or deletion of an old process. An example of case-based addition is improving the purity of a distillation column overheads by adding a membrane separation unit in the reflux.

• Seldom are new designs developed from scratch. For instance, if a hexagonal cross-section pipe has to be designed, then the well known design procedure of a circular cross-section pipe is adapted.

• Decisions made while doing a current design are often justified by the successes and failures in the past designs. For instance, a leak detector in an evaporator concentrating a hazardous solution is justified based on an accident a month ago at a hazardous liquid storage tank unit without a leak detector.

Reasoning by analogy is one of the ways to produce creative designs. A case from a different domain context may be employed for solving a design problem. For example, a case of laser application in a biology experiment to sterilize a sample without damaging the biological cells may be adapted for the design of feed preheaters in a process that manufactures speciality chemicals, to avoid the fouling problems associated with conventional heat exchangers.

6.2. CASE-BASED REASONING VS. TASK-BASED APPROACH

It is illustrative to compare case-based reasoning with the task-based approach for process synthesis. The major difference is that the knowledge base of a case-based reasoner contains past design solutions called cases. Cases contain features or indexes. For process synthesis problem solving a case is a flowsheet composed of process steps/unit operations and their inter-connections, raw materials, products, substance compositions, and process conditions.
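The case structure described above can be pictured as a small record type. The following is a minimal sketch in Python, not the representation used in CBR-ProcSyn; the class name, the field names, and the listed unit operations are assumptions made for illustration. Only the Mont-Cenis convertor pressure of 10,000 kPa comes from the case study later in this chapter.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessCase:
    """One stored design case: a process and the features that index it."""
    name: str
    inputs: frozenset                                  # raw materials / feed substances
    outputs: frozenset                                 # product substances
    conditions: dict = field(default_factory=dict)     # process conditions
    steps: tuple = ()                                  # unit operations, in order

    def features(self):
        """Flatten the case into feature -> value pairs usable as indexes."""
        feats = {"inputs": self.inputs, "outputs": self.outputs, "steps": self.steps}
        feats.update(self.conditions)
        return feats


# Hypothetical entry; the unit operations listed are placeholders.
mcp = ProcessCase(
    name="Mont-Cenis process",
    inputs=frozenset({"nitrogen", "hydrogen"}),
    outputs=frozenset({"ammonia"}),
    conditions={"pressure_kPa": 10000},
    steps=("compressor", "convertor", "condenser"),
)
```

Flattening a case into feature-value pairs is one convenient way to support the feature matching discussed in Section 6.4.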

The task-based methods are abstractions drawn from problem solving occurring in similar problems across different domains, and thus can be regarded as empirical or heuristic. For example, hierarchical classification and structured pattern matching are abstracted from medical and engineering diagnosis problem solving. In comparison, CBR does not distinguish among particular tasks or problems being solved; it relies on the domain-specific experiential knowledge represented as cases.

The output from the case-based problem solving can be fed back into the case knowledge base. That is, the output can be treated like just another case. In this sense, the system can learn. Learning is more complicated in the task-based system because the knowledge is represented at a higher level of abstraction than the data (system input) or the results (system output).

A case is a trace of a problem solving experience. For a CBR to apply this experience to a similar problem, efficient methods for matching, retrieval and modification are required. It is here that the generic task-based methods can be applied. The rest of this chapter makes clear the integration of task-based methods with CBR and demonstrates the integrated approach for process synthesis problem solving.

6.3. CONTROL STRUCTURE OF CASE-BASED REASONER FOR PROCESS SYNTHESIS (CBR-PROCSYN)

The control structure describing the input-output behavior of a case-based reasoner for process synthesis problem solving, called CBR-ProcSyn, is described here. We assume that the case-based reasoner is solving the following process synthesis problem:

Given a set of specifications about raw materials and products as constraints, synthesize a process flowsheet using a knowledge-base of process cases, and the case-based reasoning strategies.

In the process synthesis problem solving the constraints are: selection-, synthesis-, interaction- or refinement-type. Additional constraints may be about the substances, compositions, and process conditions, as presented in Chapters 2 & 3. Following are examples of these constraints:

6.3.1. Constraints on Selection of Processes

The process should use distillation for purifying product

The chemical reaction has to be carried out in a fluidized bed reactor

6.3.2. Constraints on Synthesis of Processes

All the heat exchangers should be integrated for heat energy efficiency.

The fermentation processes should operate in a fed-batch mode whereas the product purification processes should be continuous

6.3.3. Constraints on Input/Output Substances and Compositions.

The input to the ammonia process should be 40% nitrogen and 60% hydrogen feed mixture

The output of the process should be 99% pure ammonia product

6.3.4. Constraints on Process Conditions.

The ammonia process should operate under lowest possible pressure conditions

The temperature of the ammonia process should not exceed 1000 K

The CBR-ProcSyn input-output specification is as follows.

6.3.5. Algorithm: CBR-ProcSyn

Input: State of the raw materials

Output: A feasible flowsheet

The control structure of CBR-ProcSyn is:

RETRIEVAL:

1. Find a set of process cases P in the knowledge base that operate on the initial state or the raw materials

[The following step is executed recursively]

2. For each process Q in P do the following:

SELECTION:

2. (a) Apply process synthesis constraints

2. (b) If Q satisfies the constraints then go to step 2(d)

MODIFICATION:

2. (c) If Q doesn’t satisfy the constraints, then modify Q using the CBR modification strategies (described in a subsequent section). After modification, if the process synthesis constraints are still not satisfied, then reject Q and return to step 2; else continue.

SYNTHESIS:

2. (d) Test Q’s outputs for final product specification (constraint). If Q’s outputs match the final product specification then add Q to the partial flowsheet structure F. Print F as a feasible flowsheet; otherwise continue.

SELECTION:

2. (e) Find a set of process cases R in the knowledge base that operate on the outputs of Q. If there is no process case (i.e., R is a null set) then reject Q and return to 2.

SYNTHESIS:

2. (f) If R is not a null set then add Q to the partial flowsheet structure F. Go to step 2 by setting

P = R

(Thus step 2 is executed recursively until all applicable processes have been tried).
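The control structure above can be read as a recursive depth-first search over process cases. The sketch below, in Python, is one possible reading under stated assumptions and is not the CBR-ProcSyn implementation: cases are plain dictionaries, a case "operates on" a state when its inputs equal that state (a simplification), and the functions `satisfies` and `modify` stand in for the constraint tests and the modification strategies. The three cases, the pressure limit, and the cycle guard are hypothetical toy choices.

```python
def cbr_procsyn(state, goal, cases, satisfies, modify, flowsheet=()):
    """Yield feasible flowsheets (tuples of case names) leading from state to goal.

    cases     : list of dicts with "name", "inputs", "outputs" (sets of substances)
    satisfies : callable(case) -> bool, the process synthesis constraints
    modify    : callable(case) -> repaired case, or None if it cannot be repaired
    """
    # Steps 1 and 2(e): retrieve the cases that operate on the current state.
    retrieved = [c for c in cases if c["inputs"] == state]
    for q in retrieved:                          # step 2
        if q["name"] in flowsheet:               # cycle guard (an added assumption)
            continue
        if not satisfies(q):                     # steps 2(a) and 2(b)
            q = modify(q)                        # step 2(c)
            if q is None or not satisfies(q):
                continue                         # reject Q, try the next case
        partial = flowsheet + (q["name"],)
        if goal <= q["outputs"]:                 # step 2(d): product specification met
            yield partial
            continue
        # Steps 2(e) and 2(f): recurse with P = R. An empty R yields nothing,
        # which amounts to rejecting Q.
        yield from cbr_procsyn(q["outputs"], goal, cases, satisfies, modify, partial)


# Toy knowledge base; substances and pressures are illustrative only.
cases = [
    {"name": "gasification", "inputs": {"coal"},
     "outputs": {"hydrogen", "nitrogen", "carbon-dioxide"}},
    {"name": "co2-removal", "inputs": {"hydrogen", "nitrogen", "carbon-dioxide"},
     "outputs": {"hydrogen", "nitrogen"}},
    {"name": "ammonia-synthesis", "inputs": {"hydrogen", "nitrogen"},
     "outputs": {"ammonia"}, "pressure_kPa": 100000},
]
satisfies = lambda c: c.get("pressure_kPa", 0) <= 35000    # hypothetical constraint
modify = lambda c: {**c, "pressure_kPa": 10000}            # hypothetical repair

flowsheets = list(cbr_procsyn({"coal"}, {"ammonia"}, cases, satisfies, modify))
```

Written as a generator, the sketch enumerates every feasible flowsheet rather than stopping at the first one, which matches the remark that step 2 is repeated until all applicable processes have been tried.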

The interpretation is as follows:

1 and 2(e) are ‘retrieval’ steps as they involve search in the knowledge base for a feasible process; 2(a) is a ‘selection’ step because the retrieved case is selected by application of process synthesis constraints; 2(c) is a ‘modification’ step as it tries to modify the selected process; finally, 2(d) and 2(f) involve creating a partial flowsheet structure, so they are ‘synthesis’ steps.

In sum, the four major steps in the control structure of a case-based reasoner for process synthesis problem solving are: retrieval, selection, modification and synthesis. We now describe each of these steps in more detail.

6.4. RETRIEVAL AND SELECTION OF PROCESS CASES

The problem of case-based retrieval is stated as:

Find a set of process cases in the knowledge-base satisfying input/output and functional specifications of the process synthesis problem.

The problem of case-based selection is stated as:

Given a set of retrieved cases, rank them to determine the ‘best case.’

Methods for retrieval and selection of process cases

It is computationally efficient to represent the various process synthesis cases as a classification hierarchy for retrieval and selection steps (Goel, 1989). The processes to make ammonia are arranged in a classification hierarchy (see Figure 20) as: Mont-Cenis process, Claude process and Haber-Bosch process are children of ‘ammonia processes’; and ‘ammonia processes’ is a child of ‘fertilizer processes.’

The establish-refine procedure (described in chapter 2) is then applied on the classification hierarchy of process cases. The goal of the establish-refine procedure is to test the various constraints embedded in the classification hierarchy and assign a score to each case.
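As a rough illustration, establish-refine over the ammonia hierarchy can be sketched in Python as a top-down traversal that prunes any rejected node together with its subtree. This is a simplified sketch: the `establish` test is a hypothetical yes/no predicate, whereas the procedure described in the text also assigns scores.

```python
HIERARCHY = {
    "fertilizer processes": ["ammonia processes"],
    "ammonia processes": ["Mont-Cenis process", "Claude process", "Haber-Bosch process"],
}


def establish_refine(node, establish, hierarchy=HIERARCHY):
    """Return the leaf cases reached by establishing nodes from the top down.

    establish : callable(node) -> bool; a rejected node prunes its whole subtree.
    """
    if not establish(node):
        return []                        # reject the node and everything below it
    children = hierarchy.get(node, [])
    if not children:
        return [node]                    # an established leaf is a retrieved case
    found = []
    for child in children:               # refine: try each child in turn
        found.extend(establish_refine(child, establish, hierarchy))
    return found


# Hypothetical test that rules out one leaf.
retrieved = establish_refine("fertilizer processes", lambda n: n != "Claude process")
```

The efficiency gain comes from the pruning: rejecting ‘ammonia processes’ would discard all three leaf cases with a single test.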

Fertilizer Processes
    Ammonia Process
        Haber-Bosch Process
        Mont-Cenis Process
        Claude Process

Figure 20: Classification of cases

The retrieval is done by matching the features in the input case with those of the old case; in other words, by establishing and refining the classification hierarchy. The following types of cases may be retrieved after classification:

• An Under-Constrained Case: the old case has additional features the input case does not have.

• An Over-Constrained Case: the input has additional features the old case does not have.

The under-constrained and over-constrained cases may contain features whose values cause the constraints to fail. So in general we have the following types of retrieved cases:

• Type 1: Under-constrained cases without constraint failure

• Type 2: Under-constrained cases with constraint failure

• Type 3: Over-constrained cases without constraint failure

• Type 4: Over-constrained cases with constraint failure

After retrieval of process cases, the next step is to select the cases by ranking. If the retrieval step returns ‘exactly’ matched cases, then the selection step is straightforward because any exactly matched case, by definition, is best suited for modification. When no exactly matched cases are retrieved, the cases are ranked by assigning scores to retrieved cases. The scoring procedure is shown in Figure 21. The case receiving the highest score is selected and used in the following modification step.

6.5. MODIFICATION OF PROCESS CASES

The modification problem can be stated as:

Given a process case P not meeting the input constraints of the process synthesis problem, such as the four types of retrieved cases, modify P’s features by: addition, deletion and modification, to satisfy the process synthesis constraints.

In the following sections the addition, deletion and modification problems are defined.

Because modification requires changing various process features like inputs, outputs, substances, process conditions, etc. to meet the process synthesis constraints, it is a constraint-satisfaction problem.

Three different scenarios can prevent exact matches during retrieval:
1. Features present in the old case are absent in the input case
2. Features present in the input case are absent in the old case
3. Values of features in input case and old case do not agree
These three scenarios are also shown in the following table:

            Old Case              Input Case
        Feature    Value      Feature    Value
I:      Present    v1         Absent     —
II:     Absent     —          Present    v2
III:    Present    v1         Present    v2

F-Flag and V-Flag are ON if only "exact match" cases are to be retrieved. For the retrieval of "partial match" cases F-Flag and V-Flag are OFF.

I & II:
If F-Flag is ON then reject old case
If F-Flag is OFF then
    If V-Flag is OFF then
        If v1 or v2 are not known OR v1 is different from v2
        then either reject old case or give partial match score to the feature
    If V-Flag is ON then
        If v1 = v2 then accept old case
        Else reject old case

III:
If V-Flag is ON then
    If v1 = v2 then accept old case
    Else reject old case
If V-Flag is OFF then same as in I & II above

Score of a partially matched old case = (Sum of the scores of features) / (Sum of the maximum scores of features)

Figure 21: Scoring procedure for selecting cases
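The scoring procedure of Figure 21 can be sketched in Python as follows. This is an illustrative reading only: the maximum score per feature is taken as 1.0, the partial-match credit of 0.5 is a hypothetical value, and where the figure allows either rejecting or partially scoring a feature the sketch always chooses the partial score.

```python
def feature_score(old, new, name, f_flag, v_flag, partial=0.5):
    """Score one feature of an old case against the input case; None means reject."""
    if name in old and name in new:          # scenario III: present in both cases
        if old[name] == new[name]:
            return 1.0
        return None if v_flag else partial
    # Scenarios I and II: the feature is missing on one side, so one value is unknown.
    if f_flag or v_flag:
        return None
    return partial


def case_score(old, new, f_flag=False, v_flag=False):
    """Sum of feature scores divided by the sum of maximum feature scores."""
    names = sorted(set(old) | set(new))
    if not names:
        return None                          # nothing to compare
    scores = [feature_score(old, new, n, f_flag, v_flag) for n in names]
    if any(s is None for s in scores):
        return None                          # old case rejected
    return sum(scores) / (1.0 * len(names))


# Hypothetical feature sets for an old case and an input case.
old_case = {"feed": "hydrogen+nitrogen", "pressure_kPa": 10000, "recovery": "none"}
input_case = {"feed": "hydrogen+nitrogen", "pressure_kPa": 35000}
score = case_score(old_case, input_case)     # flags OFF: partial matching allowed
```

With both flags ON the same pair is rejected outright, which corresponds to retrieving exact matches only.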

6.5.1. Addition of Process Features

Add, if possible, features like inputs, outputs, substances, process conditions, or even another process to a retrieved Type-1 or Type-2 case (under-constrained case without/with constraint failure) so that constraints on these features can be satisfied.

As an example of addition of process features, suppose a process synthesis problem requires 99% pure hydrogen from a mixture of hydrogen, methane, nitrogen and carbon-dioxide, and a membrane separation process case that can make 99% pure hydrogen product from a mixture of hydrogen, carbon-dioxide and nitrogen is retrieved. The modification step is required to separate hydrogen from the extra methane component that is not present in the retrieved case. If it is possible to separate hydrogen at the desired concentration using the membrane separation process alone, the required modification is then simply adding the extra methane component to the inputs of membrane separation; otherwise, by adding another separation process at the output of the membrane separation, say a pressure adsorption process to separate hydrogen from methane, the 99% pure hydrogen product constraint is met.

6.5.2. Deletion of Process Features

Delete, if possible, features like inputs, outputs, substances and process conditions, from a retrieved Type-3 or Type-4 process case (over-constrained case without/with constraint failure) so that constraints on these features can be satisfied.

As an example of deletion of process features, consider a retrieved CSTR process case for an exothermic reaction. If the reaction in the desired process has no heat effects, then the cooling system of the CSTR is unnecessary. Therefore, the cooling system can be deleted from the retrieved CSTR process case.

6.5.3. Modification of Process Features

Modify, if possible, existing features like inputs, outputs, substances and process conditions, of any retrieved process cases so that constraints on these features are satisfied.

As an example of modifying a process condition, suppose the operating temperature of a coal gasification process retrieved by the CBR, for a certain type of coal, is 1700 degrees C, and the coal to be gasified in the desired process has a lower temperature of ignition. Then the operating temperature of the retrieved coal gasification process is modified to suit the coal to be gasified in the desired process.
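The three operations of Sections 6.5.1 to 6.5.3 can be sketched in Python as functions that return an edited copy of a case, leaving the stored case untouched. The dictionary representation, the feature names, and the new temperature value are assumptions made for illustration.

```python
def add_feature(case, name, value):
    """Addition: return a copy of the case with one more feature."""
    return {**case, name: value}


def delete_feature(case, name):
    """Deletion: return a copy of the case without the named feature."""
    return {k: v for k, v in case.items() if k != name}


def modify_feature(case, name, value):
    """Modification: return a copy with an existing feature's value changed."""
    if name not in case:
        raise KeyError(name)
    return {**case, name: value}


# Hypothetical cases mirroring the three examples in the text.
membrane = {"inputs": ("hydrogen", "carbon-dioxide", "nitrogen")}
cstr = {"reaction": "exothermic", "cooling_system": True}
gasifier = {"temperature_C": 1700}

membrane_plus = add_feature(membrane, "extra_input", "methane")
cstr_no_cooling = delete_feature(cstr, "cooling_system")
gasifier_cooler = modify_feature(gasifier, "temperature_C", 1300)   # illustrative value
```

Returning copies keeps the original case available in the knowledge base while the modified version is tested against the constraints.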

6.6. SYNTHESIS OF CASES

A process P, after retrieval, selection and modification, has to be integrated into a flowsheet subject to synthesis constraints on the flowsheet like interactions between P and the rest of the processes in the flowsheet and continuous processing of substances.

As an example of a synthesis constraint in bioprocesses, suppose a fermentor product has to be held in a tank for product recovery at a certain temperature. If the fermentation reaction is exothermic (endothermic), then the fermentor output is hot (cold). So, while connecting the fermentor output with the storage tank, the output is cooled (heated) in a heat exchanger.

6.7. EXTENSIONS TO CASE-BASED REASONER FOR PROCESS SYNTHESIS

Critiquing and learning are extensions to the basic CBR-ProcSyn algorithm described.

During critiquing, critics are employed to signal potential constraint failures in the flowsheets. This way, critiquing serves the useful function of evaluating the flowsheets proposed by the CBR.

Critiquing is also used in the modification step, where, after a critic is triggered by a constraint failure, a particular modification procedure is applied. If critiquing is done after modification, then potential constraint failures in the flowsheet are identified. A potential constraint failure, in this context, may arise because a process will not work in the real world.

Constraint failure handling works best interactively while a process is actually being implemented. The failure handling procedure here is more of an anticipatory type than an interactive type. In anticipatory failure handling, to the extent the designer already knows about a failure condition, the design solution is appropriately modified.

The objectives of case-based learning are:

• To augment the case knowledge base with freshly created design solutions.

• To index redesign or design modification knowledge to recover from failures based on the failure handling experience.

• To generate a “Design History” which is a record of all design commitments made during a design episode and their justifications, for explaining the design problem solution.

The learning component of the CBR-ProcSyn serves these objectives by feeding back problem-solving experience to the case knowledge-base under the supervision of the user.

6.8. CASE-BASED REASONER FOR AMMONIA PROCESS SYNTHESIS

Morari & Grossman (ref) have described a preliminary design of an ammonia process that involves the selection, synthesis, design and simulation of various unit operations to synthesize ammonia. While Morari & Grossman carried out the design and simulation using ASPEN, a computer-aided design (CAD) package, they presented only a qualitative discussion of the selection and synthesis problems. The CBR-ProcSyn approach, however, can solve such problems, which are not amenable to conventional CAD packages. CBR-ProcSyn is thus useful as a front-end to a conventional CAD package.

The two case studies in the next section illustrate the application of CBR-ProcSyn to:

(i) synthesize an ammonia process starting from nitrogen and hydrogen as starting raw material

In this case-study CBR-ProcSyn creates a flowsheet by selecting and synthesizing the required processes to make ammonia.

(ii) synthesize a process to make the starting raw materials in case-study#1, nitrogen and hydrogen, from basic raw materials.

In this case-study, besides meeting the basic ammonia process specifications starting from a raw material consisting of a nitrogen and hydrogen mixture, the CBR will also consider additional constraints for the generation of ammonia process raw material.

While case-study#1 demonstrates the case-based reasoning for process synthesis, case-study#2 shows the flexibility of CBR-ProcSyn in specifying the initial state (defined by the starting raw material). Alternatively, it is possible to specify various goal states that involve products of ammonia. This flexibility is because of the systematic matching and retrieval capabilities of CBR-ProcSyn.

6.8.1. Ammonia Process Synthesis Case Study#1

This case-study illustrates how the CBR-ProcSyn can synthesize a new process by retrofitting an existing process. The objectives of the case-study are to first synthesize an ammonia process; next, retrofit the ammonia process with a hydrogen recovery process; and, finally, integrate the two processes, that is, recycle the recovered hydrogen by mixing it with the feed to the ammonia convertor. Before the case-study is described, the trade-offs in synthesizing a new ammonia process are made clear.

Ammonia is made from hydrogen and nitrogen. The kinetics of the ammonia reaction are extremely sensitive to pressure. At high pressure (~100,000 kPa) the ammonia-conversion is 65%, but a higher energy cost is incurred for compressing the feed, and safety is at risk. At low to medium pressures (10,000-35,000 kPa), the compression cost is lower and the risk of operation is minimized, though the single-pass ammonia-conversion is only 35%. So a product recycle and purge around the ammonia convertor are required to increase overall ammonia-conversion at low pressure. The recycle and purge streams are made of hydrogen, nitrogen, ammonia, and traces of methane, sulfur-dioxide and water that come with the feed streams. Since hydrogen is the most expensive component in the purge stream, it pays to recover the hydrogen in the purge stream for recycle.

CBR-ProcSyn begins solving this problem by searching for a process that can produce ammonia from nitrogen and hydrogen feedstock with hydrogen recovery. The search results in the retrieval of the following under-constrained process cases with constraint failure because none of these processes has hydrogen recovery (see Appendix D for the case knowledge-base).

(a) Claude process (CP)

(b) Mont-Cenis process (MCP)

(c) Haber-Bosch process (HBP)

During selection, the constraint applied is low pressure for operating the ammonia convertor. This results in the selection of MCP, which has the lowest convertor pressure (10,000 kPa) among the three retrieved process cases.

The next step is to modify the MCP process for recovering hydrogen from the rest of the purge stream: nitrogen, methane, ammonia, sulfur-dioxide and water. A search in the case knowledge base fails to retrieve any process. In reality, this open-ended multicomponent separation problem is very complex, requiring combinatorial optimization methods (see Appendix C). A new method has been developed using a Genetic Algorithm (GA) to solve this problem. The GA method is described in Appendix C. The GA method, however, requires inputs of the form: whether component X can be separated from component Y in a multicomponent mixture [A, B, ... X, Y ...].

So CBR-ProcSyn searches for separation processes where individual components in the purge stream are handled, resulting in the retrieval of several processes:

(a) An absorption process where sulfur-dioxide is separated from air

(b) An absorption process where ammonia is separated from air

(c) A membrane separation process where methane is separated from aromatic compounds

These processes are modified by making a substitution on their feeds to provide the inputs to GA as:

(i) Sulfur-dioxide and ammonia can be separated from the purge by absorption

(ii) Hydrogen can be separated from nitrogen, methane and water (but not ammonia and sulfur-dioxide) by membrane separation

The GA using these inputs, then, provides several feasible alternatives for recovering hydrogen from the purge.
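The form of the inputs handed to the GA, "can component X be separated from this mixture by this method", can be encoded as a small lookup. The Python sketch below transcribes statements (i) and (ii) and checks one ordering of the two separations by hand; it is not the GA of Appendix C, and the table layout and function names are assumptions.

```python
# Components each method can remove, from statements (i) and (ii).
SEPARATIONS = {
    "absorption": {"sulfur-dioxide", "ammonia"},
    "membrane": {"hydrogen"},
}
# Per (ii), the membrane cannot split hydrogen from ammonia or sulfur-dioxide.
BLOCKED_BY = {
    "absorption": set(),
    "membrane": {"ammonia", "sulfur-dioxide"},
}


def can_separate(method, x, mixture):
    """True if method can separate component x from the rest of mixture."""
    others = mixture - {x}
    return (x in mixture
            and x in SEPARATIONS[method]
            and not (BLOCKED_BY[method] & others))


def separate(method, mixture):
    """Return (remaining mixture, removed components) after applying method."""
    removed = {x for x in mixture if can_separate(method, x, mixture)}
    return mixture - removed, removed


purge = {"hydrogen", "nitrogen", "methane", "ammonia", "sulfur-dioxide", "water"}
after_absorption, absorbed = separate("absorption", purge)
after_membrane, recovered = separate("membrane", after_absorption)
```

Under these inputs the membrane step only becomes feasible once absorption has removed ammonia and sulfur-dioxide, which is the kind of ordering constraint a sequencing method has to respect.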

Thus, here GA is used by CBR as a modification method. One important note though: the feasible alternatives may not really work, because CBR-ProcSyn cannot determine their workability using only the cases in the knowledge base, and it is not known if a membrane to separate hydrogen from the rest of the inputs is really available. The uncertainty here comes from the inputs CBR supplied to the GA procedure, whereas the GA method itself is exact. To resolve the uncertainty in this case, the inputs to GA should be validated by other means like heuristics.

After synthesizing the hydrogen recovery process, CBR-ProcSyn proceeds with the modification of MCP for retrofitting the ammonia process with hydrogen recovery. The purge streams of MCP are connected to the input of the hydrogen recovery process, and the output streams from the hydrogen recovery process are combined with recycle streams to the ammonia convertor. The two finally retrofitted ammonia processes with hydrogen recovery are shown in Figure 22.

6.8.2. Ammonia Process Synthesis Case Study#2

In this case-study, the CBR-ProcSyn is illustrated for the synthesis of a new process to make ammonia starting from coal. Note that none of the previously retrieved ammonia processes can make ammonia from coal. So the CBR-ProcSyn is required to synthesize the necessary feedstocks for the ammonia process synthesized in case-study#1; thus this is a more complicated example of retrofitting.

CBR-ProcSyn begins problem solving by searching for processes that can operate on coal, and succeeds by retrieving the following processes:

(a) Lurgi Coal Gasification Process (LCGP)

(b) Lurgi Pressure Gasification of Coal (LPGC)

(c) Texaco Coal Gasification process (TCGP)

During selection, the constraint applied is the yield of hydrogen, because hydrogen is one of the feeds for the ammonia process. The processes are ranked as: (i) TCGP (ii) LPGC (iii) LCGP. So TCGP is selected.

In the modification step the temperature of TCGP is set at 1300 degree C and the inputs and outputs of TCGP are added as:

Input: Output:

[Figure 22: flowsheet diagram. Labels recovered from the original: Hydrogen + Nitrogen Feed, CW, REF, Convertor, Ammonia Product, Purge, Hydrogen, Sulfur-dioxide, Membrane Banks, Dryer, Water, Methane.]

Figure 22: Ammonia process with hydrogen recovery

Among the outputs of TCGP, only nitrogen and hydrogen are the required feeds for ammonia and the rest of the gases in the mixture have to be removed. So the CBR-ProcSyn searches for a process to remove carbon-dioxide first. The result of the search is the retrieval of an exactly matched process case, Girbatol, with the following inputs and outputs:

Inputs: 1.

2.

Outputs: 1.

2.

The inputs and outputs of the Girbatol process are modified as:

Inputs: 1.

2.

Outputs: 1.

2.

And Girbatol process is integrated with TCGP.

Among the outputs of Girbatol, still g-carbon-monoxide and g-hydrogen-sulfide have to be removed. So a search is conducted to remove either one or both of these gases. The search results in the retrieval of Rectisol process whose inputs and outputs are:

Inputs: 1.

2.

Outputs: 1.

2.

The inputs and outputs of Rectisol are modified as:

Inputs: 1.

2.

Outputs: 1.

2.

And Rectisol process is integrated with TCGP and Girbatol.

Among the outputs of Rectisol, g-carbon-monoxide has to be removed, so the CBR-ProcSyn conducts a search in the knowledge-base and retrieves a methanation process whose input and output are as follows:

Input:

Output:

The process is selected and modified as follows:

Input:

Output:

Since nitrogen and hydrogen make up the required goal state, the CBR-ProcSyn integrates all the processes selected so far with the MCP to generate the final ammonia synthesis process as in Figure 23.

[Figure 23: flowsheet diagram. Labels recovered from the original: Methanol, Coal, TCGP, Girbatol, Rectisol, Methanation, Carbon-dioxide, Hydrogen-sulfide, Hydrogen + Nitrogen + Methane, CW, REF, Convertor, Ammonia Product, Purge, Hydrogen, Sulfur-dioxide, Membrane Banks, Absorber, Dryer, Water, Methane, Nitrogen.]

Figure 23: Ammonia process retrofitted with coal gasification and hydrogen recovery

CHAPTER VII

AN INTEGRATED FRAMEWORK FOR INTELLIGENT COMPUTER AIDED SYNTHESIS OF CHEMICAL AND BIOCHEMICAL PROCESSES

7.1. INTRODUCTION

In the preceding chapters several methods for process synthesis have been presented in a stand-alone style. Case-based reasoning is treated independently of task analysis, though they both use constraints at various levels from design specification to final flowsheet synthesis. Such a treatment has been deliberately adopted to explicate the intrinsic merits and demerits of each method. For instance, case-based reasoning is a relatively weak method because of its extreme reliance on cases, without access to important heuristic knowledge, unlike the task approach. On the other hand, the task approach uses a ‘static’ collection of rules, devoid of learning capabilities such as inductive or case-based learning; it thus circumscribes itself with a limited set of generative strategies for synthesis problem solving.

Between the two extremes of the empirical CBR approach and the heuristic task approach, it is desirable to establish a middle ground where the empirical and heuristic methods are integrated for the highest problem-solving efficiency. The basic premise behind the integration is to promote acquisition of the available domain-specific synthesis knowledge, whether cases or constraints, in an integrated knowledge-based environment for process synthesis. The objective is to create a flexible framework, with diverse knowledge representations and problem-solving methods such as described in the earlier chapters of this dissertation, to simplify the knowledge acquisition process.

Quantitative knowledge, such as mathematical equations and models, comprises a significant portion of the synthesis knowledge too. The integrated framework should have place for quantitative techniques along with qualitative methods. In this dissertation, however, the integration of the qualitative and quantitative techniques is not implemented because of several reasons:

(i) there is a greater need to represent process synthesis knowledge in a form comprehensible to designers, who are already familiar with quantitative techniques in their area,

(ii) it should be easy to integrate quantitative techniques into the knowledge-based/expert system software, using “foreign-function” interfaces and “pipes” already built into programming languages [e.g., Common Lisp (Steele, 1986), CLOS (Keene, 1989)] and operating systems (e.g., Unix) respectively, and

(iii) it is more advantageous to provide “hooks” in the knowledge-based environment than to offer a smorgasbord of quantitative techniques that constantly evolve in parallel to AI methods; so this level of integration can happen as-and-when-required, for instance, when a designer decides to generate quantitative data by simulation to verify a result from qualitative reasoning.

7.2. THE INTEGRATED PROCESS SYNTHESIS FRAMEWORK

This chapter describes an integrated framework for a knowledge-based system for process synthesis problem solving. It outlines the process synthesis task-decomposition and the necessary methods to solve the process synthesis tasks. The task-decomposition is an improvement over the previous two proposals by Gandikota & Davis (1989) and Narayanan, et al. (1990) by incorporating the following methods:

• Generic tasks,

• Constraint-based reasoning,

• Heuristic reasoning,

• Qualitative reasoning, and

• Case-based reasoning.

These methods are described in this chapter within the scope of the synthesis framework, and their specific roles in the synthesis problem-solving are made clear.

Another focus while describing these methods is formalization of domain knowledge. Most of the design knowledge in chemical engineering can be formalized as constraints, cases or plans suitable for a knowledge-based reasoning. And the task analysis will give us sufficient power to describe the input-output model; but it is still a skeleton. By filling the task skeleton with a rich variety of methods available from the AI research, we can create a flexible knowledge-based system.

Described at the end of this chapter is a control structure to handle the various synthesis tasks. The control structure turns out to be very complex and so can be described as a non-linear planning process. It is interesting from concurrent engineering points of view such as: (i) how to decompose a complex design problem into manageable sub-problems, (ii) how to distribute the design problem solving among a community of designers (belonging to an organization or group), and (iii) how to combine the partial solutions of subproblems. So the framework indirectly supports concurrent engineering.

Finally, for innovative design problems, mere application of plans and constraints, such as in the simpler DSPL (Brown and Chandrasekaran, 1989) model of routine design, cannot provide satisfactory solutions. Thus the framework also supports innovative design problem solving that requires a non-linear subproblem decomposition and solution composition.

7.3. TASK DECOMPOSITION OF PROCESS SYNTHESIS

Process synthesis, fundamentally, is a heuristic search problem where the search space is all possible process interconnections or flowsheets. The heuristic search should find a particular flowsheet called a ‘feasible flowsheet’ for a given product. A feasible flowsheet contains a subset of available unit operations for product synthesis, and satisfies all the design problem constraints. The feasible flowsheet also encompasses a particular structural sequence of unit operations. By altering the sequence or specifying different unit operations in a feasible flowsheet, while still conforming with the design problem constraints, another feasible flowsheet can be generated.

Mathematically, when there are $p$ processes and $u$ unit operations to carry out each process, $p^{u}$ is the number of possible flowsheets corresponding to a particular way the processes are structurally connected. If there are $s$ structural interconnections, then $s\,p^{u}$ is the total number of possible flowsheets, and the number of feasible flowsheets satisfying the design problem constraints is less than or equal to $s\,p^{u}$.

Because of the combinatorial complexity in synthesizing flowsheets, heuristic search is often employed to make the synthesis problem tractable. A task analysis of synthesis or preliminary process design (PPD; note that process synthesis and preliminary process design are used equivalently) decomposes the PPD into several subtasks shown in Figure 24; therefore, the task analysis formalizes the complex heuristic search problem into tractable subproblems.

The rest of the chapter is organized as follows. First, the various process synthesis tasks are presented in detail under the heading of task analysis. Next, the required methods for executing the various tasks are described. The chapter concludes with the control structure of a knowledge-based system for PPD and a brief note on implementation.

[Figure: Process Synthesis decomposes into Skeletal Design (Identification, Selection, Synthesis) and Design Refinement (Simulated Discovery (SDR), Functional & Structural Refinement (FSR), Flowsheet Evaluation (FSE)).]

Figure 24. Process synthesis task analysis

7.4. ANALYSIS OF PROCESS SYNTHESIS TASKS

At the highest level the PPD is decomposed into the following two tasks.

7.4.1. Skeletal Design

The skeletal design task can be stated as:

To make a desired product from a raw material, using the design problem constraints wherever applicable:

- identify the required processes,

- synthesize a flowsheet by connecting the processes, and

- select the necessary unit operations to carry out the processes in the flowsheet.

Each of the above objectives is a subproblem requiring heuristic search. The necessary tasks for solving these subproblems are:

- Identification,

- Synthesis, and

- Selection.

The Identification and Synthesis tasks solve the corresponding constraint satisfaction problems. Figure 4 shows the constraints on protein synthesis in biotechnology.

The selection of unit operations involves two subtasks: hierarchical classification and critiquing (described in Chapter 2). As shown in Figure 6, the hierarchical classification task decomposes the search space for the selection of bioprocesses. The tip nodes in the classification tree represent the various unit operations that can be selected for each bioprocess. Since only one unit operation is required for a process in the flowsheet, the selected unit operations (if there are many) are ranked by the critiquing task using their interaction constraints.

The result of skeletal design problem solving is the generation of a set of flowsheets F for making a desired product and, for each flowsheet in F, the specification of:

(i) the set of processes P,

(ii) the structural connections S among the processes in P, and

(iii) the set of necessary unit operations U to carry out each process in P.

The number of flowsheets in F is given by $\left[\sum_{i=0}^{p} u_i\right] \cdot s$, where $s$ is the number of unique structural connection specifications in S, $p$ is the number of processes in P, and $u_i$ is the number of unit operations available to carry out process $i$ in P.

7.4.2. Design Refinement

The second problem in PPD involves improving the flowsheet generated by the skeletal design task. This is the design refinement task. The subtasks of design refinement are as follows.

7.4.2.1. Simulated Discovery

During simulated discovery, the set of flowsheets F generated by the skeletal design is critiqued and analyzed using constraints. Should a constraint fail, the result is posted as a discovery made for improving the flowsheet.

The critiquing is done by agents called critics. A critic is activated (similar to demons in object-oriented programming) when the preconditions for its activation become true. For example, a warn-pathogenic-bacteria critic is activated when its precondition (pathogenic bacteria in a process stream) becomes true. The action of the various critics activated during critiquing is to identify constraint failures in a preliminary flowsheet proposal.

It is not possible to specify a priori all possible constraint failure situations as compiled knowledge within critics, because one would need to account for every process, unit operation and structural specification. This is a combinatorial problem because there are $s\,p^{u}$ possible flowsheets. Also, there are hidden features in the flowsheet that can only be unravelled by simulation.

Qualitative simulation (QLS) simulates the behaviors of unit operations expressed as qualitative symbolic knowledge. The goal of QLS is to predict unexpected functional states of a flowsheet that cause design problem constraints to fail. For example, to find which streams in a flowsheet are contaminated by pathogenic bacteria, the QLS has to propagate the effects along the upstream process streams to discover the sources of bacteria. The result of QLS is the discovery of any unexpected functional states whose behavioral patterns cannot be explicitly represented in the knowledge base.

The goal of quantitative simulation (QTS) is to quantify the parameters of interest in the flowsheet using quantitative or numerical process models. For example, the concentration of the product in the fermentor output stream can be computed by solving the mass-balance, energy-balance and kinetic equations of the fermentor. After the QTS finds the numerical value of the concentration, the QLS may discover that the product concentration is low or high (if not normal), signalling a constraint failure. Thus the final results of QTS and QLS are the same as those of critiquing.

7.4.2.2. Functional and Structural Refinement (FSR)

After the simulated discovery task, if any constraints have been found to be violated, the next step is to modify the flowsheet appropriately. These modifications are carried out by the functional and structural refinement (FSR) task. In the example of the warn-pathogenic-bacteria critic, a safety constraint was not met by the proposed flowsheet. Safety is also a functional requirement of a feasible flowsheet.

Here, function means a state or a series of states (behavior) that we want the processes to display or avoid. Production of the desired product and of a toxic by-product (because of side reactions in metabolic pathways) by fermentation are examples of functions we desire and avoid, respectively. Further examples are: a centrifuge has a separation function on a liquid-solid feed; a thermal sterilization system has the function of exterminating germs considered pathogenic to the fermentation.

While the aforementioned functions are explicit, there are also functions that are implicit (Chandrasekaran, 1990). For example, the role of a pressure relief valve is explained as a safety function to avoid a rupture of the vessel under high pressure; this safety function might never be stated explicitly as a design specification.

So a functional refinement is required whenever the implicit and explicit functional failures of a flowsheet are detected by the simulated discovery task. This is the task of FSR.

The FSR task modifies the flowsheet by addition or deletion of a function, or by respecification of the flowsheet structure. When contamination by pathogenic bacteria occurs, additional functions, like sterilization or purification, may have to be added to the flowsheet. A structural modification may also be proposed by FSR to meet a functional specification or to avoid constraint failure. For example, if the fermentor product concentration is too low for the upstream purification to handle, then the product may be recycled by adding a recycle stream from the output to the input of the fermentor. (Note that in this section the terms functions and constraints are used interchangeably.) All of the above effects can be achieved, alternatively, by making functional as well as structural modifications in the flowsheet, using two fermentors in series with a total recycle in place of a single fermentor without recycle.

7.4.2.3. Flowsheet Evaluation (FSE)

The task of flowsheet evaluation involves finding the economic potential of the proposed flowsheet, given by the equation (Petrides et al., 1989):

$$\text{Economic potential} = \frac{\text{value of products and by-products} - \text{raw materials cost} - \text{annualized cost of upstream section} - \text{annualized cost of downstream section}}{\text{value of products and by-products}}$$

This requires estimating the quantities of products and by-products produced, the raw materials and utilities consumed, the concentrations of various streams, etc., using QTS if necessary.
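The economic potential equation can be illustrated with a minimal Python sketch. The function name, argument names and monetary figures below are assumptions made for illustration; they do not come from Petrides et al. (1989) or from the framework's implementation.

```python
def economic_potential(product_value, raw_material_cost,
                       upstream_annualized_cost, downstream_annualized_cost):
    """Return the economic potential as a fraction of the product value.

    All four arguments are annual amounts in the same currency.
    """
    if product_value <= 0:
        raise ValueError("product_value must be positive")
    margin = (product_value - raw_material_cost
              - upstream_annualized_cost - downstream_annualized_cost)
    return margin / product_value


# Hypothetical figures, for illustration only.
ep = economic_potential(
    product_value=10_000_000,
    raw_material_cost=3_000_000,
    upstream_annualized_cost=2_000_000,
    downstream_annualized_cost=2_500_000,
)
print(f"economic potential = {ep:.2f}")
```

In practice the four inputs would come from QTS estimates of stream quantities and equipment sizes rather than from fixed numbers.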

7.5. METHODS FOR PROCESS IDENTIFICATION AND SYNTHESIS TASKS

In this section the process identification and synthesis tasks are described in more detail. The required methods to accomplish these tasks are described and illustrated.

7.5.1. Constraint-Based Reasoning

Constraint-based reasoning (CTBR) in process synthesis implies propagation of constraints about functional and structural relationships between unit operations, processes or systems in a flowsheet.

Process synthesis, therefore, is not entirely a numerical constraint-satisfaction problem per se, amenable only to linear, integer and dynamic programming techniques. Most process design involves generating and testing solutions using qualitative criteria such as good vs. bad, safe vs. unsafe, feasible vs. infeasible, and polluting vs. non-polluting. Constraints are satisfied when the trade-offs allow; otherwise, either the constraints are relaxed, the particular design is modified or, if nothing works, the design may even be abandoned.

A functional relationship between a reactor system and a cooling system is specified as a constraint in the following example:

“If the reaction in the reaction system is highly exothermic then a constant product concentration (a productivity constraint) and a constant temperature (a safety constraint) have to be maintained by a cooling system.”

PPD also involves satisfying constraints that originate from the structural specifications, such as:

“The capacity of the upstream purification process is less than or equal to the volumetric productivity of the downstream fermentor.”

There are further constraints from the design problem specification such as:

“The economic potential of the plant should be greater than...”

“The purity of the final product should be above...”

The feasible flowsheet should also satisfy constraints arising from regulatory laws, environmental pollution, by-product treatment, material and energy conservation, waste recycling, etc.

The process identification task in bioprocess synthesis executes a plan version of the constraint-satisfaction problem, where the functional requirements of a bioprocess flowsheet have been enumerated a priori (see Figure 4). The synthesis task also uses the same plan to specify the connections among the processes and propose a preliminary flowsheet. The problem solving here is apparently simple, involving instantiation of the plan shown in Figure 4. That is because most of the computational work was done by the expert who proposed the algorithm after studying several bioprocess flowsheets. The implication of the plan in Figure 4 is that satisfying functional and structural constraints on the processes is a necessary requirement for generating a preliminary process flowsheet.

The constraint satisfaction problem for the selection of unit operations is stated as:

Given a set of unit operations U for a process P, determine a subset of U that best meets a set of selection constraints S as well as a set of relational constraints R on the unit operations in U.

The unit-operation selection constraints are the selection criteria on the unit operations (Petrides et al.) such as:

In airlift fermentors, air (an inert gas) circulates reactor contents through an external tube or internal draft tube. These reactors have no rotating parts, are of simple construction and operation, are low power consumers, and have good mass transfer characteristics.

Enzyme reactors are fixed beds, stirred tanks or fluidized beds; they are operated in batch or continuous mode.

Well mixed reactors are advantageous to avoid high substrate concentration and substrate inhibition.

Fixed bed or plug flow reactors are suitable for product inhibited reactions.

The unit-operation selection constraints on the various unit operations available for bioprocesses are represented inside the classes in the hierarchy shown in Figure 6. The method used to test the selection constraints on the unit operations in the hierarchy is called establish-refine. Constraints are represented as if-then rule patterns in each node of the hierarchy. A confidence value is assigned to a node after testing the rules. If the confidence value of a node is higher than a threshold value then the unit operation is selected (refer to Chapter 2 for details).
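The establish-refine idea can be sketched in Python as follows. This is only an illustrative sketch: the `Node` class, the averaging of rule scores, the numerical confidence values and the two-node reactor hierarchy are assumptions, and the actual rule patterns and confidence scheme are those described in Chapter 2.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Facts = Dict[str, object]


@dataclass
class Node:
    name: str
    # Each rule maps the known facts to a confidence score in [0, 1].
    rules: List[Callable[[Facts], float]] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    def confidence(self, facts: Facts) -> float:
        # Assumption: a node without rules is always established;
        # otherwise the node confidence is the mean of its rule scores.
        if not self.rules:
            return 1.0
        return sum(rule(facts) for rule in self.rules) / len(self.rules)


def establish_refine(node: Node, facts: Facts, threshold: float = 0.5) -> List[str]:
    """Return the tip nodes reached through established nodes only."""
    if node.confidence(facts) <= threshold:
        return []                      # not established: prune the subtree
    if not node.children:
        return [node.name]             # established tip node = unit operation
    selected: List[str] = []
    for child in node.children:        # refine: descend into the subclasses
        selected.extend(establish_refine(child, facts, threshold))
    return selected


# Hypothetical two-leaf hierarchy based on the selection criteria quoted above.
reactors = Node("reactor", children=[
    Node("well-mixed reactor",
         rules=[lambda f: 0.9 if f.get("substrate_inhibition") else 0.3]),
    Node("fixed-bed reactor",
         rules=[lambda f: 0.9 if f.get("product_inhibition") else 0.3]),
])
selected = establish_refine(reactors, {"substrate_inhibition": True})
print(selected)
```

Pruning a node that fails to establish is what keeps the search tractable: none of its subclasses are tested.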

The final result of the establish-refine method can be more than one available unit operation for a process. It would help the process synthesis to rank the selected unit operations. Relational constraints among the unit operations are used for this purpose. Some examples of relational constraints among the unit operations are:

If the size of cells is very small then membrane filtration is preferable to centrifugation for cell harvesting.

If the product degrades under heat then use membrane sterilization instead of thermal sterilization.

For a medium-scale fermentation, mechanical agitation is preferable to airlift fermentation, all other conditions being the same.

The object-consistency networks (OCNs) method transforms the relational constraints into: (i) negation, (ii) elimination, (iii) addition and (iv) reinforcement constraints. The algorithms for processing these four types of relational constraints, called 2-object and 3-object consistency checks, are described in greater detail in Chapter 2.
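As a rough illustration of how relational constraints can rank candidate unit operations, the sketch below scores each candidate by pairwise preferences. It is not the OCN algorithm of Chapter 2; the scoring scheme, the fact names and the `rank` function are assumptions introduced only to make the idea concrete.

```python
# Each entry: (condition on the facts, preferred operation, less preferred operation).
RELATIONAL_CONSTRAINTS = [
    (lambda f: f.get("cell_size") == "very small",
     "membrane filtration", "centrifugation"),
    (lambda f: f.get("product_degrades_under_heat", False),
     "membrane sterilization", "thermal sterilization"),
    (lambda f: f.get("scale") == "medium",
     "mechanical agitation", "airlift fermentation"),
]


def rank(candidates, facts, constraints=RELATIONAL_CONSTRAINTS):
    """Order candidates by preferences won minus preferences lost."""
    score = {c: 0 for c in candidates}
    for condition, preferred, other in constraints:
        # A constraint only applies when both operations are candidates.
        if condition(facts) and preferred in score and other in score:
            score[preferred] += 1
            score[other] -= 1
    # sorted() is stable, so ties keep the original candidate order.
    return sorted(candidates, key=lambda c: -score[c])


ranking = rank(["centrifugation", "membrane filtration"],
               {"cell_size": "very small"})
print(ranking)
```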

[Figure: methods listed for each task. Identification: Classification, Heuristics, Constraint-Based Reasoning (CTBR). Selection: Classification, Heuristics, Constraint-Based Reasoning (CTBR). Synthesis: Constraint-Based Reasoning (CTBR). Simulated Discovery (SDR): Critiquing, Qualitative Simulation (QLS), Quantitative Simulation (QTS). Functional & Structural Refinement (FSR): Heuristics, Case-Based Reasoning (CBR). Flowsheet Evaluation (FSE): Numerical Optimization.]

Figure 25. Methods for process synthesis tasks

7.6. METHODS FOR SIMULATED DISCOVERY TASK

7.6.1. Critiquing

Critiquing is used for the subtask of simulated discovery in the context of design refinement as:

Given a flowsheet proposal and design problem constraints, identify the structural, functional and behavioral patterns in the proposed flowsheet causing constraint failures.

Critiquing is carried out by invoking a community of critics. Each critic has a precondition for activation and is activated when the precondition is true in a given flowsheet proposal.

For instance, the precondition for a warn-pathogenic-bacteria critic is the existence of pathogenic bacteria in a stream composition. To handle the constraint failure, the critic uses local knowledge called an argument. The argument in the warn-pathogenic-bacteria critic may say: “Pathogenic bacteria are not allowed in the product. To remove the pathogenic bacteria apply sterilization operation.”

The FSR task uses the argument to add a sterilization operation to the flowsheet. If the critic does not have an argument, then the FSR will determine the appropriate structural or functional modification.
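A critic with a precondition and an optional argument can be sketched as a small Python data structure. The `Critic` class, the dictionary layout of the flowsheet proposal and the stream names are hypothetical; only the critic name and the wording of its argument follow the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Critic:
    name: str
    precondition: Callable[[dict], bool]   # tested against a flowsheet proposal
    argument: Optional[str] = None          # local knowledge; may be absent


def critique(flowsheet: dict, critics: List[Critic]):
    """Return (critic name, argument) for every critic whose precondition holds."""
    return [(c.name, c.argument) for c in critics if c.precondition(flowsheet)]


def has_pathogens(flowsheet: dict) -> bool:
    # True if any stream composition lists pathogenic bacteria.
    return any("pathogenic bacteria" in composition
               for composition in flowsheet["streams"].values())


warn_pathogenic_bacteria = Critic(
    name="warn-pathogenic-bacteria",
    precondition=has_pathogens,
    argument="Pathogenic bacteria are not allowed in the product. "
             "To remove the pathogenic bacteria apply sterilization operation.",
)

# Hypothetical flowsheet proposal: stream name -> set of components.
proposal = {"streams": {"feed": {"glucose", "pathogenic bacteria"},
                        "broth": {"product", "cells"}}}
failures = critique(proposal, [warn_pathogenic_bacteria])
print(failures)
```

A critic whose `argument` is `None` still reports the failure; in that case the choice of modification is left to the FSR task, as described above.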

7.6.2. Qualitative Simulation

Qualitative Simulation (QLS) is used by the simulated discovery task for refining a flowsheet proposal. The task of QLS is:

Given a flowsheet proposal, identify unexpected states in the behavior of unit operations leading to constraint or functional failures by causal reasoning.

QLS uses symbolic knowledge to express the behavior of components such as unit operations in a flowsheet. For example, consider a flowsheet containing a fermentor, a centrifuge and a distillation column [see Figure 28 (a)]. The output from the fermentor, a liquid-solid stream, is fed to the centrifuge where the liquid component is separated from the solid component. The liquid component from the centrifuge is sent to the distillation column to recover the desired product. The solid output from the centrifuge is a waste to be disposed of.

While designing the flowsheet, QLS predicts the effects of changing the fermentation kinetics and operation modes of equipment by analyzing Whatif scenarios such as: “What if the fermentation metabolic pathways produce toxic by-product?” and “What if the fermentor operates in batch mode and the product has to be produced continuously?”

To answer the first Whatif question the QLS applies the following if-then rules that capture the input-output causality of process equipment:

If the fermentation metabolic pathways produce a toxic by-product then the output of the fermentor is toxic.

If the input of any transfer process (e.g., a centrifuge) contains a toxic chemical then all the outputs of the transfer process are toxic.

The disposal of toxic waste without treatment will result in environmental pollution.

After applying the above rules the QLS determines that when the fermentation metabolic pathways produce toxic by-product, the waste causes environmental pollution. This leads to the failure of an environment safety function, which is an implicit design constraint on the flowsheet specifying that no process should pollute the environment. To handle this constraint failure, the FSR first seeks a method to prevent the production of toxic by-product. If that fails, it adds an appropriate purification process at the fermentor output to remove the toxic by-product.
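The causal reasoning in this example amounts to propagating a qualitative state (toxic) along the stream connections of the flowsheet. The following Python sketch shows one way this could be done, assuming the flowsheet is stored as a simple adjacency map; the unit names, the `treatment_units` parameter and both function names are assumptions, not part of the described system.

```python
from collections import deque

# Hypothetical stream connectivity: unit -> units fed by its outputs.
FLOWSHEET = {
    "fermentor": ["centrifuge"],
    "centrifuge": ["distillation", "waste_disposal"],
    "distillation": [],
    "waste_disposal": [],
}


def propagate_toxicity(flowsheet, source, treatment_units=frozenset()):
    """Return every unit that receives or produces toxic material.

    The transfer-process rule is applied breadth-first: a toxic input makes
    all outputs toxic, unless the unit is a treatment step (an assumption
    added here so that a refined flowsheet can be re-checked).
    """
    toxic, queue = {source}, deque([source])
    while queue:
        unit = queue.popleft()
        if unit in treatment_units:
            continue                    # toxicity is removed at this unit
        for downstream in flowsheet[unit]:
            if downstream not in toxic:
                toxic.add(downstream)
                queue.append(downstream)
    return toxic


def pollution_discoveries(flowsheet, toxic_source, disposal_units,
                          treatment_units=frozenset()):
    """Disposal rule: untreated toxic waste reaching disposal is a failure."""
    toxic = propagate_toxicity(flowsheet, toxic_source, treatment_units)
    return sorted(toxic & set(disposal_units))


discoveries = pollution_discoveries(FLOWSHEET, "fermentor", {"waste_disposal"})
print(discoveries)
```

Because the rules are stated per unit rather than per flowsheet, the same code applies unchanged if the centrifuge is replaced by another transfer process.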

To answer the second Whatif question, “What if the fermentor operates in batch mode and the product has to be produced continuously?”, the QLS applies the following rules:

If the product has to be produced continuously then the downstream purification system should operate continuously.

If the downstream purification system operates continuously and the upstream fermentation system operates in batch then there is a failure in the continuous operation.

As a consequence of these rules, the QLS concludes an operational failure. The FSR task handles the failure by suggesting a buffer tank between the downstream and upstream systems of the flowsheet.

Note that quantitative simulation alone cannot answer the two Whatif questions. Also, in the first example of toxic by-product the causal reasoning can be very complex if there are many pieces of process equipment between the fermentor and the waste. This is unavoidable. If QLS is replaced with compiled knowledge, the result is loss of generality, such as in the following rule: “If the fermentation metabolic pathways produce toxic by-product then the solid waste from the centrifuge contains toxic product and causes environmental pollution.” In a process with a dryer instead of a centrifuge this compiled knowledge becomes invalid; thus, the usefulness of compiled knowledge is very limited.

7.6.3. Quantitative Simulation

Quantitative simulation is necessary in PPD when precise values for the variables in the flowsheet are required. Consider a situation where the initial flowsheet proposal contains a single fermentor. If the fermentation yield is known to be small and the product flow-rate should be high, then, at the time of implementing this flowsheet one may design an unrealistically large capacity fermentor. To confirm that the size of the fermentor is not unrealistically large, QTS uses the following equation for estimating fermentor volume:

$$\text{Fermentor volume} = \frac{\text{product flow rate}}{\text{fermentor volumetric productivity}}$$

where the fermentor volumetric productivity, in kg of product / [(m^3 of fermentor volume) * min], is calculated from the kinetic equations or pilot plant data, and the product flow rate (kg of product / min) is a design problem specification.
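The volume estimate is a single division, shown here as a small Python sketch. The numerical values and the `MAX_REALISTIC_VOLUME_M3` limit are hypothetical and serve only to show how a QTS result could be turned into a qualitative "too large" signal.

```python
def fermentor_volume(product_flow_rate, volumetric_productivity):
    """Fermentor volume in m^3.

    product_flow_rate:        kg of product / min
    volumetric_productivity:  kg of product / (m^3 of fermentor volume * min)
    """
    if volumetric_productivity <= 0:
        raise ValueError("volumetric productivity must be positive")
    return product_flow_rate / volumetric_productivity


MAX_REALISTIC_VOLUME_M3 = 500.0   # hypothetical limit, not from the source

volume = fermentor_volume(product_flow_rate=2.0, volumetric_productivity=0.002)
too_large = volume > MAX_REALISTIC_VOLUME_M3
print(f"volume = {volume:.0f} m^3, too large: {too_large}")
```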

After applying the above equation, if the fermentor volume is determined to be very large, then FSR may suggest any of the following strategies:

(a) recycling (e.g., recycle the fermentor product)

(b) multiple operations (e.g., two smaller fermentors instead of one large one)

(c) multiple operations with recycling (e.g., use two fermentors with recycle)

The final decision is made by applying QLS on each of these alternatives and evaluating the trade-offs between their capital and operating costs. At the same time, the alternatives are tested for the satisfaction of design problem constraints.

Thus a designer cannot rely on QLS alone. Conversely, the data generated from QTS are only useful when they can be interpreted qualitatively. Thus QLS uses QTS in PPD to undertake more rigorous testing of design problem constraints.

In summary, at the end of the simulated discovery task the flowsheet proposed by the skeletal design task has been verified against more constraints, and also against the functional requirements of the design problem. Using the method of critiquing, obvious drawbacks in the flowsheet are detected. Unexpected behavioral states in the flowsheet are discovered using QTS and QLS. The discoveries made are then posted as function or constraint failures in the PPD done so far.

7.7. METHODS FOR FSR TASK

The functional and structural refinement (FSR) task follows the critiquing and qualitative reasoning of the simulated discovery task. The objective of FSR is to handle the constraint failures detected by the simulated discovery task by making appropriate changes in the flowsheet proposal. The changes usually entail modifications in the unit operations (functions) of the flowsheet or in their connections (structure), using the following methods.

7.7.1. Heuristic Reasoning

Heuristics are used throughout the PPD for the skeletal design tasks. Reasoning with heuristics is potentially fallible because it is not known what their exceptions are. Further, in a rule-based system, heuristics stated as ‘meta rules’ can complicate the inference engine procedures. By stating the meta-rules as a set of constraints, however, the constraints can be applied directly by the PPD. So it is better to formalize heuristics as constraints, functions, and qualitative reasoning rules for each task. While selecting processes, heuristics available in the domain (Petrides et al., 1989) can be transformed, for example, into constraints as follows.

Heuristic: Select processes that make use of the greatest differences in the properties of the product and its contaminant

Equivalent negation constraint: if the densities of yeast cells and intracellular product are nearly same but their sizes are significantly different then membrane separation is selected and centrifugation is rejected.

Relational Constraints between batch and continuous fermentors:

Heuristic: if larger productivity and more efficient substrate utilization than batch fermentor are desired then a continuous fermentor is selected

Equivalent reinforcement constraint: if batch fermentor is rejected because of smaller productivity and less efficient substrate utilization then continuous fermentor is selected

Heuristic: if lower product concentration than batch fermentors resulting in higher recovery cost are tolerable then continuous fermentation is selected

Equivalent reinforcement constraint: if batch fermentation is rejected in favor of lower product concentration and lower recovery cost then continuous fermentation is selected

Heuristic: if multiple products are to be produced in the plant then a batch fermentor is preferable over a continuous fermentor

Equivalent negation constraint: if a batch fermentor is selected because multiple products are produced in the plant then continuous fermentor is rejected

Similarly for the synthesis task the following heuristics may be used as constraints:

Heuristic: Remove the easiest to remove first

Constraint: if the fermentor product contains yeast cells, intracellular product and extracellular liquid, then remove the yeast cells first

Heuristic: Remove the most plentiful component first

Constraint: if the liquid-solid stream contains intracellular product then separate the extracellular liquid first

Heuristic: Make the most difficult and most expensive separations last

Constraint: if the output is from a protein refolding tank then first select ultrafiltration to remove unfolded protein and then select chromatography to separate pure protein product.

7.7.2. Case-Based Reasoning

Very often, in process synthesis the specifications on unit operations and design are available from other plants making similar product or laboratory pilot-plant studies. So a new plant can be designed based on an existing plant or a pilot process. For example, the process flowsheet for making an intracellular rDNA product can be used as a starting point for making other intracellular protein products.

Portions of a flowsheet or functional subsystems, if not an entire flowsheet, may be imported from working designs in another domain, causing cross-fertilization of design ideas. For example, an ethanol fermentation process may contain a novel downstream process (like a new type of fermentor, sterilizer, etc.), whereas the upstream process may use a conventional azeotropic distillation scheme to produce 99% pure ethanol from a dilute ethanol-water mixture.

When data for some design variables are not available, they can be obtained from other designs. So, if data on protein expression level are not available while estimating the fermentor productivity during a PPD, a value may be selected by analogy to similar products in other designs.

The underlying principle in these examples is reasoning by analogy. Design cases arising from the problem-solving experience of a designer, or from concurrent engineering such as in a design organization, provide useful analogies for future designs.

Although a successful design case is often emulated, an old design case is useful for the current problem, only if there is sufficient degree of match between the two. So to synthesize a penicillin process, the case of toluene from crude oil refinement is irrelevant; whereas, the case of protein fermentation is relevant.

The algorithm (see Figure 4) used in the process identification subtask of skeletal design is also a synthesis plan abstracted and generalized from several bioprocess flowsheets through reasoning by analogy by an expert.

As in the above examples, if specific design episodes based on past design experience are used for solving a new design problem, the reasoning is called case-based reasoning (CBR).

As shown in Figure 26 and discussed in Chapter 6, at the heart of a CBR for process synthesis is the knowledge base of past successful process flowsheets. The successful design cases in the CBR knowledge base are indexed by features as a functional hierarchy, shown in Figure 22. The basic control structure of a CBR, shown in Figure 20, consists of case retrieval, selection, adaptation, critiquing and learning from problem solving.
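The retrieval step, and the requirement of a sufficient degree of match between an old case and the current problem, can be illustrated with a short Python sketch. The Jaccard overlap measure, the flat feature sets (rather than the functional hierarchy of Figure 22), the `min_match` threshold and the case names are all assumptions; adaptation, critiquing and learning are omitted.

```python
def similarity(case_features, problem_features):
    """Jaccard overlap of two feature sets (an assumed measure of match)."""
    union = case_features | problem_features
    return len(case_features & problem_features) / len(union) if union else 0.0


def retrieve(case_base, problem_features, min_match=0.5):
    """Return the cases whose degree of match is sufficient, best match first."""
    scored = [(similarity(case["features"], problem_features), case)
              for case in case_base]
    matches = [(score, case) for score, case in scored if score >= min_match]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [case for _, case in matches]


# Hypothetical case base; names and feature labels are illustrative only.
CASE_BASE = [
    {"name": "intracellular rDNA product",
     "features": {"fermentation", "intracellular product", "protein", "E. coli"}},
    {"name": "toluene from crude oil",
     "features": {"distillation", "petroleum", "aromatics"}},
]

problem = {"fermentation", "intracellular product", "protein", "yeast"}
retrieved = retrieve(CASE_BASE, problem)
print([case["name"] for case in retrieved])
```

The toluene case shares no features with the protein problem, so it falls below the threshold and is never offered for adaptation.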

[Figure: flow diagram with the blocks Synthesis Problem Inputs (raw materials, products, constraints), Retrieve Cases, Select Process, Critiquing, Knowledge Base of Process Cases, Redesigning, Failure Handling, Modify Process, Store modified process, Synthesize Flowsheet, and Flowsheet for Simulation & Optimization.]

Figure 26. Control structure of CBR-ProcSyn

In the integrated framework CBR is used in two ways:

First, CBR is used as a method for accomplishing a task. For example, for selecting a unit operation for separation the skeletal design task may search a knowledge base of flowsheets.

Second, CBR can also generate the entire process flowsheet without doing the task analysis.

The success of CBR, however, is limited by the number of flowsheets in the case knowledge-base. Besides, if the required design is non-routine or innovative (for a definition of routine and innovative designs see Brown and Chandrasekaran, 1989), CBR may not have sufficient power. So task analysis is still required for many innovative design problems.

7.8. CONTROL STRUCTURE FOR PPD

The execution of the various tasks and subtasks in the task decomposition can be viewed as a planning process such as in PIP (Douglas, 1985; Kirkwood, 1987) and BIOSEP (Siletti, 1990). The control structure of the planning process is non-linear because of the complex order in which the various subtasks are scheduled for execution during a design case, and the recursive way in which the design failure situations are handled. The non-linear control structure is described in the rest of this section (see Figure 27).

The design problem solving (PPD) begins with the skeletal design task. The process identification task is executed first, followed by the synthesis and selection tasks. At the end of skeletal design a design impasse may result if there are no feasible flowsheets. To correct the design impasse, the PPD backtracks to the beginning of the skeletal design task and asks the user for the respecification of the design problem. The respecification is a task in itself, which will be described at the end of this section. After the problem has been respecified the PPD starts all over again from the beginning. If at the end of skeletal design there is no impasse, the design refinement task is invoked.

The design refinement begins with simulated discovery and, if any constraint failures are discovered, then proceeds to FSR. After FSR, the PPD backtracks to the simulated discovery task again. If there are no constraint failures, the PPD proceeds to the final evaluation stage.

However, if some constraints are still not satisfied, causing a non-terminating simulated-discovery-FSR loop, a design impasse is reached. To correct this impasse the partial design is handed over along with the results of simulated discovery to the failure handler. The failure handler in turn tries respecification and invokes the skeletal design task.

If, after skeletal design failure handling, the simulated-discovery-FSR loop has successfully terminated, then the design failure causing the original impasse has been rectified.

If, during a failure handling situation, another failure is encountered then a new impasse is declared. This impasse-failure-handling loop is executed recursively until all the impasses have been successfully resolved. If an impasse cannot be resolved, the PPD declares a failure situation leading to respecification. Otherwise, after executing FSR the PPD proceeds to the evaluation task. If the evaluation task causes constraint failure, then the failure handler is invoked.

The task of a failure handler is: given a partial design and the constraint failure information, transfer the PPD to an appropriate task after declaring a failure handling state. If the partial design can be modified successfully so that all the constraints are satisfied, the failure handling state is replaced with the normal state of PPD. However, there are times when the failure handling is not successful. If the number of impasses reached during a single failure handling state exceeds a maximum allowable number, the respecification task is invoked. If a new failure handling state is created while in a failure handling state then the control of PPD is transferred to the respecification task. There are exceptions, for example the simulated-discovery-FSR loop. If after several iterations there is no change of state, then a failure has taken place. The way to fix the failure is:

1. reject the flowsheet entirely and take up an alternative flowsheet

2. if there is no other alternative flowsheet (or if all but one of the flowsheets have been rejected), then begin problem solving with skeletal design.

After several such global iterations, if there is no change in state, then the control goes to the respecification task.
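The control flow of the failure handler described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the dissertation's implementation: the function name `handle_failure`, the task names returned as strings, the `satisfies_constraints` callback, and the default limit of three impasses are all assumptions introduced here (the text only speaks of "a maximum allowable number").

```python
from typing import Callable, List


def handle_failure(flowsheet: str,
                   alternatives: List[str],
                   satisfies_constraints: Callable[[str], bool],
                   max_impasses: int = 3) -> str:
    """Return the (hypothetical) name of the task that receives control."""
    remaining = list(alternatives)       # copy, so the caller's list is untouched
    impasses = 0
    while True:
        if satisfies_constraints(flowsheet):
            return "evaluation"          # normal state restored
        impasses += 1                    # a new impasse is declared
        if impasses > max_impasses:
            return "respecification"     # too many impasses in one failure state
        if not remaining:
            return "skeletal_design"     # fix 2: no alternative flowsheet left
        flowsheet = remaining.pop(0)     # fix 1: reject, take up an alternative
```

For example, `handle_failure("A", ["B", "C"], lambda f: f == "C")` walks through two impasses and then hands control to the evaluation task, while a call with no alternatives and an unsatisfiable constraint check falls back to skeletal design.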

7.8.1. Respecification

The role of the respecification task is to respecify the tasks to handle a particular failure handling state. If knowledge is available in the knowledge-base to synthesize a new control structure (e.g., switch from task-based reasoning to case-based reasoning), then it will be applied; otherwise, the user will be informed about the failure and asked for respecification of the design problem.

7.8.2. The Simulated-Discovery-FSR Loop

To illustrate the simulated-discovery-FSR loop we once again draw attention to the examples used while describing the simulated discovery task [Figure 28(a)]. The results of QLS in response to the following Whatif situations are:

“What if the fermentation metabolic pathways produce toxic by-product?”

QLS Discovery: The waste from the process is toxic and violates safety constraint on environmental pollution

“What if the fermentor operates in batch mode and the product has to be produced continuously?”

QLS Discovery: The process violates an operational constraint

[Figure 27. Control structure of PPD framework. Labels recoverable from the figure: Process Synthesis; Skeletal Design; Design Refinement; Identification; Selection; Synthesis; Simulated Discovery (SDR); Functional & Structural Refinement (FSR); Flowsheet Evaluation; FSR-SDR Loop; Respecification.]

The task of FSR is to handle the constraint failures as in the above situations. One way is to use dependency relationships stated in a general form as:

If a constraint failure has occurred during a Whatif scenario, then suggest a modification or invoke a task or reapply a method in the light of the new information generated.

Following the above dependency relationship, to handle the toxic waste scenario the FSR invokes the skeletal design task with the goal to select a toxic chemical treatment process and then modify the flowsheet structure. This results in a modified process containing: a fermentor, a toxic chemical separation process, a centrifuge and a distillation column [see Figure 28(b)].

The other alternative for FSR is to reinvoke the selection procedure for the centrifuge with the discovered toxic by-product constraint failure resulting in the proposal of a modified flowsheet containing: the fermentor, a settling tank with toxic chemical treatment in the place of centrifuge, and a distillation column [see Figure 28(c)].
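The two FSR alternatives for the toxic by-product scenario can be mimicked with a flowsheet represented as an ordered list of unit names. The sketch below is a toy illustration under that assumption; the function names, the list representation, and the failure label `"toxic by-product"` are hypothetical and are not taken from the dissertation's system.

```python
from typing import List

# Starting flowsheet, corresponding to Figure 28(a).
FLOWSHEET_A = ["fermentor", "centrifuge", "distillation"]


def add_treatment_step(flowsheet: List[str]) -> List[str]:
    """First alternative: insert a toxic treatment step after the fermentor."""
    i = flowsheet.index("fermentor") + 1
    return flowsheet[:i] + ["toxic treatment"] + flowsheet[i:]


def reselect_separator(flowsheet: List[str]) -> List[str]:
    """Second alternative: a settling tank with treatment replaces the centrifuge."""
    out: List[str] = []
    for unit in flowsheet:
        if unit == "centrifuge":
            out += ["settling tank", "toxic treatment"]
        else:
            out.append(unit)
    return out


def fsr(flowsheet: List[str], failure: str) -> List[List[str]]:
    """Return candidate flowsheets proposed in response to a constraint failure."""
    if failure == "toxic by-product":
        return [add_treatment_step(flowsheet), reselect_separator(flowsheet)]
    return [flowsheet]  # no rule known in this sketch: flowsheet left unchanged
```

Calling `fsr(FLOWSHEET_A, "toxic by-product")` returns two candidates whose unit sequences correspond to panels (b) and (c) of Figure 28.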

[Figure 28. FSR Example. (a) Fermentor, Centrifuge, Downstream Separation Involving Distillation. (b) Fermentor, Toxic Treatment, Centrifuge, Downstream Separation Involving Distillation. (c) Fermentor, Settling Tank, Toxic Treatment, Downstream Separation Involving Distillation.]

CHAPTER VIII

LITERATURE REVIEW

8.1. TAXONOMY OF CHEMICAL PROCESS SYNTHESIS METHODOLOGIES

The first category of approaches uses mathematical optimization procedures for generating optimal process flowsheets. This approach to integrated process flowsheet design (Grossman, 1985) uses mixed-integer non-linear programming (MINLP).

Optimization variables that represent process units are assigned values of either 1 or 0 to indicate existence of a process unit in the process flowsheet. Constraint equations are formulated for all allowable combinations of connections among process units and objective functions are defined for cost, energy, etc. The various problem-solving steps in a MINLP approach are (Grossman, 1990): (1) A superstructure is postulated that has all the combinations of specific flowsheets, (2) The superstructure is then modelled as a MINLP problem and (3) The optimal design is found as a solution of the MINLP problem. As a consequence of starting with a superstructure, which is a precursor to the optimal flowsheets computed by the MINLP approach, it can be observed that the superstructure, itself, is not generated by the MINLP approach. Rather, the superstructure is a postulation. As such the optimal solution obtained is only as good as this superstructure allows (Grossman, 1990). In this context of MINLP solution for process synthesis, the knowledge-based approach described in this dissertation can be viewed as a way of generating the superstructure.
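The role of the 0/1 existence variables can be illustrated with a very small superstructure. The sketch below is not an MINLP solver: it simply enumerates every assignment of the binary variables, discards infeasible ones, and keeps the cheapest. The unit names, the feasibility rule, and the cost figures are made-up assumptions chosen only to make the example concrete.

```python
from itertools import product

# Postulated superstructure: every unit that MAY appear in a flowsheet.
UNITS = ["centrifuge", "settling_tank", "distillation"]

# Hypothetical cost of including each unit (illustrative numbers only).
COST = {"centrifuge": 5.0, "settling_tank": 3.0, "distillation": 8.0}


def feasible(y: dict) -> bool:
    """Assumed constraints: exactly one solid-liquid separator, and distillation."""
    return y["centrifuge"] + y["settling_tank"] == 1 and y["distillation"] == 1


def best_flowsheet():
    """Enumerate all 0/1 assignments and return (cost, assignment) of the cheapest."""
    best = None
    for bits in product((0, 1), repeat=len(UNITS)):
        y = dict(zip(UNITS, bits))       # y[u] == 1 means unit u exists
        if not feasible(y):
            continue
        cost = sum(COST[u] * y[u] for u in UNITS)
        if best is None or cost < best[0]:
            best = (cost, y)
    return best
```

Only flowsheets embedded in `UNITS` can ever be returned, which mirrors the point made above: the result is only as good as the postulated superstructure allows.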


The second category of computer-aided process synthesis uses planning in a hierarchy of problem spaces by the application of heuristics and evolutionary rules. Lu & Motard (1985) developed a rule-based system for the synthesis of total flowsheets based on this approach. The rule-based system implements a design plan represented as a hierarchy. At each level in the plan hierarchy, heuristic and evolutionary rules are defined to solve a design subproblem or achieve partial design goals. As the system refines the hierarchy by progressing down from one level to another, it keeps adding and solving new goals that make incremental changes to the partial flowsheet developed at the previous level. The resulting flowsheet at the end of the hierarchical refinement is the final solution. Another example is PIP, a knowledge-based system shell (Douglas, 1985; Kirkwood, 1987). In PIP the synthesis plan includes a hierarchy of five decision making levels: (1) Batch versus continuous process, (2) Input-output structure of the flowsheet, (3) Recycle structure of the flowsheet and reactor considerations, (4) Separation system specification, including vapor and liquid recovery systems and (5) Heat exchanger network synthesis. PIP uses a variety of heuristics and evolutionary rules to solve the synthesis problem at each level in the plan hierarchy.

The process synthesis framework described in this dissertation elaborates the subproblems, tasks and methods applicable for the first and second levels of PIP’s planning.

The third category of methods involves non-linear planning approaches where the sequence of process synthesis steps to be executed or design plans are created at run-time (Siletti, 1990). The approach basically uses a hierarchy of design steps to create a design plan. These plans, typically, are comprised of a non-linear combination of primitive design steps.

The bioprocess synthesis problem solving described in Chapters 1 and 2 can be considered as an extension of Siletti’s non-linear planning approach. Our approach, however, differs significantly in terms of task analysis and the learning functionality.

Finally, the fourth category of methodologies includes the task-oriented approaches (Gandikota & Davis, 1989; Narayanan, Gandikota & Maroldt, 1990), described in this dissertation.

The taxonomy serves the purpose of identifying the broad computational bases in process synthesis problem solving. AI methods, in particular planning, task analysis, case-based reasoning, and constraints, have a specific purpose in synthesis problem solving: they are primarily generative. That is, given an open-ended synthesis problem, these methods provide a systematic way of initiating the search process. However, they are not guaranteed to find optimum solutions. That is where the mathematical methods are necessary. The reason, if not simple, is understandable: the optimum is a measure of quantity but not quality. The expectations that a chemical process will deliver products at certain volumes, energy utilization, raw material consumption, etc. are succinctly represented as cost and profit figures. Without a reliable estimate of these numbers, it is unlikely that a process will be implemented in practice.

The appropriate literature supporting this dissertation is largely found in AI.

8.2. CASE-BASED REASONING (CBR)

CBR is applied for scheduling, diagnosis, design, and planning. A survey of these CBR systems is as follows:

8.2.1. Scheduling

The scheduling problem has to find a feasible allocation of a resource subject to constraints. Scheduling systems developed using the CBR method are:

8.2.1.1. SUPERMOM

SUPERMOM, by Kolodner and Robinson (1990), schedules household tasks.

8.2.1.2. TRUCKER

TRUCKER, by Hammond, Converse and Marks (1988), operates in the domain of UPS-like delivery. The problem solving task involves finding a schedule for truck deliveries, given a set of delivery requests. Constraints on distance travelled, time taken, etc. are placed to find the best ‘delivery route.’

A case-based reasoning system for resource allocation and scheduling problems, in general, is developed by Koton (1989).

8.2.2. Diagnosis

Given a set of symptoms and a description of a device, the diagnosis problem solving has to find an explanation of the symptoms. Diagnosis systems using CBR are:

8.2.2.1. CASEY

CASEY, by Koton (1988), diagnoses heart problems.

8.2.2.2. PROTOS

PROTOS, by Bareiss et al. (1988), diagnoses hearing disorders.

8.2.3. Design

Given a set of goals and constraints, the design problem solving has to create an artifact that achieves them as well as possible. Design systems using CBR are:

8.2.3.1. CYCLOPS

CYCLOPS, by Navinchandra (1988), in the domain of landscape design, generates alternative layouts for new neighborhoods.

8.2.3.2. KRITIK

KRITIK, by Goel and Chandrasekaran (1989), integrates model-based reasoning and CBR for generating design plans, redesign and failure handling.

8.2.4. CLAVIER

CLAVIER, by Barletta and Hennessy (1989), develops spatial layout of parts for curing in an autoclave. Given a list of parts and their priorities, CLAVIER generates a schedule of loads and layouts for each load.

Also, Daube and Hayes-Roth (1989) have developed a CBR system for redesigning in the domain of mechanical engineering.

8.2.5. Planning

Given a set of goals and constraints, planning involves finding a set of steps to achieve the goals. Some planning systems using CBR are:

8.2.5.1. Battle Planner

Battle Planner, by Goodman (1989), generates battle plans with outcomes and analyses, given intelligence data and operational information.

8.2.5.2. CHEF

CHEF, by Hammond (1986), creates plans about recipes, given desired ingredients and tastes.

Also, Zhang and Waltz (1989) have developed a CBR system for predicting a protein’s spatial structure given a list of components in the protein. Kopeikina, et al. (1988) have applied CBR for continuous control. Mark and Barletta (1987) have used CBR for manufacturing. And Berger (1989) has developed a CBR system for planning radiation therapy.

8.3. LEARNING

8.3.1. LEAP

LEAP, by Mitchell, et al. (1985), is a learning apprentice system, an interactive knowledge-based consultant that directly assimilates new problem-solving knowledge by observing and analyzing the problem solving steps contributed by its users through their normal use of the system, in the domain of VLSI design.

8.3.2. ID3

ID3, by Quinlan (1982), learns decision trees.

8.3.3. PRIDE

PRIDE, by Tanquary and Lu (1991), is an interactive design tool for the capture of design rationale such as: formal representations of strategic level design plans; textual explanation for design constraints and relationships among design attributes; histories for design attribute values. The domain of application is clutch design in mechanical engineering. Overall, PRIDE is an interactive design tool for creating a concurrent engineering design environment with many different life-cycle concerns.

8.3.4. Meta-AIMS

Meta-AIMS, by Tcheng and Lu (1991), integrates machine learning algorithms such as: simple recursive splitting, linear recursive splitting, and back-propagation, with optimization strategies, for effective model formulation and utilization for decision making in mechanical engineering design. The learning system uses an incomplete domain theory and examples to induce knowledge missing in the theory. Using subsets of examples, the missing domain knowledge is generated by induction.

8.3.5. STRUCT

STRUCT, by Watanabe and Yerramareddy (1991), is a structural learning based approach for 3-D manufacturing feature recognition. Computer-aided design (CAD) systems describe engineering parts in terms of surfaces, edges, and vertices. However, computer-aided manufacturing (CAM) systems typically represent parts in terms of higher-level features such as slots, holes, pockets, and chamfers. The structural decision tree algorithm, STRUCT, helps to bridge the gap between the CAD and CAM systems, by inducing manufacturing features from CAD solid models.

8.3.6. KEDS

KEDS, by Rao, et al. (1991), is a knowledge-based equation discovery system, which uses a model-driven approach to discover equations and thereby build models for engineering design. KEDS is based on a methodology called inverse engineering that works with incomplete models to incrementally refine them.

8.3.7. BRIDGER & BOSS

BRIDGER and BOSS, by Reich (1991), apply machine learning techniques that partially automate design knowledge acquisition. BRIDGER assists in the preliminary design of cable-stayed bridges by decomposing the design into several tasks such as: synthesis, analysis, redesign and evaluation. BOSS manages a group of program specialists operating within a single domain, but differing in their knowledge or background experiences. Each specialist in the group knows best how to solve problems within its own expertise. BOSS learns to allocate problems to its specialists based on their individual experiences.

8.4. GENETIC ALGORITHMS

Applications of Genetic Algorithms in engineering optimization and operations research include systems for:

• classroom scheduling via simulated annealing, developed by Davis and Ritter (1987)

• aircraft landing strut weight optimization, by Minga (1986)

• on-off, steady-state optimization of oil pump pipeline system, by Goldberg and Kuo (1987)

• recursive adaptive filter design, by Etter, Hicks, and Cho (1982)

• VLSI circuit layout, by Fourman (1985)

• design of communication networks, by Davis and Coombs (1987)

• explicit pattern class recognition using partial matching, by Stadnyk (1987)

CHAPTER IX

CONCLUSIONS AND RECOMMENDATIONS

9.1. CONCLUSIONS

The research described in the dissertation makes the following novel contributions:

• Object-Constraint Network (OCN) approach for handling selection and synthesis subproblems of process synthesis

• An inductive learning algorithm for generating process synthesis constraints and heuristics from a flowsheet input

• An algorithm to generate hierarchical classification trees from classes and evidence sets

The dissertation also describes research done in applying the following AI methods to process synthesis problem solving:

• Task analysis to decompose the synthesis problem and characterize its search space

• Constraint-based reasoning for handling qualitative synthesis constraints

• Case-based reasoning for reasoning with previously designed processes and flowsheets

• Genetic algorithms for multicomponent separation sequence synthesis using non-distillation separation processes

The research presented in this dissertation pertains to the preliminary synthesis of processes.

The synthesis problem solving described in this dissertation has one ultimate goal: to generate feasible process alternatives for a given raw material and product specification that are not necessarily optimal.

In computational complexity terms, process synthesis problem solving is NP-hard or combinatorial. As shown in Gandikota (1991), for a bioprocess synthesis problem involving a process steps and b unit operations, the complexity is a**b. That is because, at any given stage in the creation of a flowsheet, the human designer is faced with the problem of deciding which of the a process steps, and which of the b unit operations for each of these process steps, are applicable.

Using the knowledge-based approaches described in this dissertation the combinatorial problem solving is made tractable. For instance, the CBR-ProcSyn algorithm for complete process synthesis presented in Chapter 6 and the genetic algorithm for multicomponent separation synthesis (ref Appendix A) can handle open-ended synthesis problems, and are tractable over the cases studied.

The analysis and results from this research help to transform the art of process synthesis into knowledge-based computation. In this regard, the methods are important contributions to chemical engineering science because several aspects of process synthesis such as the complete process synthesis and multicomponent separation synthesis are still not well understood, requiring novel ideas for fully solving these problems.

The case-based reasoning method, in particular, is suitable for retrofitting, i.e. modification of processes in existing plants using the latest technology. Case-based reasoning complements the numerical optimization methods that can optimize processes when energy costs are high (by energy integration), and raw materials are scarce (by recycling), but cannot make process modifications.

The potential of a computer system to learn the process synthesis knowledge as constraints, and generate heuristics automatically is demonstrated for the first time in this dissertation.

There has been no proposal in the process synthesis literature to use machine learning methods for any aspect of process synthesis. The inductive learning algorithm described in Chapter 3 helps the knowledge engineer overcome the so-called “knowledge transfer bottleneck” (the difficulty in transferring knowledge from the expert to the knowledge base) by:

1. automatically transferring process synthesis knowledge from the flowsheets reported in the literature to the system without the help of an expert, and

2. having the potential to discover design knowledge as constraints and heuristics.

Similarly, the HCl algorithm presented in Chapter 5 for the creation of classification trees helps the knowledge engineer to generate hierarchies, in spite of incomplete knowledge for classification and non-availability of deep models.

To integrate the various methods, the dissertation describes a novel framework for process synthesis problem solving. The framework is based on a systematic task analysis of process synthesis and a smorgasbord of methods indexed by specific tasks. The framework incorporates simulation to test flowsheets; has a control structure like a non-linear plan; and emulates the concurrent engineering design practices.

While the task framework for process synthesis problem solving was a hard pursuit, its implementation for a computer-aided design tool is even harder to accomplish in the limited time of one Ph.D. To this end, our vision of knowledge-based process synthesis problem solving exceeds our grasp. So, the dissertation delineates the functional details of the task structures and the methods within the framework using practical illustrations, where a prototype implementation is not made, to help the development of knowledge-based design tools in the future.

The knowledge-based methods described here are not without limitations. The limitations of our knowledge-based approach can be best summarized by paraphrasing Professor C. Judson King (King, 1974, page 9):

“Because of the inherent complexity, the typical open endedness, the defiance of totally quantitative description and the frequent emphasis upon novelty in chemical process design situations, it is important to seek methods which will add structure and logical direction to process design and engineering but at the same time will fall short of yielding a programmed synthesizer.”

An easily recognizable limitation of these methods is that they cannot propose a novel unit operation. The methods rely on existing unit operations, and can apply in a novel way a particular unit operation for a given substance. However, they cannot create a unit operation that is not known to the designers. Thus the process synthesis described here is not really an open-ended problem, which designers can handle as a way of innovation or discovery. So a knowledge-based system today cannot really substitute for a designer; the system is at best an apprentice under the designer. The system, however, can only be as good as the human designer in the sense of having the ability to systematically generate process alternatives by knowledge-based search.

These methods can only change the morphology of a sequence of processes connected as a network, but not the structure and function of individual processes. For instance, the methods cannot propose new membranes in membrane separators to improve their separation efficiency, or changes to distillation columns to improve their energy efficiency. This can be explained in another way. The methods do not have access to the deep models about chemical and physical phenomena underlying process behaviors. Without thorough understanding of phenomena like diffusion, absorption, adsorption and mixing it is not possible to propose improvements to separation processes based on those phenomena.

9.2. RECOMMENDATIONS

The remaining work for this dissertation includes further exploration of the knowledge-based methods proposed, as follows:

• In the overall knowledge-based framework for process synthesis the case-based approach will be an alternative to the task-based approach for selection and synthesis. Further research is necessary on how to integrate the two so that if one fails to synthesize a feasible flowsheet the other will guarantee success.

• Learning has currently been proposed in two ways: by learning process synthesis knowledge as constraints using flowsheets, and by learning process cases after flowsheet synthesis in case-based reasoning. Learning of constraints, however, is more involved than learning of cases. The case-based reasoner can directly apply the modification procedures on the flowsheets, whereas the constraints are generated by inductive methods using the previously learned constraints in the knowledge base. In spite of differences, these two are more or less related. So further research is necessary to bridge the inductive and case-based methods. Also, the constraint learning algorithm can be further improved by using an ‘interestingness measure’ of the learned result.

• The concept of design history needs further research. A design history is a trace of the knowledge-based system’s problem solving. It is a detailed record of data, constraints, cases, etc. that are applied in generating a process. The design history is also meant for explaining the results of the knowledge-based problem solving and failure handling. During a failure, the design history (viewed as a stack) can be accessed to backtrack to a prior decision that is suspected to be the cause of failure.

• Qualitative simulation is proposed for validating process flowsheets. Simulation serves as a useful diagnostic tool to detect, early during the conceptual design stage, any possible malfunctions with a process to avoid computationally expensive redesign later. Currently, no convenient tools are available for this purpose. QSIM was considered for this purpose, but rejected because of its complex representation. The simulation tool embedded in G2 (Gensym) seems ideal for this purpose. Similarly, an appropriate quantitative simulation tool that can be easily integrated with the knowledge-based system is required (e.g., ASPEN).

• Optimization routines for sizing various equipment, computing stream flowrates, energy requirements, etc. with and without short cut procedures need to be incorporated in the framework. Routines for estimating the capital and operating costs of processes are also needed. Some may be available in the public domain from ChemShare.

• Using the HCl algorithm a complete knowledge acquisition tool for diagnosis and selection problem solving is now possible.

• Genetic algorithms can be further investigated for large-scale optimization problems in chemical engineering, besides separation systems synthesis, such as: heat exchanger network synthesis and piping.

• A computer-based process synthesis instruction tool that facilitates easy access and understanding of available processes on workstations, based on the integrated process synthesis framework, will be very useful for extracting the rationale behind process synthesis, and also for generating explanations about process selection and synthesis.

BIBLIOGRAPHY

1. Alcantara, B., A.W.Westerberg & M.D.Rychener, Development of an Expert System for Physical Property Predictions, Computers & Chemical Eng., 9, pp. 127- 142, 1985.

2. Alcantara, B., E.I.Ko, A.W.Westerberg & M.D.Rychener, DECADE-A Hybrid Expert System for Catalyst Selection, Computers & Chemical Eng., 12(9/10), 1988.

3. Bareiss, E. R., B. W. Porter and C. C. Wier, The exemplar-based learning apprentice, AI Laboratory Technical Report#AI87-53, The University of Texas, Austin, 1988.

4. Barletta, R. and Hennessy, D., Case adaptation in autoclave layout design, proceedings of the second workshop on case-based reasoning, Pensacola Beach, FL., 1989.

5. Barnicki, S.D. and Fair, J.R., Separation System Synthesis: A Knowledge-Based Approach. 1. Liquid Mixture Separations, Ind. Eng. Chem. Res., 29(3), 1990.

6. Berger, J., ROENTGEN: a case-based approach to radiation therapy planning, proceedings of the second workshop on case-based reasoning, Pensacola Beach, FL., 1989.

7. Brown, D.C. and B.Chandrasekaran, Design Problem-Solving: Knowledge Structures and Control Strategies, M. Kaufmann Publishers, Los Altos, Calif., 1989.

8. Bylander, T. and S. Mittal, CSRL: A language for classificatory problem solving and uncertainty handling, AI Magazine, August, 1986.

9. Chandrasekaran, B., Generic Tasks in Knowledge-Based Reasoning: High Level Building Blocks for Expert System Design, IEEE Expert, 1, 23, 1986.

10. Chandrasekaran, B., Design problem solving: A task analysis, AI Mag., 11(4), 1990.

11. Davis, L. and F. Ritter, Schedule of optimization with probabilistic search, Proc. of the 3rd IEEE conference on AI applications, pp. 231-236, 1987.

12. Daube, F. and Hayes-Roth, B., A case-based mechanical redesign system, Proceedings of IJCAI-89, Detroit, 1989.

13. Dayal, M., M.S. Thesis, Department of Chemical Engineering, Ohio State University, 1990.

14. Douglas, J., A Hierarchical Decision Procedure for Process Synthesis, pp.353, volume 31, number 3, AIChE Journal, March, 1985.

15. Douglas, J.M., Conceptual Design of Chemical Processes, McGraw-Hill Book Company, 1988.


16. Davis, L. and S. Coombs, Optimizing network link sizes with genetic algorithms, in Modelling and Simulation Methodology: Knowledge Systems Paradigms, Maurice S. Elzas, Tuncer I. Oren and Bernard P. Zeigler editors, North Holland Publishing Co., 1987.

17. Etter, D.M., M.J. Hicks and K.H. Cho, Recursive adaptive filter design using an adaptive genetic algorithm, Proc. of IEEE international conference on acoustics, speech and signal processing, 2, pp. 635-638, 1982.

18. Fourman, M.P., Compaction of symbolic layout using genetic algorithms, Proc. of an international conference on genetic algorithms and their applications, pp. 141-153, 1985.

19. Gandikota, M.S., Expert Systems for Selection Problem Solving Using Classification and Critiquing, M.S. Thesis, Department of Chemical Engineering, Ohio State University, 1988.

20. Gandikota, M.S. and J.F.Davis, An Expert System Framework for the Preliminary Design of Process Flowsheets, Proceedings of Knowledge Based Computer Systems Conference, pp.88-104, Bombay, India, 1989.

21. Gandikota, M.S., J.F.Davis and S.T.Yang, A Task Approach to Knowledge-Based Systems for Process Selection and Synthesis, paper submitted to Industrial and Engineering Chemistry Research, March, 1991.

22. Gandikota, M.S., S.T.Yang, J.F.Davis and J.Marchio, Knowledge-Based System for Bioprocess Selection and Synthesis, paper submitted to ChemTech, April, 1991.

23. Goel, A. and Chandrasekaran, B., Use of Device Models in Adaptation of Design Cases, Proceedings of the Second Workshop on Case-Based Reasoning, Pensacola Beach, FL., 1989.

24. Goel, A., Integration of Case-Based Reasoning and Model-Based Reasoning for Adaptive Design Problem Solving, Ph.D. Thesis, Department of Computer and Information Science, The Ohio State University, 1989.

25. Goldberg, D. E. and C. H. Kuo, Genetic algorithms in pipeline optimization, Journal of computers in civil engineering, 1(2), pp. 128-141, 1987.

26. Gomez, A. and J.D. Seader, Separation sequences synthesis by a predictor based ordered search, AIChE Jr., 22(6), pp. 970, 1976.

27. Goodman, M., CBR in battle planning, proceedings of the second workshop on case-based reasoning, Pensacola Beach, FL., 1989.

28. Grefenstette, J. J., Genetic Algorithms and Simulated Annealing, Davis, L. (Ed.), Morgan Kaufmann Publishers Inc., 1987.

29. Grossman, I.E., Mixed-Integer Programming Approach for the Synthesis of Integrated Process Flowsheets, pp. 463-482, volume 9, number 5, Computers & Chemical Engineering, 1985.

30. Grossman, I.E., MINLP Optimization Strategies and Algorithms for Process Synthesis, Foundations of Computer-Aided Process Design (FOCAPD-90), Snowmass, Colorado, July, 1989 (Elsevier Science Publishers Inc., 655 Avenue of the Americas, New York, NY 10010, U.S.A.).

31. Hammond, K., CHEF: A model of case-based planning, Proceedings of AAAI-86, Philadelphia, PA., 1986.

32. Hammond, Converse and Marks, Technical Report, Department of Computer and Information Science, University of Chicago, 1988.

33. Holland, J.H., Adaptation of natural and artificial systems, Ann Arbor: University of Michigan Press, 1975.

34. Keene, S. E., Object-oriented programming in Common Lisp: a programmer’s guide to CLOS, Addison-Wesley publishing co., 1989.

35. King, C.J., Understanding and conceiving chemical processes, AIChE monograph series, No.8, Vol. 70, 1974.

36. Kirkpatrick, S., C. D. Gelatt and M.P. Vecchi, Optimization by simulated annealing, Science, 220, pp. 671-680, 1983.

37. Kirkwood, R.L., PIP-Process Invention Procedure a Prototype Expert System for Synthesizing Chemical Process Flowsheets, Ph.D. Thesis, Department of Chemical Engineering, University of Massachusetts, May, 1987.

38. Kolodner and Robinson, Technical Report, Department of Computer and Information Science, Georgia Institute of Technology, Atlanta, 1990.

39. Kopeikina, Ludmila, Bandau, Richard, & Lemmon, Alan, Case-based reasoning for continuous control, Proceedings of the workshop on case-based reasoning (DARPA), Morgan-Kaufmann Publishers, Inc., San Mateo, CA, 1988.

40. Koton, P., SMARTplan: A case-based resource allocation and scheduling system, proceedings of the second workshop on case-based reasoning, Pensacola Beach, FL., 1989.

41. Koton, P., Reasoning about evidence in causal explanation, proceedings of AAAI-88, pp. 256-261, 1988.

42. Lu, M.D. and R.L.Motard, Computer-Aided Total Flowsheet Synthesis, pp. 431-445, volume 9, number 5, Computers & Chemical Engineering Journal, 1985.

43. Mark, W. and R. Barletta, Case-based reasoning in manufacturing, Manuscript, Lockheed AI Center, Palo Alto, CA, 1987.

44. Mark, W., Case-based reasoning for autoclave management, proceedings of the second workshop on case-based reasoning, Pensacola Beach, FL., 1989.

45. Meszaros, I. and Z. Fonyo, A new bounding strategy for synthesizing distillation schemes with energy integration, Computers & Chemical Engineering, 10(6), pp.545-550, 1986.

46. Minga, A. K., Genetic algorithms in aerospace design, paper presented at the AIAA southeastern regional student conference, Huntsville, AL, April, 1986.

47. Mitchell, T., Mahadevan, S. and L. Steinberg, Leap: A learning apprentice system for VLSI design, Proc. of IJCAI-85, Morgan Kaufmann, August, 1985.

48. Morari, M. and I.E. Grossman, Design of an ammonia synthesis plant, CACHE process design case studies, Vol. 2, 1985.

49. Narayanan, H., M.S.Gandikota & J.Maroldt, An Integrated Framework for Intelligent Computer Aided Design of Chemical Processes, Proceedings of Industrial & Engineering Applications of AI, Charleston (SC), July, 1990.

50. Navinchandra, D., Case-Based Reasoning in CYCLOPS, A Design Problem Solver, Proceedings of the Case-Based Reasoning Workshop (DARPA), Morgan-Kaufmann Publishers Inc., San Mateo, CA., 1988.

51. Petrides, D., C.L.Cooney, L.B.Evans, R.P.Field & M.Snoswell, Bioprocess Simulation: An Integrated Approach to Process Development, Computers & Chemical Eng., 13(4/5), 1989.

52. Pibouleau, L., A. Said and S. Domenech, Synthesis of optimal and near-optimal distillation sequences by a bounding strategy, The Chem. Eng. Jr., 27, pp.9-19, 1983.

53. Quinlan, J. R., Semi-autonomous acquisition of pattern-based knowledge, in Introductory Readings in Expert Systems, ed. Donald Michie, Gordon & Breach, 1982.

54. Ramesh, T.S., Ph.D. Dissertation, Chemical Engineering, The Ohio State University, 1989.

55. Rao, R. B., S. C-Y. Lu and R. E. Stepp, Knowledge-based equation discovery in engineering domains, Proc. of the eighth international workshop on machine learning, 1991.

56. Redmond, M., Learning from Others’ Experience: Creating Cases from Examples, Proceedings of the Second Workshop on Case-Based Reasoning, Pensacola Beach, FL, 1989.

57. Reich, Y., Designing integrated learning systems for engineering design, Proc. of the eighth international workshop on machine learning, 1991.

58. Rodrigo, F. R. and J.D. Seader, Synthesis of separation sequences by ordered branch search, AIChE Jr., 21(5), pg. 885, 1975.

59. Seader, J.D. and A. W. Westerberg, A combined heuristic and evolutionary strategy for synthesis of simple separation sequences, AIChE Jr., 23(6), pp. 951, 1977.

60. Sembugamoorthy, V. and B. Chandrasekaran, Functional representation of devices and compilation of diagnostic problem solving systems, Experience, Memory and Reasoning, J.L. Kolodner and C. K. Riesbeck (Eds.), Lawrence Erlbaum Associates, 1986.

61. Siirola, J.J. and D.F.Rudd, Computer-aided synthesis of chemical process designs, Indust. Engng. Chem. Fundam., 10, 353, 1971.

62. Siletti, C.A., Design of Protein Purification Processes by Heuristic Search, Artificial Intelligence in Process Engineering, pp.295-310, Academic Press Inc., 1990.

63. Sriram, D., G.Stephanopoulos, R.Logcher, D.Gossard, N.Groleau, D.Serrano & D.Navinchandran, Knowledge-Based System Applications in Engineering Design: Research at MIT, AI Magazine, pp.79-96, 10(3), Fall, 1989.

64. Stadnyk, I., Schema recombination in pattern recognition problems, Genetic algorithms and their applications: Proc. of the second international conference on genetic algorithms, pp. 27-35, 1987.

65. Steele, G. L., Common Lisp: The Language, Digital Press, 1984.

66. Stephanopoulos, G., et al., Design-Kit: An Object-Oriented Environment for Process Engineering, Computers & Chemical Eng., pp.655-674, 11(6), 1987.

67. Tanquary, J. and S. C-Y. Lu, Capturing design rationale in an interactive design environment, KBESRL Annual Report, Department of Mechanical and Industrial Engineering, University of Illinois, Urbana-Champaign, 1991.

68. Tcheng, D. K. and S. C-Y, Lu, Improving the performance of AIMS through meta-leaming, KBESRL Annual Report, Department of Mechanical and Industrial Engineering, University of Illinois, Urbana-Champaign, 1991. 69. Tcheng, D. K., B. L. Lambert, S. C-Y. Lu, AIMS: AN interactive modeling system for supporting engineering decision making, Proc. of the eighth international workshop on machine learning, 1991. 70. Thompson, R.W. and C. J. King, Systematic synthesis of separation schemes, AIChE Jr., 18(5), pg. 941,1972. 71. Watanabe, L. and S. Yerramareddy, Decision tree induction of 3-D manufacturing features, Proc. of the eighth international workshop on machine learning, 1991. 72. Westerberg, A.W., The synthesis of distillation-based separation systems. Computers & Chemical Engineering, 9(5), pp. 421-429, 1985. 73. Wheelwright, S., The design of downstream processes for large-scale protein purification, Joumal of Biotechnology, 11, 89-102, 1989. 74. Whitehall, B. L. and S. C-Y. Lu, A study of how domain knowledge improves knowledge-based learning systems, Proc. of the eighth international workshop on machine learning, 1991. 75. Zhang, Y., Zou, H. and Lu, P., Advances in expert systems for high-performance liquid chromatography, J. Chrom., 515, 13-26,1990. 76. Zhang, X. and Waltz, D., Protein Structure Prediction using Memory-Based Reasoning: A case study of data exploration. Proceeding of the Second Workshop on Case-Based Reasoning, Pensacola Beach, FL, 1989. APPENDIX A.

KNOWLEDGE BASE FOR BIOPROCESS SYNTHESIS

(make-Specialist 'Adsorption
  '((parent product-recovery)
    (children nil)
    (preliminary-selection-constraints
      (Argument-1 "Antibiotics (ex: penicillin) can be treated using adsorption"
                  ADSORPTION-PRODUCTS-CONSTRAINT))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (Argument-1 "Capital cost for adsorption is very high compared with precipitation or extraction"
                  (ADSORPTION-CAPITAL-COST-CONSTRAINT)
                  (from (precipitation extraction) to adsorption type negation)))))
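Each specialist in this listing carries an establish-test table. The listing does not spell out how the table is evaluated; one plausible reading is that the answers to the preliminary selection constraints are compared, in order, with each pattern, and the score of the first matching row (10 or 0 here) is returned. The Python sketch below illustrates that reading only. The function name, the list encoding of the table, and the use of "T", "F" and "U" as answer values are all assumptions made for illustration, not part of the original system.

```python
from typing import List, Sequence, Tuple, Union

# Hypothetical encoding of an establish-test table: each row pairs a pattern
# of expected constraint values with a score; "else" is the fallback row.
Row = Tuple[Union[Tuple[str, ...], str], int]

ADSORPTION_TEST: List[Row] = [(("T",), 10), ("else", 0)]
AIRLIFT_TEST: List[Row] = [(("T", "T", "T"), 10), ("else", 0)]


def evaluate_establish_test(table: Sequence[Row], values: Sequence[str]) -> int:
    """Return the score of the first row whose pattern matches `values`."""
    for pattern, score in table:
        if pattern == "else":
            # Fallback row: matches anything that reached this point.
            return score
        if len(pattern) == len(values) and all(
            expected == actual for expected, actual in zip(pattern, values)
        ):
            return score
    raise ValueError("no row matched and the table has no else row")
```

Under this reading, a specialist such as air-lift-fermentor would score 10 only when all three of its preliminary constraints evaluate to T, and 0 otherwise.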

(make-Specialist 'air-lift-fermentor
  '((parent fermentation)
    (children nil)
    (preliminary-selection-constraints
      (Argument-1 "Air-lift fermentors are used for large-scale aerobic fermentations such as in the cases of antibiotics, proteins, enzymes, citric acid, and polysaccharide"
                  FERMENTATION-TYPE-CONSTRAINT CAPACITY-CONSTRAINT AIRLIFT-PRODUCTS-CONSTRAINT))
    (establish-test (((T T T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (Argument-1 "Air-lift fermentor is a preferred alternative if mechanical agitation is prohibited due to high shear or not possible (ex: single cell protein production)"
                  (SHEAR-FORCES-CONSTRAINT AIRLIFT-FEASIBILITY-CONSTRAINT)
                  (from mechanical-agitation-fermentor to air-lift-fermentor type reinforcement))
      (argument-2 "Mixing is not as good as in mechanical agitation"
                  (MIXING-CONSTRAINT)
                  (from mechanical-agitation-fermentor to air-lift-fermentor type negation))
      (ARGUMENT-3 "Less energy intensive than mechanical agitation"
                  (ENERGY-CONSTRAINT)
                  (from air-lift-fermentor to mechanical-agitation-fermentor type negation)))))

(make-Specialist 'ball-milling
  '((parent cell-disruption)
    (children nil)
    (preliminary-selection-constraints
      (Argument-1 "Ball milling may denature protein product due to shear heating"
                  PROTEIN-DEGRADATION-CONSTRAINT)
      (argument-2 "Ball milling is preferable for Yeast whose cell walls are thicker and harder to break"
                  PRODUCT-YEAST-CONSTRAINT))
    (establish-test (((T T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (Argument-1 "Ball milling is of high capital cost and operating cost compared with chemical and enzymatic treatment"
                  (CELL-DISRUPTION-COST-CONSTRAINT)
                  (from ball-milling to (chemical-treatment enzyme-treatment) type reinforcement)))))

(make-Specialist 'cell-disruption
  '((parent bio-operations)
    (children homogenization ball-milling enzyme-treatment chemical-treatment)
    (preliminary-selection-constraints
      (Argument-1 "Cell disruption is necessary for recovery of intracellular products such as in recombinant protein fermentation (ex: insulin)"))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints)))

(make-Specialist 'Immobilized-cell-fermentor
  '((parent fermentation)
    (children nil)
    (preliminary-selection-constraints
      (Argument-1 "Used to reduce production costs in the case of large capacity continuous processes such as ethanol and biological waste treatment"
                  CAPACITY-CONSTRAINT MODE-OF-OPERATION-CONSTRAINT
                  ETHANOL-STREAM-CONSTRAINT WASTE-STREAM-CONSTRAINT)
      (argument-2 "Achieves high cell density due to immobilization -- so provides faster fermentation"
                  HIGH-CELL-DENSITY-CONSTRAINT FASTER-FERMENTATION-CONSTRAINT)
      (ARGUMENT-3 "Can run reactor continuously and eliminate down time"
                  MODE-OF-OPERATION-CONSTRAINT DOWN-TIME-CONSTRAINT)
      (ARGUMENT-4 "Product should be primary metabolite"
                  PRODUCT-PRIMARY-METABOLITE-CONSTRAINT))
    (establish-test (((T T T T T T T T T) --> 10) (else --> 0)))
    (secondary-selection-constraints)))

(make-Specialist 'Cell-Separation
  '((parent bio-operations)
    (children centrifugation filtration sedimentation)
    (preliminary-selection-constraints
      (Argument-1 "Is cell separation operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-

    (establish-test (((T T T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (ARGUMENT-1 "For desalting of (high concentration) protein solutions filtration is slow compared to electrodialysis"
                  (FILTRATION-SLOW-PROCESS-CONSTRAINT)
                  (from electro-dialysis to filtration type negation)))))

(make-Specialist 'Enzyme-Treatment
  '((parent cell-disruption)
    (children nil)
    (preliminary-selection-constraints
      (argument-1 "Enzyme treatment can cause denaturation/degradation of protein"
                  PROTEIN-DEGRADATION-CONSTRAINT))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (ARGUMENT-1 "Very expensive -- more expensive than chemical treatment"
                  (CELL-DISRUPTION-COST-CONSTRAINT)
                  (from chemical-treatment to enzyme-treatment type negation)))))

(make-specialist 'Extraction
  '((parent product-recovery)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Extraction is applicable when distillation is not possible due to material properties or economic reasons, and when large volumes have to be treated"
                  DISTILLATION-CONSTRAINT EXTRACTION-CAPACITY-CONSTRAINT)
      (ARGUMENT-2 "Extraction is widely used for biopolymers and antibiotics like penicillin, erythromycin and tetracycline"
                  EXTRACTION-PRODUCTS-CONSTRAINT)
      (ARGUMENT-3 "Centrifugal extractors can handle highly emulsified systems and liquids with small density differences"
                  EMULSIFIED-SYSTEMS-CONSTRAINTS SMALL-DENSITY-DIFFERENCE-CONSTRAINT)
      (ARGUMENT-4 "Short phase residence time minimizes the risk of solute degradation by heat, hydrolysis or enzyme reaction"
                  SOLUTE-DEGRADATION-CONSTRAINT)
      (ARGUMENT-5 "Offers continuous processing of feed"
                  ADSORPTION-CONTINUOUS-PROCESSING-CONSTRAINT)
      (ARGUMENT-6 "For extraction a solvent with affinity towards product should be available"
                  ADSORPTION-SOLVENT-CONSTRAINT))
    (establish-test (((T T T T T T T T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (ARGUMENT-7 "Extraction is less expensive than adsorption"
                  (ADSORPTION-ECONOMY-CONSTRAINT)
                  (from adsorption to extraction type negation)))))

(make-Specialist 'Fermentation
  '((parent bio-operations)
    (children mechanical-agitation-fermentor air-lift-fermentor immobilized-cell-fermentor)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Is fermentation operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))))

(make-Specialist 'distillation
  '((parent product-recovery)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Is distillation operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))))

(make-Specialist 'Homogenization
  '((parent cell-disruption)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "For bacteria homogenization is the most preferred method"
                  BACTERIAL-PRODUCT-CONSTRAINT))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (ARGUMENT-1 "Preferable to ball milling if denaturation or degradation of protein has to be avoided"
                  (PROTEIN-DEGRADATION-CONSTRAINT)
                  (from homogenization to ball-milling type negation))
      (ARGUMENT-2 "Capital and operating costs may be high compared with chemical and enzymatic treatment"
                  (CELL-DISRUPTION-COST-CONSTRAINT)
                  (from (chemical-treatment enzyme-treatment) to ball-milling type negation)))))

(make-Specialist 'Mechanical-Agitation-fermentor
  '((parent fermentation)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Provides good mixing"
                  MIXING-CONSTRAINT)
      (ARGUMENT-2 "Applicable for anaerobic fermentation such as ethanol and organic acids"
                  FERMENTATION-TYPE-CONSTRAINT ETHANOL-STREAM-CONSTRAINT
                  ORGANIC-ACID-STREAM-CONSTRAINT)
      (ARGUMENT-3 "Energy intensive due to impeller which needs power"
                  ENERGY-CONSTRAINT)
      (ARGUMENT-4 "Causes large shear forces"
                  SHEAR-FORCES-CONSTRAINT))
    (establish-test (((T T T T) --> 10) (else --> 0)))))

(make-Specialist 'media-Sterilization
  '((parent bio-operations)
    (children thermal-sterilization membrane-sterilization chemical-sterilization)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Is a media sterilization operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))))

(make-Specialist 'membrane-Sterilization
  '((parent media-sterilization)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Is a membrane sterilization operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))))

(make-Specialist 'Membrane-Filtration
  '((parent media-sterilization)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Membrane separation is better for primary recovery"
                  PRIMARY-RECOVERY-CONSTRAINT)
      (ARGUMENT-2 "Various types of cells such as viruses, bacteria, streptomyces, yeasts and mammalian cells can be separated using filtration"
                  MEMBRANE-FILTRATION-PRODUCTS-CONSTRAINT)
      (ARGUMENT-3 "For large scale operation membrane separation offers significant savings"
                  CAPACITY-CONSTRAINT)
      (ARGUMENT-4 "If the product is extracellular membrane separation is preferable"
                  EXTRACELLULAR-PRODUCT)
      (ARGUMENT-5 "If different proteins of same size have to be separated then membrane separation is not applicable"
                  SAME-SIZE-PROTEINS-CONSTRAINT))
    (establish-test (((T T T T T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (ARGUMENT-1 "Operation of a membrane filter is more tedious than centrifuge due to membrane fouling and plugging"
                  (OPERATIONAL-DIFFICULTY-CONSTRAINT)
                  (from centrifugation to membrane-filtration type negation)))))

(make-Specialist 'Precipitation
  '((parent product-recovery)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "For precipitation a chemical with affinity towards product should be available"
                  PRECIPITATION-CHEMICAL-CONSTRAINT)
      (ARGUMENT-2 "Polysaccharide, lactic acid (using calcium) and protein (using polymer or sodium) can be precipitated"
                  PRECIPITATION-PRODUCTS-CONSTRAINT-1)
      (ARGUMENT-3 "Penicillin cannot be precipitated"
                  PRECIPITATION-PRODUCTS-CONSTRAINT-2))
    (establish-test (((T T T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (ARGUMENT-1 "Cheaper way to recover product than other product recovery methods"
                  (PRECIPITATION-ECONOMY-CONSTRAINT)
                  (from precipitation to (extraction adsorption absorption crystallization centrifugation distillation) type negation)))))

(make-Specialist 'absorption
  '((parent product-recovery)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Is absorption operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))))

(make-Specialist 'Product-Recovery
  '((parent bio-operations)
    (children extraction precipitation adsorption distillation absorption crystallization centrifugation)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Is a product recovery operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))))

(make-Specialist 'Purification
  '((parent bio-operations)
    (children ultrafiltration chromatography electro-dialysis crystallization)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Is a purification operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))))

(make-Specialist 'ultrafiltration
  '((parent concentration)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Is ultrafiltration operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))))

(make-Specialist 'Sedimentation
  '((parent cell-separation)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Size of particles should be big enough for precipitation"
                  PARTICLE-SIZE-CONSTRAINT)
      (ARGUMENT-2 "Applicable for large scale processing of liquids (ex: biological waste treatment)"
                  CAPACITY-CONSTRAINT WASTE-STREAM-CONSTRAINT)
      (ARGUMENT-3 "Microorganisms in the feed should be able to flocculate or aggregate"
                  FLOCCULATION-CONSTRAINT))
    (establish-test (((T T T T) --> 10) (else --> 0)))
    (secondary-selection-constraints
      (ARGUMENT-1 "Sedimentation is a very slow process compared with centrifugation or membrane filtration"
                  (SEDIMENTATION-SLOW-PROCESS-CONSTRAINT)
                  (from (centrifugation membrane-filtration) to sedimentation type negation)))))

(make-specialist 'Thermal-Sterilization
  '((parent media-sterilization)
    (children nil)
    (preliminary-selection-constraints
      (ARGUMENT-1 "Due to thermal heating large proteins if present in the sterilizer feed may coagulate"
                  LARGE-PROTEINS-CONSTRAINT)
      (ARGUMENT-2 "If both protein & carbohydrate are present in the sterilizer feed side reactions are possible in thermal sterilization"
                  LARGE-PROTEINS-CONSTRAINT CARBOHYDRATES-CONSTRAINT))
    (establish-test (((T T T) --> 10) (else --> 0)))))

(make-Specialist 'polishing
  '((parent bio-operations)
    (children drying)
    (preliminary-selection-constraints
      (Argument-1 "Is polishing operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints)))

(make-Specialist 'drying
  '((parent polishing)
    (children spray-drying)
    (preliminary-selection-constraints
      (Argument-1 "Is drying operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints)))

(make-Specialist 'spray-drying
  '((parent drying)
    (children nil)
    (preliminary-selection-constraints
      (Argument-1 "Is spray drying operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints)))

(make-Specialist 'concentration
  '((parent bio-operations)
    (children reverse-osmosis evaporation ultrafiltration)
    (preliminary-selection-constraints
      (Argument-1 "Is concentration operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints)))

(make-Specialist 'reverse-osmosis
  '((parent concentration)
    (children nil)
    (preliminary-selection-constraints
      (Argument-1 "Is reverse osmosis operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints)))

(make-Specialist 'evaporation
  '((parent concentration)
    (children nil)
    (preliminary-selection-constraints
      (Argument-1 "Is evaporation operation desired"))
    (establish-test (((T) --> 10) (else --> 0)))
    (secondary-selection-constraints)))

(make-specialist 'Bio-operations
  '((parent nil)
    (children media-sterilization fermentation cell-separation cell-disruption
              product-recovery concentration purification polishing)
    (preliminary-selection-constraints
      (argument-1 "Is bio operations established"))
    (establish-test (((T) --> 10) (else --> 0)))
    (specslist
      LARGE-PROTEINS-CONSTRAINT
      CARBOHYDRATES-CONSTRAINT
      WASTE-STREAM-CONSTRAINT
      CAPACITY-CONSTRAINT
      MODE-OF-OPERATION-CONSTRAINT
      ETHANOL-STREAM-CONSTRAINT
      HIGH-CELL-DENSITY-CONSTRAINT
      FASTER-FERMENTATION-CONSTRAINT
      DOWN-TIME-CONSTRAINT
      PRODUCT-PRIMARY-METABOLITE-CONSTRAINT
      MIXING-CONSTRAINT
      FERMENTATION-TYPE-CONSTRAINT
      ORGANIC-ACID-STREAM-CONSTRAINT
      ENERGY-CONSTRAINT
      SHEAR-FORCES-CONSTRAINT
      AIRLIFT-PRODUCTS-CONSTRAINT
      AIRLIFT-FEASIBILITY-CONSTRAINT
      CHEMICAL-IMPURITY-CONSTRAINT
      THERMAL-STERILIZATION-FEASIBILITY-CONSTRAINT
      PROTEIN-DEGRADATION-CONSTRAINT
      CHEMICAL-TREATMENT-COST-CONSTRAINT
      CELL-DISRUPTION-COST-CONSTRAINT
      BACTERIAL-PRODUCT-CONSTRAINT
      PRODUCT-YEAST-CONSTRAINT
      PRIMARY-RECOVERY-CONSTRAINT
      MEMBRANE-FILTRATION-PRODUCTS-CONSTRAINT
      OPERATIONAL-DIFFICULTY-CONSTRAINT
      EXTRACELLULAR-PRODUCT
      SAME-SIZE-PROTEINS-CONSTRAINT
      PARTICLE-SIZE-CONSTRAINT
      FLOCCULATION-CONSTRAINT
      SEDIMENTATION-SLOW-PROCESS-CONSTRAINT
      CENTRIFUGATION-CELL-DISRUPTION-CONSTRAINT
      INTRACELLULAR-PRODUCT-CONSTRAINT
      CENTRIFUGATION-COST-CONSTRAINT
      PROTEIN-DESALTING-CONSTRAINT
      SELECTED-FRACTIONATION-CONSTRAINT
      REAGENT-RECOVERY-CONSTRAINT
      PROTEIN-PURIFICATION-CONSTRAINT
      LOW-SALT-LEVEL-DESALTING-CONSTRAINT
      FILTRATION-SLOW-PROCESS-CONSTRAINT
      PURIFICATION-CAPACITY-CONSTRAINT
      PURIFICATION-PROTEIN-MIXTURES-CONSTRAINT
      SIMPLE-OPERATION-CONSTRAINT
      PURIFICATION-ANTIGENS-ANTIBODIES-CONSTRAINT
      PURIFICATION-PROTEIN-CRUDE-EXTRACTS-CONSTRAINT
      CRYSTALLIZATION-PRODUCT-CONSTRAINT
      PRECIPITATION-CHEMICAL-CONSTRAINT
      PRECIPITATION-PRODUCTS-CONSTRAINT-1
      PRECIPITATION-PRODUCTS-CONSTRAINT-2
      PRECIPITATION-ECONOMY-CONSTRAINT
      ADSORPTION-PRODUCTS-CONSTRAINT
      ADSORPTION-CAPITAL-COST-CONSTRAINT
      EXTRACTION-CAPACITY-CONSTRAINT
      EXTRACTION-PRODUCTS-CONSTRAINT
      EMULSIFIED-SYSTEMS-CONSTRAINTS
      SMALL-DENSITY-DIFFERENCE-CONSTRAINT
      SOLUTE-DEGRADATION-CONSTRAINT
      ADSORPTION-CONTINUOUS-PROCESSING-CONSTRAINT
      ADSORPTION-ECONOMY-CONSTRAINT
      ADSORPTION-SOLVENT-CONSTRAINT)))
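The parent and children slots of the specialists above define a classification hierarchy rooted at Bio-operations. As a hedged illustration of how such links could be cross-checked, the Python sketch below copies a small subset of the listing into a dictionary and reports every child whose own parent slot names a different specialist. The dictionary layout and both function names are hypothetical; only the parent and children values come from the listing.

```python
from typing import Dict, List, Optional, Tuple

# Subset of the specialists listed above; names are lower-cased for uniformity.
SPECIALISTS: Dict[str, Dict[str, object]] = {
    "bio-operations": {"parent": None, "children": [
        "media-sterilization", "fermentation", "cell-separation", "cell-disruption",
        "product-recovery", "concentration", "purification", "polishing"]},
    "polishing": {"parent": "bio-operations", "children": ["drying"]},
    "drying": {"parent": "polishing", "children": ["spray-drying"]},
    "spray-drying": {"parent": "drying", "children": []},
    "concentration": {"parent": "bio-operations", "children": [
        "reverse-osmosis", "evaporation", "ultrafiltration"]},
    "purification": {"parent": "bio-operations", "children": [
        "ultrafiltration", "chromatography", "electro-dialysis", "crystallization"]},
    "ultrafiltration": {"parent": "concentration", "children": []},
}


def mismatched_links(specialists: Dict[str, Dict[str, object]]) -> List[Tuple[str, str]]:
    """List (parent, child) pairs where the child's parent slot disagrees."""
    problems = []
    for name, slots in specialists.items():
        for child in slots["children"]:
            # Children that are not defined in this subset are skipped.
            if child in specialists and specialists[child]["parent"] != name:
                problems.append((name, child))
    return sorted(problems)


def path_to_root(specialists: Dict[str, Dict[str, object]], name: str) -> List[str]:
    """Follow parent slots upward and return the path from the root to `name`."""
    path: List[str] = []
    current: Optional[str] = name
    while current is not None:
        if current in path:
            raise ValueError("cycle in parent links")
        path.append(current)
        current = specialists[current]["parent"]
    return path[::-1]
```

On this subset the check flags one pair: ultrafiltration is listed as a child of both purification and concentration, while its own parent slot names concentration.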

(make-constraint 'FASTER-FERMENTATION-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is faster fermentation desired"))
    (COMMENT "whether faster fermentation desired")
    (TYPE-OF-CONSTRAINT GLOBAL/LOCAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))
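Each constraint record pairs a question for the user with a comment and a constraint type. The listing does not define ASK-YNU?; assuming it accepts a yes, no, or unknown reply, a constraint could be modelled as in the Python sketch below. The class, the function, and the injected reply callback are all hypothetical and only illustrate the record structure.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Constraint:
    name: str
    question: str
    comment: str
    kind: str  # "GLOBAL", "GLOBAL/LOCAL" or "RELATIONAL" in the listing


# Assumed reading of ASK-YNU?: the reply is one of yes / no / unknown.
_ANSWERS = {"y": "yes", "yes": "yes", "n": "no", "no": "no",
            "u": "unknown", "unknown": "unknown"}


def ask_ynu(constraint: Constraint, reply_fn: Callable[[str], str]) -> str:
    """Put the constraint's question to `reply_fn` and normalise the reply."""
    reply = reply_fn(constraint.question).strip().lower()
    if reply not in _ANSWERS:
        raise ValueError(f"expected yes, no or unknown, got {reply!r}")
    return _ANSWERS[reply]


FASTER_FERMENTATION = Constraint(
    name="FASTER-FERMENTATION-CONSTRAINT",
    question="Is faster fermentation desired",
    comment="whether faster fermentation desired",
    kind="GLOBAL/LOCAL",
)
```

Passing the reply source as a function keeps the sketch testable; an interactive session could pass `input` instead.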

(make-constraint 'FERMENTATION-TYPE-CONSTRAINT
  '((QUESTION "Specify the type of fermentation as aerobic or anaerobic")
    (COMMENT "type of fermentation as aerobic or anaerobic")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'FILTRATION-SLOW-PROCESS-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is a slow process acceptable"))
    (COMMENT "whether slow process is acceptable")
    (TYPE-OF-CONSTRAINT RELATIONAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'FLOCCULATION-CONSTRAINT
  '((QUESTION (ASK-YNU? "Do the microorganisms in the feed flocculate"))
    (COMMENT "whether microorganisms flocculate")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'HIGH-CELL-DENSITY-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is achieving high cell density desirable"))
    (COMMENT "whether high cell density desired")
    (TYPE-OF-CONSTRAINT GLOBAL/LOCAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'INTRACELLULAR-PRODUCT-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is the product intracellular"))
    (COMMENT "whether product is intracellular")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'LARGE-PROTEINS-CONSTRAINT
  '((QUESTION (ASK-YNU? "Does the feed contain large proteins"))
    (COMMENT "whether feed contains large proteins")
    (TYPE-OF-CONSTRAINT GLOBAL/LOCAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'LOW-SALT-LEVEL-DESALTING-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is desalting to low salt levels being carried out"))
    (COMMENT "whether desalting to low salt levels being done")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'MEMBRANE-FILTRATION-PRODUCTS-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is the product one of viruses, bacteria, streptomyces, yeasts and mammalian cells"))
    (COMMENT "whether product is one of viruses, bacteria, streptomyces, yeasts and mammalian cells")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'MIXING-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is good mixing inside during fermentation desired"))
    (COMMENT "whether good mixing desired")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'MODE-OF-OPERATION-CONSTRAINT
  '((QUESTION "Specify the mode of operation as continuous or batch")
    (COMMENT "mode of operation continuous/batch")
    (TYPE-OF-CONSTRAINT GLOBAL/LOCAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'OPERATIONAL-DIFFICULTY-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is a difficult operation acceptable"))
    (COMMENT "whether difficult operation acceptable")
    (TYPE-OF-CONSTRAINT RELATIONAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'ORGANIC-ACID-STREAM-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is an organic acid being produced"))
    (COMMENT "is an organic acid being produced")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PARTICLE-SIZE-CONSTRAINT
  '((QUESTION (ASK-SML? "Specify the product particle size"))
    (COMMENT "specify the size of product particles")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PRECIPITATION-CHEMICAL-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is a chemical to precipitate the product available"))
    (COMMENT "whether a chemical to precipitate the product available")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PRECIPITATION-ECONOMY-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is the most economical operation desired"))
    (COMMENT "whether most economical operation desired")
    (TYPE-OF-CONSTRAINT RELATIONAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PRECIPITATION-PRODUCTS-CONSTRAINT-1
  '((QUESTION (ASK-YNU? "Is the product one of polysaccharide, lactic acid and protein"))
    (COMMENT "whether product one of polysaccharide, lactic acid and protein")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PRECIPITATION-PRODUCTS-CONSTRAINT-2
  '((QUESTION (ASK-YNU? "Does the product contain penicillin"))
    (COMMENT "whether product contains penicillin")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PRIMARY-RECOVERY-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is a primary recovery operation being done"))
    (COMMENT "whether the operation is for primary recovery")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PRODUCT-PRIMARY-METABOLITE-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is the product a primary metabolite"))
    (COMMENT "whether product is primary metabolite")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PRODUCT-YEAST-CONSTRAINT
  '((QUESTION (ASK-YNU? "Does the product contain Yeast"))
    (COMMENT "whether product contains yeast")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PROTEIN-DEGRADATION-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is denaturation or degradation of protein acceptable"))
    (COMMENT "whether protein denaturation/degradation acceptable")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PROTEIN-DESALTING-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is a protein being desalted"))
    (COMMENT "whether protein is being desalted")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PROTEIN-PURIFICATION-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is a protein being purified"))
    (COMMENT "whether a protein is being purified")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PURIFICATION-ANTIGENS-ANTIBODIES-CONSTRAINT
  '((QUESTION (ASK-YNU? "Are antigens or antibodies being purified"))
    (COMMENT "whether antigens or antibodies being purified")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PURIFICATION-CAPACITY-CONSTRAINT
  '((QUESTION (ASK-SML? "Specify the capacity requirement"))
    (COMMENT "specify purification capacity requirement")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PURIFICATION-PROTEIN-CRUDE-EXTRACTS-CONSTRAINT
  '((QUESTION (ASK-YNU? "Are proteins from crude extracts being purified"))
    (COMMENT "whether proteins from crude extracts being purified")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'PURIFICATION-PROTEIN-MIXTURES-CONSTRAINT
  '((QUESTION (ASK-YNU? "Are protein mixtures being separated"))
    (COMMENT "whether protein mixtures are being separated")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'SEDIMENTATION-SLOW-PROCESS-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is a slow process acceptable"))
    (COMMENT "whether a slow process is acceptable")
    (TYPE-OF-CONSTRAINT RELATIONAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'SELECTED-FRACTIONATION-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is selected fractionation of proteins being carried out"))
    (COMMENT "whether selected fractionation of proteins being done")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'SHEAR-FORCES-CONSTRAINT
  '((QUESTION (ASK-YNU? "Can large shear forces inside fermentor be tolerated"))
    (COMMENT "whether large shear forces inside fermentor allowed")
    (TYPE-OF-CONSTRAINT GLOBAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'SIMPLE-OPERATION-CONSTRAINT
  '((QUESTION (ASK-YNU? "Is a simple operation desired"))
    (COMMENT "whether a simple operation is desirable")
    (TYPE-OF-CONSTRAINT RELATIONAL)
    (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))
    (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint ’SMALL-DENSITY-DIFFERENCE-CONSTRAINT '((QUESTION (ASK-YNU? <#7f>Does the solutes differ verysmall density wise”)) (COMMENT <#7f>whether solutes differ very small density wise”) (TYPE-OF-CONSTRAINT GLOB­ AL) (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited)) (IN­ FERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'SOLUTE-DEGRADATION-CONSTRAINT ’((QUESTION (ASK- YNU? <#7f>Is the solute under the risk of degradation due to heat, hydrolysis or enzyme reaction”)) (COMMENT <#7f>whether the solute can degrade”) (TYPE-OF-CON- STRAINT GLOBAL) (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited)) (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

(make-constraint 'THERMAL-STERILIZATION-FEASIBILITY-CONSTRAINT '((QUESTION (ASK-YNU? "Is thermal sterilization feasible for the product")) (COMMENT "whether thermal sterilization feasible") (TYPE-OF-CONSTRAINT RELATIONAL) (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited))))

(make-constraint 'WASTE-STREAM-CONSTRAINT '((QUESTION (ASK-YNU? "Is a waste stream being disinfected")) (COMMENT "whether a waste stream is being disinfected") (TYPE-OF-CONSTRAINT GLOBAL/LOCAL) (SUPERS-CONSTRAINT (specify the constraint whose value is to be inherited)) (INFERENCE-METHOD (provide the inference method to infer the value of this constraint from other constraints))))

APPENDIX B. TRACE OF HCl

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; the following are evidences used in all the case studies
;; the evidences are compiled from the domain knowledge
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

e1 = normality of reactor temp is N
e2 = normality of regen-air-flow is N
e3 = normality of regen-temp is N
e4 = normality of coke-make is N
e5 = normality of h2-coke is N
e12 = normality of stripper temp is N
e13 = normality of main-column-inlet-temp is N
e14 = normality of conversion is (NOT (OR L LL))
e16 = normality of ambient temp is ((OR H HH) (NOT (OR L LL)))
e17 = the normality of regen-cat-slide-valve-press is (N, (NOT (NOT N)))
e19 = normality of regen-cat-valve-purge is N
e23 = normality of regen-delta-T is N
;;e24 = normality of o2 is N
e26 = normality of regen-grid-delta-P is (OR L LL N)
e27 = the normality of regen-delta-T is (OR L LL N)
e29 = the normality of o2 is (OR L LL N)
e34 = normality of blower-rpm is (OR H HH N)
e36 = normality of blower-steam-rate is (OR H HH N)
e37 = normality of vacuum-condenser-press is N
e44 = normality of ambient temp is (NOT (OR H HH))
e45 = normality of blower-intake-delta-P is (NOT (OR H HH))
e46 = normal operation of torch oil rate is F
e47 = the test of torch oil rate is F
;;e48 = the state of torch oil control valve is O
e49 = the blind of torch oil line is T
;;e35 = history of regen-air-blower is (NOT T)
e28 = the coke of catalyst-test is (OR L LL N)
;e20 = function of regen-cat-slide-valve is T
e22 = function of reactor-temp-controller
e31 = function of air-flow-meter is T
e18 = hydraulics of regen-cat-slide-valve is ((NOT F) (NOT U) T)
e21 = setting of reactor-temp is F
e32 = setting of regen-air-flow is F
e15 = state of yields is (NOT T)
e25 = state of regen-cyclone-temp is F
e30 = the state of blower-vents is F
e9 = trend of reactor-temp is S
e40 = trend of reactor-temp is (OR I II D DD S P), i.e. (not N)
e41 = trend of main column inlet temperature is (OR I II D DD S P), i.e. (not N)
e42 = state of reactor temperature controller is (NOT F) or T
e10 = the position of regen-cat-slide-valve is (NOT (NOT normal))
e11 = the state of regen-cat-valve-deltap is (NOT (> 1))
e6 = observation of stripper is F
e7 = observation of regenerator is F
e38 = observation of regen-air-blower is F
e39 = maintenance of regen-air-blower is F
e8 = distribution of regen-temp is F
e33 = capacity of regen-air-blower is (OR H HH N)
e43 = the performance test of regen air blower is (NOT F) or T

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Following is the trace of HCl for the above input data
;; Input data for case-study#3
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(defun init-evidence-table ()
  (let ()
    (setf root 'x)
    (setf (get 'x 'evidence-set) '(e11))
    (setf classes '(wear motor surface-condensers aflow-meter vent-open regen-grid rtemp-controller rtemp-setpoint hydraulic-system regen-cat-slide-valve instrumentation dirt ambient-conditions intake-plugging torchoil-flow blind-leak))

    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;; following are evidence sets of classes
    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

    (setf (get 'instrumentation 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e40 e41 e42))

    (setf (get 'hydraulic-system 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e16 e17 e18))

    (setf (get 'regen-cat-slide-valve 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e19 e20))

    (setf (get 'rtemp-setpoint 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e21))

    (setf (get 'rtemp-controller 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e9 e10 e11 e12 e13 e14 e15 e22))

    (setf (get 'regen-grid 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e27 e28 e29))

    (setf (get 'vent-open 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e30))

    (setf (get 'aflow-meter 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e31))

    (setf (get 'motor 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e33 e34 e35))

    (setf (get 'wear 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e34 e33 e38 e39))

    (setf (get 'surface-condensers 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e34 e36 e37))

    (setf (get 'dirt 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e43))

    (setf (get 'ambient-conditions 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e44))

    (setf (get 'intake-plugging 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e23 e24 e25 e26 e45))

    (setf (get 'torchoil-flow 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e28 e46 e47 e48))

    (setf (get 'blind-leak 'evidence-set) '(e1 e2 e3 e4 e5 e6 e7 e8 e28 e46 e47 e49))))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; following is the trace of HCl for the above input data
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

(Allegro CL 3.1.13.1 [Sun4] (0/0/0) Copyright (C) 1985-1990, Franz Inc., Berkeley, CA, USA

(load "new-hc.cl")

; Loading /n/nervous/l/murthy/learning/new-hc.cl.

T
(start)

; Loading /n/nervous/l/murthy/learning/hc-case1.cl.
type b for breadth (default is depth)b
specific classes = NIL
the most specific class of TORCHOIL-FLOW in the hierarchy is NIL
TORCHOIL-FLOW has no relationship with any of the other classes; so its parent is root

class proximity of BLIND-LEAK, TORCHOIL-FLOW = 3
class proximity of BLIND-LEAK, X = 0
specific classes = (TORCHOIL-FLOW)
the most specific class of BLIND-LEAK in the hierarchy is TORCHOIL-FLOW

Give the name of the anchor: a1

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A1 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
the parent of BLIND-LEAK is: A1 and its evidence-set is (E47 E46 E28)
the parent of TORCHOIL-FLOW is: A1
the parent of A1 is: X

AFLOW-METER has no relationship with any of the other classes; so its parent is root
the most specific class of VENT-OPEN in the hierarchy is AFLOW-METER

Give the name of the anchor: a2

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A2 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

the parent of VENT-OPEN is: A2 and its evidence-set is (E26 E25 E24 E23)
the parent of AFLOW-METER is: A2
the parent of A2 is: X

DIRT
specific classes = (AFLOW-METER VENT-OPEN A2)
the most specific class of DIRT in the hierarchy is A2
Give the name of the anchor: a3

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A3 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

the parent of DIRT is: A3 and its evidence-set is (E26 E25 E24 E23)
the parent of A2 is: A3
the parent of A3 is: X

AMBIENT-CONDITIONS
specific classes = (AFLOW-METER VENT-OPEN A2 DIRT A3)
the most specific class of AMBIENT-CONDITIONS in the hierarchy is A3
Give the name of the anchor: a4

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A4 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
the parent of AMBIENT-CONDITIONS is: A4 and its evidence-set is (E26 E25 E24 E23)
the parent of A3 is: A4
the parent of A4 is: X

INTAKE-PLUGGING
specific classes = (AFLOW-METER VENT-OPEN A2 DIRT A3 AMBIENT-CONDITIONS A4)
the most specific class of INTAKE-PLUGGING in the hierarchy is A4
Give the name of the anchor: a5

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A5 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
the parent of INTAKE-PLUGGING is: A5 and its evidence-set is (E26 E25 E24 E23)
the parent of A4 is: A5
the parent of A5 is: X

MOTOR
specific classes = (AFLOW-METER VENT-OPEN A2 DIRT A3 AMBIENT-CONDITIONS A4 INTAKE-PLUGGING A5)
the most specific class of MOTOR in the hierarchy is A5
Give the name of the anchor: a6

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A6 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

the parent of MOTOR is: A6 and its evidence-set is (E26 E25 E24 E23)
the parent of A5 is: A6
the parent of A6 is: X

SURFACE-CONDENSERS
specific classes = (MOTOR)
the most specific class of SURFACE-CONDENSERS in the hierarchy is MOTOR
Give the name of the anchor: a7

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A7 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

the parent of SURFACE-CONDENSERS is: A7
the parent of MOTOR is: A7
the parent of A7 is: A6 and its evidence-set is (E34 E26 E25 E24 E23)

REGEN-GRID
specific classes = (AFLOW-METER VENT-OPEN A2 DIRT A3 AMBIENT-CONDITIONS A4 INTAKE-PLUGGING A5 MOTOR A6 SURFACE-CONDENSERS A7)
the most specific class of REGEN-GRID in the hierarchy is A6
Give the name of the anchor: a8

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A8 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
the parent of REGEN-GRID is: A8 and its evidence-set is (E26 E25 E24 E23)
the parent of A6 is: A8
the parent of A8 is: X

WEAR
specific classes = (MOTOR)
the most specific class of WEAR in the hierarchy is MOTOR
Give the name of the anchor: a9

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A9 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
the parent of WEAR is: A9
the parent of MOTOR is: A9
the parent of A9 is: A7 and its evidence-set is (E33 E34 E26 E25 E24 E23)

RTEMP-CONTROLLER
RTEMP-CONTROLLER has no relationship with any of the other classes; so its parent is root

RTEMP-SETPOINT
specific classes = (RTEMP-CONTROLLER)
the most specific class of RTEMP-SETPOINT in the hierarchy is RTEMP-CONTROLLER
Give the name of the anchor: a10

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A10 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

the parent of RTEMP-SETPOINT is: A10 and its evidence-set is (E15 E14 E13 E12 E11 E10 E9)
the parent of RTEMP-CONTROLLER is: A10
the parent of A10 is: X

REGEN-CAT-SLIDE-VALVE
specific classes = (RTEMP-CONTROLLER RTEMP-SETPOINT A10)
the most specific class of REGEN-CAT-SLIDE-VALVE in the hierarchy is A10
Give the name of the anchor: a11

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A11 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

the parent of REGEN-CAT-SLIDE-VALVE is: A11 and its evidence-set is (E15 E14 E13 E12 E11 E10 E9)
the parent of A10 is: A11
the parent of A11 is: X

HYDRAULIC-SYSTEM
specific classes = (RTEMP-CONTROLLER RTEMP-SETPOINT A10 REGEN-CAT-SLIDE-VALVE A11)
the most specific class of HYDRAULIC-SYSTEM in the hierarchy is A11
Give the name of the anchor: a12

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A12 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

the parent of HYDRAULIC-SYSTEM is: A12 and its evidence-set is (E15 E14 E13 E12 E11 E10 E9)
the parent of A11 is: A12
the parent of A12 is: X

INSTRUMENTATION
specific classes = (RTEMP-CONTROLLER RTEMP-SETPOINT A10 REGEN-CAT-SLIDE-VALVE A11 HYDRAULIC-SYSTEM A12)
the most specific class of INSTRUMENTATION in the hierarchy is A12
Give the name of the anchor: a13

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;Evidence set of A13 is created
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
the parent of INSTRUMENTATION is: A13 and its evidence-set is (E15 E14 E13 E12 E11 E10 E9)
the parent of A12 is: A13
the parent of A13 is: X

the hierarchy is as follows
the children of X are (A13 A8 A1)
the children of A13 are (A12 INSTRUMENTATION)
the children of A12 are (A11 HYDRAULIC-SYSTEM)
the children of A11 are (A10 REGEN-CAT-SLIDE-VALVE)
the children of A10 are (RTEMP-CONTROLLER RTEMP-SETPOINT)
the children of A8 are (A6 REGEN-GRID)
the children of A6 are (A7 A5)
the children of A7 are (A9 SURFACE-CONDENSERS)
the children of A9 are (WEAR MOTOR)
the children of A5 are (A4 INTAKE-PLUGGING)
the children of A4 are (A3 AMBIENT-CONDITIONS)
the children of A3 are (A2 DIRT)
the children of A2 are (AFLOW-METER VENT-OPEN)
the children of A1 are (TORCHOIL-FLOW BLIND-LEAK)

:exit
; Exiting Lisp

APPENDIX C.

GENETIC ALGORITHM FOR MULTICOMPONENT SEPARATION

Multicomponent separations have been discussed extensively by Thompson and King (1972), Rodrigo and Seader (1975), Gomez and Seader (1976), Seader and Westerberg (1977), Pibouleau et al. (1983), Westerberg (1985) and Meszaros and Fonyo (1986). Westerberg (1985) summarizes several heuristics and available methods for solving the multicomponent separation problem.

All these accounts in the literature present the multicomponent separation problem where the components are separated by distillation, with the exception of Thompson and King (1972).

Thompson and King refer to alternative separations like extractive distillation, absorption and stripping. A useful equation in the Thompson and King paper is the number of possible sequences for an N component mixture with S different types of separation processes for each constituent step in the sequence given as:

$$\frac{[2(N-1)]!}{N!\,(N-1)!}\; S^{N-1}$$

When N=7 and S=10 the formula gives 132 × 10^6, i.e. 132 million, possible separation sequences. Thompson and King conclude early on that ‘it is infeasible to examine individually all of the possibilities, even with a computer.’
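The size of this search space can be checked by evaluating the formula directly. The following is a minimal sketch, not code from the thesis or from Thompson and King; the function name `sequence_count` is an assumption made for illustration.

```python
from math import factorial


def sequence_count(n_components: int, n_methods: int) -> int:
    """Evaluate [2(N-1)]! / (N! (N-1)!) * S^(N-1) for N components, S methods."""
    n, s = n_components, n_methods
    # Number of distinct split orderings when only one separation method exists.
    orderings = factorial(2 * (n - 1)) // (factorial(n) * factorial(n - 1))
    # Each of the N-1 splits can use any of the S methods.
    return orderings * s ** (n - 1)


if __name__ == "__main__":
    for n in (3, 5, 7):
        print(n, sequence_count(n, 1), sequence_count(n, 10))
```

The first factor counts the orderings of the N-1 splits for a single separation method; the second factor accounts for the choice among S methods at each split.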

Clearly, separating a multicomponent mixture with alternative separation processes is a combinatorial problem. But Thompson and King do not observe that most of the possibilities are not even ‘feasible,’ i.e. they cannot satisfy simple feasibility constraints like “Process P cannot be used on a mixture of [A B C] to separate A from B and C.” In this context, even optimization methods like the ordered branch search of Gomez and Seader (1976) can propose optimal sequences, but cannot determine whether the optimal sequence they compute can really be implemented. An implicit assumption in their method is that it is possible to separate components by distillation alone.

In sum, the existing methods for finding optimal multicomponent separation sequences do not apply the feasibility tests and implicitly assume distillation is applicable. As a result, these methods are less useful for open-ended separation problems, where general separation processes are to be considered and distillation cannot be assumed de facto.

With this background, a Genetic Algorithm (GA) procedure for a general multicomponent separation problem is described in the rest of this appendix. The GA was proposed by Holland (1975) for searching large, complicated search spaces. It uses the metaphor of natural evolution and in practice tends to converge quickly toward good, often near-optimal, solutions, although convergence to the global optimum is not guaranteed. GAs have been applied to the design of communication networks (Davis and Coombs, 1987), and a related stochastic search method, simulated annealing (Kirkpatrick et al., 1983), has been applied to combinatorial optimization problems like wire routing and VLSI design.

There are 4 important components in a GA:

1. a chromosomal representation of solutions to the problem

2. a way to create an initial population of solutions

3. an evaluation function that plays the role of the environment, rating solutions in terms of their fitness, and

4. genetic operators that alter the composition of children during reproduction

A simple GA procedure is given as (Grefenstette, 1987):

procedure GA;
begin
    initialize population P(0);
    evaluate P(0);
    t = 1;
    repeat
        select P(t) from P(t-1);
        recombine P(t);
        evaluate P(t);
        t = t + 1;
    until (termination condition);
end.
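The procedure can be sketched in runnable form as follows. This is only an illustration under stated assumptions, not Grefenstette's implementation: the names `run_ga`, `random_bits` and `one_point_crossover` are hypothetical, selection is simple truncation (keep the fitter half), the termination condition is a fixed number of generations, and the toy fitness is the number of 1-bits in a string.

```python
import random


def run_ga(fitness, random_individual, crossover,
           pop_size=20, generations=30, seed=0):
    """Generic GA loop: initialize, evaluate, then select/recombine/evaluate."""
    rng = random.Random(seed)
    # initialize population P(0)
    population = [random_individual(rng) for _ in range(pop_size)]
    # evaluate P(0): remember the best individual seen so far
    best = max(population, key=fitness)
    for _ in range(generations):  # assumed termination: fixed generation budget
        # select P(t) from P(t-1): truncation selection (an assumption)
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        # recombine P(t)
        population = [crossover(rng.choice(parents), rng.choice(parents), rng)
                      for _ in range(pop_size)]
        # evaluate P(t)
        best = max(population + [best], key=fitness)
    return best


def random_bits(rng, length=16):
    """Toy chromosome: a tuple of random bits."""
    return tuple(rng.randint(0, 1) for _ in range(length))


def one_point_crossover(a, b, rng):
    """Join the head of one parent to the tail of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


if __name__ == "__main__":
    print(run_ga(sum, random_bits, one_point_crossover))
```

Because the best individual found so far is carried along, the returned string is never worse than the best member of the initial population.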

To illustrate the GA procedure, consider a 6 component mixture of [twigs, gold, water, diamonds, sand, iron]. The following processes/tools are available for separating the components:

(a) Settling to separate Water and Twigs

(b) Magnet to separate Iron

(c) Gravimetry to separate sand

(d) Aurametry to separate gold

The problem is to find all the feasible separation sequences to separate the multicomponent mixture into pure components. A feasible separation sequence is represented as some combination of the 6 chromosomes: W (water), G (gold), S (sand), I (iron), D (diamonds) and T (twigs) subject to the following feasibility test constraints:

1. Iron cannot be separated from water and sand using magnet

2. Diamonds cannot be separated from iron, water, twigs and sand

3. Sand cannot be separated from water by gravimetry

4. Gold cannot be separated from sand, water, twigs and iron by aurametry

5. Finally, water and twigs can be separated anytime by settling

For instance, a feasible separation sequence for this problem as a combination of the 6 chromosomes is: [W T S I G D] by using settling, gravimetry, magnet, and aurametry in that order. The question is: what are the other feasible sequences?

Note here that there are 6! = 720 possible ways of arranging the chromosomes to make up a separation sequence (this corresponds to what Thompson and King call the possible separation sequences), but most of these are infeasible, i.e. they cannot satisfy the feasibility test constraints. So how can we find the feasible separation sequences out of the 720 possible sequences? (Generate-and-test is not attractive, since the number of candidate sequences grows factorially with the number of components.)

Using the GA procedure we assign P(0) = (S I G W D T), the individual chromosomes. Next we generate the P(1) strings by crossing over P(0) with P(0):

P(1) = [(S I) (G W) (D T) (S G) (S D) (S T) ...]

We then evaluate the strings within P(1) and reject strings like (G W) and (D T) because they violate feasibility test constraints #4 and #2, respectively. So P(1) now looks like:

P(1) = [(S I) (S G) (S D) ...]

Since P(1) is not the solution we are seeking, we generate P(2) by crossing over P(1) with P(0):

P(2) = [(G S I) (T S I) (G S T) (D S G) (T S G) (W S D) ...]

Once again we prune strings such as (G S T) and (D S G) that violate the feasibility test constraints, so P(2) finally looks like:

P(2) = [(T S I) (T S G) (W S D) ...]

Next, to generate P(3) there are two ways: (i) cross over P(1) with P(1) and (ii) cross over P(0) with P(2). So we generate P(3) and prune strings by applying constraints. Similarly we generate P(4) by crossing over: (i) P(3) with P(0) and (ii) P(2) with P(1); P(5) by crossing over: (i) P(4) with P(0), (ii) P(3) with P(1), and (iii) P(2) with P(2); and remember to prune strings after each cross-over. So, finally, P(5) looks like:

P(5) = [(W T S I G D) (W S T I G D) (T W S I G D) (W S I T G D)...]

P(5) corresponds to the solution we are seeking because: (i) each of its strings has all the chromosomes and (ii) each string is evaluated by the feasibility test constraints and found to be satisfactory.
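The build-and-prune procedure for the six-component example can be sketched as follows. This is an illustration under assumptions, not the thesis implementation: each feasibility constraint is read as a precedence rule (a component can be separated only after the components it cannot be separated from are gone), strings are grown only by crossing P(k) with P(0), i.e. appending one chromosome at a time, and the names `MUST_PRECEDE`, `is_feasible` and `feasible_sequences` are hypothetical.

```python
# For each chromosome, the components that must already have been removed
# before it can be separated (constraints 1-5, read as precedence rules).
MUST_PRECEDE = {
    "I": {"W", "S"},            # 1. iron: not from water and sand by magnet
    "D": {"I", "W", "T", "S"},  # 2. diamonds: not from iron, water, twigs, sand
    "S": {"W"},                 # 3. sand: not from water by gravimetry
    "G": {"S", "W", "T", "I"},  # 4. gold: not from sand, water, twigs, iron
    "W": set(),                 # 5. water and twigs can be separated anytime
    "T": set(),
}


def is_feasible(string):
    """A string is pruned if a required predecessor appears after a component."""
    for pos, comp in enumerate(string):
        if MUST_PRECEDE[comp] & set(string[pos + 1:]):
            return False
    return True


def feasible_sequences(chromosomes):
    """Grow P(0), P(1), ... by one chromosome per step, pruning after each step."""
    population = [(c,) for c in chromosomes]            # P(0)
    for _ in range(len(chromosomes) - 1):
        crossed = {s + (c,) for s in population
                   for c in chromosomes if c not in s}  # cross P(k) with P(0)
        population = sorted(s for s in crossed if is_feasible(s))
    return population


if __name__ == "__main__":
    for seq in feasible_sequences("SIGWDT"):
        print(" ".join(seq))
```

Under this reading of the constraints, infeasible partial strings such as (G W) and (D T) are discarded as soon as they appear, so only a small part of the 720 arrangements is ever constructed.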

Thus GA offers a convenient knowledge-based way, using feasibility-test constraints, to generate feasible separation sequences. We now show this for the separation of hydrogen (H) from nitrogen (N), ammonia (A), methane (M), sulfur-dioxide (S) and water (W). The following processes are known:

(a) Absorption to separate ammonia

(b) Absorption to separate sulfur-dioxide

(c) Drying to separate water

(d) Membrane separation to separate nitrogen

(e) Membrane separation to separate hydrogen

(f) Membrane separation to separate methane

And the feasibility test constraints are:

1. Hydrogen cannot be separated in the presence of methane and nitrogen

2. Nitrogen, hydrogen and methane cannot be separated in the presence of ammonia, sulfur-dioxide and water

3. Water cannot be separated in the presence of ammonia and sulfur-dioxide

By following the GA procedure we generate P(0), P(1), P(2)...P(5) strings by crossing over, and pruning strings using the feasibility test constraints. Finally, the solution P(5) looks like:

P(5) = [(S A W N M H) (S A W M N H) (A S W N M H)...]

The separation sequence suggested by the string (S A W N M H) is: (i) first separate sulfur-dioxide from the rest by absorption, (ii) next, separate ammonia by absorption, (iii) then water by drying, (iv) then nitrogen by membrane separation, (v) next, methane by membrane separation, and (vi) finally, hydrogen. Similarly, the feasible sequences corresponding to the rest of the strings are generated.
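How a string is checked and then read off as process steps can be sketched as below. This is an illustrative sketch only: the constraints are read as "cannot be separated while these components are still present", and the names `PROCESS_FOR`, `MUST_BE_GONE`, `is_feasible` and `describe` are assumptions rather than identifiers from the thesis.

```python
NAME = {"H": "hydrogen", "N": "nitrogen", "A": "ammonia",
        "M": "methane", "S": "sulfur-dioxide", "W": "water"}

# Processes (a)-(f) listed above, keyed by the component they remove.
PROCESS_FOR = {"A": "absorption", "S": "absorption", "W": "drying",
               "N": "membrane separation", "H": "membrane separation",
               "M": "membrane separation"}

# Components that must already be gone before each one can be separated.
MUST_BE_GONE = {
    "H": {"M", "N", "A", "S", "W"},  # constraints 1 and 2
    "N": {"A", "S", "W"},            # constraint 2
    "M": {"A", "S", "W"},            # constraint 2
    "W": {"A", "S"},                 # constraint 3
    "A": set(),
    "S": set(),
}


def is_feasible(sequence):
    """True if no component is separated while a blocking component remains."""
    for pos, comp in enumerate(sequence):
        if MUST_BE_GONE[comp] & set(sequence[pos + 1:]):
            return False
    return True


def describe(sequence):
    """Translate a chromosome string into an ordered list of process steps."""
    return [f"separate {NAME[c]} by {PROCESS_FOR[c]}" for c in sequence]


if __name__ == "__main__":
    for step in describe("SAWNMH"):
        print(step)
```

For example, `is_feasible("SAWNMH")` holds, while a string that removes hydrogen first, such as `"HSAWNM"`, is rejected.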

Finding the best of these feasible sequences is straightforward: conventional cost estimation methods (based on flow rates, energy costs, equipment costs, etc.) can be used to determine the best sequence.

In conclusion, some new insights that can be drawn from this discussion are: (i) there are far more possible sequences than feasible sequences in multicomponent separation, (ii) GA can generate feasible sequences by testing for the feasibility of a sequence using knowledge-based constraints, and (iii) an issue not addressed here is global optimality; a globally optimal sequence is also a feasible sequence in itself, so by evaluating the feasible sequences for lowest cost the globally optimal sequence can be determined.

APPENDIX D. CBR KNOWLEDGE BASE

Following is the definition of the hierarchy
Top is called Process.
The children of process are: 1) Ammonia Synthesis 2) Coal gasification
The children of ammonia synthesis are: 1) Claude Process 2) Haber-Bosch process 3) Mont-Cenis process
All these processes are represented as a classification hierarchy
Constraints for selection of these processes are also represented
Establish-Refine is the procedure for selection

(make-class 'process-cases '((type case) (children ammonia-synthesis-cb coal-gasification-cb) (parent top) (selection-constraints nil) (establish-rules nil) (reject-rules nil) (established? nil) (comment "This is the top node of the hierarchy. All top nodes will have the symbol 'top' as their parent")))

(make-class 'ammonia-synthesis-cb '((type case) (children claude-process mont-cenis-process haber-bosch-process) (parent process-cases) (selection-constraints ammonia-con) (establish-rules ((satisfied))) (reject-rules nil) (established? nil) (comment "This is the case base for the ammonia synthesis processes")))

(make-class 'coal-gasification-cb '((type case) (children nil) (parent process-cases) (selection-constraints coal-con) (establish-rules ((satisfied))) (reject-rules nil) (established? nil) (process-description coal-gasification-flowsheet-1) (comment "This is the case base for the coal gasification")))

(make-class 'claude-process '((type case) (parent ammonia-synthesis-cb) (children nil) (to-make output from input) (selection-constraints cp-pressure-con cp-temperature-con cp-conversion-con) (establish-rules ((satisfied satisfied satisfied))) (reject-rules nil) (established? nil) (process-description claude-process-flowsheet)))

(make-class 'haber-bosch-process '((type case) (parent ammonia-synthesis-cb) (to-make output from input) (children nil) (selection-constraints hb-pressure-con hb-temperature-con hb-conversion-con) (establish-rules ((satisfied satisfied satisfied))) (reject-rules nil) (established? nil) (process-description haber-bosch-process-flowsheet)))

(make-class 'mont-cenis-process '((type case) (inputs mixture hydrogen nitrogen) (outputs ammonia) (to-make output from input) (selection-constraints mc-pressure-con mc-temperature-con mc-conversion-con)
;;establish rules basically say that all the constraints should be
;;satisfied. Special cases can be handled later. Similarly, reject
;;rules specify when to reject an object based on constraint
;;satisfaction. Note that by default not established means the object
;;is rejected. However, sometimes an explicit reject test may be
;;required.
(establish-rules ((satisfied satisfied satisfied))) (reject-rules nil) (established? nil) (parent ammonia-synthesis-cb) (children nil) (selection-constraints nil) (process-description mont-cenis-process-flowsheet)))

;;;End of hierarchy definition

;;;This section contains the selection constraints

(make-class 'ammonia-con '((type constraint) (variable ammonia-process "Is the process required to make ammonia product?") (test satisfy-yes) (comment "This is the constraint on the product specification of the ammonia process")))

(make-class 'coal-con '((type constraint) (variable coal-gas-process "Is the process required to gasify coal?") (test satisfy-yes) (comment "This is the constraint on the product specification of coal gasification")))

;;;Claude process selection constraints

(make-class 'cp-pressure-con '((type constraint) (variable convertor-pressure kPa) (test range 91000 101000 kPa) (comment "This is the constraint on the operating pressure of the Claude process")))

(make-class 'cp-temperature-con '((type constraint) (variable convertor-temperature degreeK) (test range 773 923 degreeK) (comment "This is the constraint on the operating temperature of the claude process")))

(make-class 'cp-conversion-con '((type constraint) (variable convertor-conversion number) (test range 0.6 0.7) ;;variables specify what variable is tested
(comment "This is the constraint on the single pass conversion of the claude process")))

;;;Mont-Cenis process selection constraints

(make-class 'mc-pressure-con '((type constraint) (variable convertor-pressure kPa) (test range 9900 10100 kPa) (comment "This is the constraint on the operating pressure of the mont-cenis process")))

(make-class 'mc-temperature-con '((type constraint) (variable convertor-temperature degreeK) (test range 673 698 degreeK) (comment "This is the constraint on the operating temperature of the mont-cenis process")))

(make-class 'mc-conversion-con '((type constraint) (variable convertor-conversion number) (test range 0.09 0.2) (comment "This is the constraint on the single pass conversion of the mont-cenis process")))

;;; Haber-Bosch Process selection constraints

(make-class 'hb-pressure-con '((type constraint) (variable convertor-pressure kPa) (test range 20000 35000 kPa) (comment "This is the constraint on the operating pressure of the haber-bosch process")))

(make-class 'hb-temperature-con '((type constraint) (variable convertor-temperature degreeK) (test range 820 830 degreeK) (comment "This is the constraint on the operating temperature of the haber-bosch process")))

(make-class 'hb-conversion-con '((type constraint) (variable convertor-conversion number) (test range 0.15 0.2 number) (comment "This is the constraint on the single pass conversion of the haber-bosch process")))
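How Establish-Refine could use these constraints to select a case can be sketched as follows. This is a simplified illustration and not the thesis code: the dictionaries mirror the hierarchy and the range values given above, range tests are assumed to be inclusive, the Mont-Cenis case is assumed to use its three mc-* constraints, and the names `CHILDREN`, `CONSTRAINTS`, `satisfied` and `establish_refine` are hypothetical.

```python
CHILDREN = {
    "process-cases": ["ammonia-synthesis-cb", "coal-gasification-cb"],
    "ammonia-synthesis-cb": ["claude-process", "mont-cenis-process",
                             "haber-bosch-process"],
}

# variable -> test; ("yes",) is a yes/no test, ("range", low, high) a range test
CONSTRAINTS = {
    "ammonia-synthesis-cb": {"ammonia-process": ("yes",)},
    "coal-gasification-cb": {"coal-gas-process": ("yes",)},
    "claude-process": {"convertor-pressure": ("range", 91000, 101000),
                       "convertor-temperature": ("range", 773, 923),
                       "convertor-conversion": ("range", 0.6, 0.7)},
    "mont-cenis-process": {"convertor-pressure": ("range", 9900, 10100),
                           "convertor-temperature": ("range", 673, 698),
                           "convertor-conversion": ("range", 0.09, 0.2)},
    "haber-bosch-process": {"convertor-pressure": ("range", 20000, 35000),
                            "convertor-temperature": ("range", 820, 830),
                            "convertor-conversion": ("range", 0.15, 0.2)},
}


def satisfied(test, value):
    """Evaluate one constraint test against a user-supplied value."""
    if value is None:
        return False
    if test[0] == "yes":
        return value is True
    _, low, high = test
    return low <= value <= high  # inclusive bounds are an assumption


def establish_refine(node, answers):
    """Establish a node if all its constraints hold, then refine into children."""
    tests = CONSTRAINTS.get(node, {})
    if not all(satisfied(t, answers.get(var)) for var, t in tests.items()):
        return []  # not established means rejected
    refined = []
    for child in CHILDREN.get(node, []):
        refined += establish_refine(child, answers)
    return refined or [node]  # most specific established nodes


if __name__ == "__main__":
    answers = {"ammonia-process": True, "coal-gas-process": False,
               "convertor-pressure": 10000, "convertor-temperature": 680,
               "convertor-conversion": 0.1}
    print(establish_refine("process-cases", answers))
```

When no child of an established node can itself be established, the sketch returns that node as the most specific match; this fallback is a design choice of the sketch, not something stated in the knowledge base.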

;;;This section contains the definitions of the process flowsheets

;;;Claude Process

(make-class ’claude-process-flowsheet ’((type flowsheet) (structure cp-conv-1 cp-conv-2 cp-conv-3 cp-he-1 cp-he-2 cp-he-3 cp-cw-1 cp-cw-2 cp-cw-3 cp-ar-1 cp-ar-2 cp-ar-3) (input mixture nitrogen hydrogen) (output ammonia) (first cp>-he-l ) (comment “This is the processstructure of claude process. cp-he-I is the first structure for print-structure function’’)))

(make-class ’cp-he-1 ’((type heat-exchanger) (inputs input-1 input-2) (outputs output-1 output-2) (input-1 mixture hydrogen nitrogen) (input-1 -connection feed) (output-1 mixture hydrogen nitrogen) (output-1-connection cp-conv-1) (input-2 mixture nitrogen hydrogen ammonia) (input-2-connection cp-conv-1) (output-2 mixture nitrogen hydrogen ammonia) (output-2-connection cp-cw-1) (function heat-exchanger) (comment “This is a heat exchanger between the feedand the product from convertor 1’’)))

(make-class ’cp-he-2 ’((type heat-exchanger) (inputs input-1 input-2) (outputs output-1 output-2) (input-1 mixture nitrogen hydrogen ammonia) (input-1-connection 184 cp-ar-1) (output-1 mixture nitrogen hydrogen ammonia) (output-l-connection cp-conv-2) (input-2 mixture nitrogen hydrogen ammonia) (input-2-connection cp-conv-2) (output-2 mixture nitrogen hydrogen ammonia) (output-2-connection cp-cw-2) (function heat-exchanger) (comment “This is a heat exchanger between the feedfrom ammonia receiver 1 and product from convertor 2”)))

(make-class ’cp-he-3 ’((type heat-exchanger) (inputs input-1 input-2) (outputs output—1 output-2) (input-1 mixture nitrogen hydrogen ammonia) (input-l-connection cp-ar-2) (output-1 mixture nitrogen hydrogen ammonia) (output-l-connection cp-conv-3) (input-2 mixture nitrogen hydrogen ammonia) (input-2-connection cp-conv-3) (output-2 mixture nitrogen hydrogen ammonia) (output-2-connection cp-cw-3) (function heat-exchanger) (comment This is a heat exchanger between the feedfrom ammonia receiver 2 and product from convertor 3’’)))

(make-class ’cp-conv-1 ’((type convertor) (inputs input-1) (outputs output-1) (input-1 mixture hydrogen nitrogen) (input-1 -connection cp-he-1) (output-1 mixture nitrogen hydrogen ammonia) (output-l-connection cp-he-1) (function reaction reactor convertor) (comment “This is the ammonia convertor 1’’))) (make-class ’cp-conv-2 ’((type convertor) (inputs input-1) (outputs output-1) (input-1 mixture nitrogen hydrogen ammonia) (input-1 -connection cp-he-2) (output-1 mixture nitrogen hydrogen ammonia) (output-l-connection cp-he-2) (function reaction reactor convertor) (comment “This is the ammonia convertor 2’’)))

(make-class ’cp-conv-3 ’((type convertor) (inputs input-1) (outputs output-1 ) (input-1 mixture nitrogen hydrogen ammonia) (input-l-connection cp-he-3) (output-1 mixture nitrogen hydrogen ammonia) (output-l-connection cp-he-3) (function reaction reactor convertor) (comment “This is the ammonia convertor 3’’))) (make-class ’cp-cw-1 ’((type cooler) (inputs input-1) (outputs output-1) (input-1 mixture nitrogen hydrogen ammonia) (input-1 -connection cp-he-1) (output-1 mixture nitrogen hydrogen ammonia) (output-l-connection cp-ar-1) (function cooling heat-exchanger) (comment “This is the water cooler for the product from convertor 1’’)))

(make-class ’cp-cw-2 ’((type cooler) (inputs input-1) (outputs output-1) (input-1 mixture nitrogen hydrogen ammonia) (input-1 -connection cp-he-2) (output-1 mixture nitrogen hydrogen ammonia) (output-l-connection cp-ar-2) (function cooling heat-exchanger) (comment “This is the water cooler for the product fromconvertor 2’’)))

(make-class ’cp-cw-3 ’((type cooler) (inputs input-1) (outputs output-1) (input-1 mixture nitrogen hydrogen ammonia) (input-l-connection cp-he-3) (output-1 partial-ammonia-product) (output-l-connection cp-ar-3) (function cooling heat-exchanger) (comment “This is the water cooler for the product fromconvertor 3’’)))

(make-class 'cp-ar-1
  '((type cooler)
    (inputs input-1)
    (outputs output-1 output-2)
    (input-1 mixture nitrogen hydrogen ammonia)
    (input-1-connection cp-he-1)
    (output-1 ammonia)
    (output-1-connection product)
    (output-2 mixture nitrogen hydrogen ammonia)
    (output-2-connection cp-he-2)
    (function heat-exchanger)
    (comment "This is the ammonia receiver for convertor 1")))

(make-class 'cp-ar-2
  '((type cooler)
    (inputs input-1)
    (outputs output-1 output-2)
    (input-1 mixture ammonia nitrogen hydrogen)
    (input-1-connection cp-he-2)
    (output-1 ammonia)
    (output-1-connection product)
    (output-2 mixture ammonia nitrogen hydrogen)
    (output-2-connection cp-he-3)
    (function receiver storage)
    (comment "This is the ammonia receiver for convertor 2")))

(make-class 'cp-ar-3
  '((type cooler)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture nitrogen hydrogen ammonia)
    (input-1-connection cp-cw-3)
    (output-1 ammonia)
    (output-1-connection product)
    (function heat-exchanger)
    (comment "This is the ammonia receiver for convertor 3")))

;;;Mont-Cenis Process Flowsheet

(make-class 'mont-cenis-process-flowsheet
  '((type flowsheet)
    (structure mc-conv-1 mc-he-1 mc-he-2 mc-comp-1 mc-cw-1 mc-sep-1 mc-ref-1 mc-purge mc-mix-1 mc-split-1)
    (input mixture nitrogen hydrogen)
    (output ammonia)
    (first cp-he-1)
    (comment "This is the process structure of mont-cenis process")))

(make-class 'mc-conv-1
  '((type convertor)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture nitrogen hydrogen ammonia)
    (input-1-connection mc-he-1)
    (output-1 mixture nitrogen hydrogen ammonia)
    (output-1-connection mc-mix-1)
    (function chemical-reaction reaction)
    (comment "This is the ammonia convertor")))

(make-class 'mc-mix-1
  '((type mixer)
    (inputs input-1 input-2)
    (outputs output-1)
    (input-1 mixture nitrogen hydrogen)
    (input-1-connection feed)
    (input-2 mixture nitrogen hydrogen ammonia)
    (input-2-connection mc-conv-1)
    (output-1 mixture nitrogen hydrogen ammonia)
    (output-1-connection mc-cw-1)
    (function mixing-device mixer)
    (comment "This is a mixing device")))

(make-class 'mc-split-1
  '((type splitter)
    (inputs input-1)
    (outputs output-1 output-2)
    (input-1 mixture nitrogen hydrogen ammonia)
    (input-1-connection mc-sep-1)
    (output-1 mixture nitrogen hydrogen ammonia)
    (output-1-connection mc-he-2)
    (output-2 mixture nitrogen hydrogen ammonia)
    (output-2-connection purge)
    (function flow-splitter splitter)
    (comment "This is a flow splitter")))

(make-class 'mc-sep-1
  '((type separator)
    (inputs input-1)
    (outputs output-1 output-2)
    (input-1 mixture nitrogen hydrogen ammonia)
    (input-1-connection mc-ref-1)
    (output-1 mixture nitrogen hydrogen ammonia)
    (output-1-connection mc-split-1)
    (output-2 ammonia)
    (output-2-connection product-storage)
    (function ammonia-separation separation absorption)
    (comment "This separates ammonia from the ammonia, nitrogen and hydrogen mixture")))

(make-class 'mc-he-2
  '((type heat-exchanger)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture nitrogen hydrogen ammonia)
    (input-1-connection mc-split-1)
    (output-1 mixture nitrogen hydrogen ammonia)
    (output-1-connection mc-comp-1)
    (function heat-exchanger)
    (comment "This is a heat exchanger between the ammonia separator and the compressor")))

(make-class 'mc-he-1
  '((type heat-exchanger)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture nitrogen hydrogen ammonia)
    (input-1-connection mc-comp-1)
    (output-1 mixture nitrogen hydrogen ammonia)
    (output-1-connection mc-conv-1)
    (function heat-exchanger)
    (comment "This is a heat exchanger between the compressor and the ammonia convertor")))

(make-class 'mc-comp-1
  '((type compressor)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture nitrogen hydrogen ammonia)
    (input-1-connection mc-he-2)
    (output-1 mixture nitrogen hydrogen ammonia)
    (output-1-connection mc-he-1)
    (function compression compressor)
    (comment "This is a compressor in the feed recycle loop")))

;;;Haber-Bosch process flowsheet definition

(make-class 'haber-bosch-process-flowsheet
  '((type flowsheet)
    (structure hb-conv-1 hb-comp-1 hb-abs-1 hb-mix-1 hb-split-1)
    (input mixture nitrogen hydrogen)
    (output ammonia)
    (comment "This is the process structure of haber-bosch process")))

(make-class 'hb-conv-1
  '((type convertor)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture nitrogen hydrogen)
    (input-1-connection hb-mix-1)
    (output-1 mixture hydrogen nitrogen ammonia)
    (output-1-connection hb-comp-1)
    (function reaction convertor)
    (comment "This is the ammonia convertor for haber bosch process")))

(make-class 'hb-mix-1
  '((type mixer)
    (inputs input-1 input-2)
    (outputs output-1)
    (input-1 mixture nitrogen hydrogen)
    (input-1-connection feed)
    (input-2 mixture nitrogen hydrogen ammonia)
    (input-2-connection hb-split-1)
    (output-1 mixture nitrogen hydrogen ammonia)
    (output-1-connection hb-conv-1)
    (function mixing-device mixer)
    (comment "This is a mixing device")))

(make-class 'hb-split-1
  '((type splitter)
    (inputs input-1)
    (outputs output-1 output-2)
    (input-1 mixture nitrogen hydrogen ammonia)
    (input-1-connection hb-abs-1)
    (output-1 mixture nitrogen hydrogen ammonia)
    (output-1-connection hb-mix-1)
    (output-2 mixture nitrogen hydrogen ammonia)
    (output-2-connection purge)
    (function flow-splitter splitter)
    (comment "This is a flow splitter")))

(make-class 'hb-abs-1
  '((type absorber)
    (inputs input-1 input-2)
    (outputs output-1 output-2)
    (input-1 water)
    (input-1-connection water-feed)
    (input-2 mixture hydrogen nitrogen ammonia)
    (input-2-connection hb-comp-1)
    (output-1 mixture hydrogen nitrogen ammonia)
    (output-1-connection hb-split-1)
    (output-2 mixture ammonia water)
    (output-2-connection product)
    (function absorption separation)
    (comment "This is an ammonia absorber")))
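
The `output-N-connection` slots in the unit definitions implicitly encode each flowsheet as a directed graph. The following is a minimal sketch of how those slots could be read to find the units downstream of a given unit, assuming standard Common Lisp. The names `*units*`, `define-unit`, `slot-of`, and `downstream` are hypothetical and introduced only for illustration; the system's own `make-class` is not reproduced here, so a plain hash table stands in for it.

```lisp
;; Hypothetical sketch: MAKE-CLASS belongs to the original system, so the
;; slot lists are kept in a plain hash table keyed by unit name instead.
(defparameter *units* (make-hash-table :test #'eq))

(defun define-unit (name slots)
  "Store SLOTS (an association list) under NAME. Stand-in for MAKE-CLASS."
  (setf (gethash name *units*) slots))

(defun slot-of (name slot)
  "Return the list of values of SLOT for unit NAME, or NIL if absent."
  (rest (assoc slot (gethash name *units*))))

(defun downstream (name)
  "Collect the targets of every output-N-connection slot of unit NAME."
  (loop for out in (slot-of name 'outputs)
        ;; Build the slot name, e.g. OUTPUT-1 -> OUTPUT-1-CONNECTION.
        for key = (intern (concatenate 'string
                                       (symbol-name out)
                                       (symbol-name '-connection)))
        append (slot-of name key)))

;; Output slots copied from three of the Haber-Bosch units.
(define-unit 'hb-mix-1
  '((outputs output-1)
    (output-1-connection hb-conv-1)))
(define-unit 'hb-conv-1
  '((outputs output-1)
    (output-1-connection hb-comp-1)))
(define-unit 'hb-abs-1
  '((outputs output-1 output-2)
    (output-1-connection hb-split-1)
    (output-2-connection product)))

;; Units (or terminals) fed by the absorber.
(downstream 'hb-abs-1)
```

Under these assumptions, `(downstream 'hb-abs-1)` collects `hb-split-1` and the terminal `product`. Names such as `product`, `feed`, and `purge` have no unit definition of their own, so a traversal built on this sketch would treat them as flowsheet boundaries.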

;;;coal gasification process flowsheet-1

(make-class 'coal-gasification-flowsheet-1
  '((type flowsheet)
    (structure cg1-cp-1 cg1-cg-1 cg1-sr-1)
    (input coal)
    (output mixture carbon-monoxide carbon-dioxide hydrogen-sulfide hydrogen nitrogen)
    (to-make output from input)
    (comment "This is the coal gasification process to make carbon-monoxide, carbon-dioxide, nitrogen and hydrogen product")))

(make-class 'cg1-cp-1
  '((type coal-preparation)
    (inputs input-1)
    (outputs output-1)
    (input-1 coal)
    (input-1-connection coal-feed)
    (output-1 coal)
    (output-1-connection cg1-cg-1)
    (function coal-preparation preparation)
    (comment "This is the coal preparation process")))

(make-class 'cg1-cg-1
  '((type reaction coal-gasification)
    (inputs input-1 input-2 input-3)
    (outputs output-1)
    (input-1 coal)
    (input-1-connection cg1-cp-1)
    (input-2 steam)
    (input-2-connection steam-feed)
    (input-3 air)
    (input-3-connection air-feed)
    (output-1 mixture carbon-monoxide hydrogen-sulfide hydrogen nitrogen)
    (output-1-connection cg1-sr-1)
    (function coal-gasification gasification)
    (comment "This is the coal gasification process")))

(make-class 'cg1-sr-1
  '((type shift-reactor)
    (inputs input-1)
    (outputs output-1)
    (product output-1)
    (input-1 mixture carbon-monoxide hydrogen-sulfide hydrogen nitrogen)
    (input-1-connection cg1-cg-1)
    (output-1 mixture carbon-monoxide hydrogen-sulfide hydrogen nitrogen)
    (output-1-connection product)
    (function shift-reaction reaction)
    (comment "This is the shift reactor to convert carbon-monoxide into carbon-dioxide")))

;;Rectisol process definition

(make-class 'rectisol-process
  '((type generic rectisol)
    (to-separate hydrogen-sulfide)
    (inputs input-1)
    (outputs output-1 output-2)
    (input-1 mixture hydrogen-sulfide carbon-dioxide carbon-monoxide nitrogen hydrogen)
    (input-1-connection feed)
    (output-1 mixture carbon-dioxide carbon-monoxide nitrogen hydrogen)
    (output-1-connection product)
    (output-2 hydrogen-sulfide)
    (output-2-connection hydrogen-sulfide-storage)
    (function hydrogen-sulfide-removal absorption separation)
    (comment "Rectisol removes hydrogen sulfide from a mixture of gases")))

;;; Girbatol process definition

(make-class 'girbatol-process
  '((type generic girbatol)
    (to-separate hydrogen-sulfide carbon-dioxide)
    (inputs input-1)
    (outputs output-1 output-2)
    (input-1 mixture carbon-dioxide carbon-monoxide hydrogen-sulfide nitrogen hydrogen)
    (input-1-connection feed)
    (output-1 mixture nitrogen hydrogen carbon-monoxide)
    (output-1-connection product)
    (output-2 mixture hydrogen-sulfide carbon-dioxide)
    (output-2-connection hydrogen-sulfide-co2-storage)
    (function hydrogen-sulfide-and-carbon-dioxide-removal absorption separation)
    (comment "Girbatol removes hydrogen sulfide and carbon dioxide from a mixture of gases")))

;;;Methanation process definition

(make-class 'methanation-process
  '((type generic methanation)
    (inputs input-1)
    (outputs output-1)
    (to-make methane)
    (input-1 mixture carbon-monoxide carbon-dioxide nitrogen hydrogen)
    (input-1-connection feed)
    (output-1 mixture methane nitrogen hydrogen)
    (output-1-connection product)
    (function methanation reaction)
    (comment "Makes methane from carbon monoxide and carbon dioxide feed")))

(make-class 'methane-absorber
  '((type generic absorber)
    (inputs input-1)
    (outputs output-1 output-2)
    (to-separate methane)
    (input-1 mixture methane nitrogen hydrogen)
    (input-1-connection feed)
    (output-1 mixture nitrogen hydrogen)
    (output-1-connection product)
    (output-2 methane)
    (output-2-connection nil)
    (function absorption separation)
    (comment "Separates methane from nitrogen, hydrogen and methane feed")))

;;penicillin flowsheet

(make-class 'penicillin-cb
  '((type case)
    (children penicillin-process-1 penicillin-process-2)
    (parent process-cases)
    (selection-constraints nil)
    (establish-rules ((satisfied)))
    (reject-rules nil)
    (established? nil)
    (comment "This is the case base for the penicillin processes")))

(make-class 'penicillin-process-1
  '((type case)
    (parent penicillin-cb)
    (to-make output from input)
    (children nil)
    (selection-constraints nil)
    (establish-rules nil)
    (reject-rules nil)
    (established? nil)
    (process-description penicillin-process-1-flowsheet)))

;;;This section contains the definitions of the process flowsheets

;;;Penicillin-process-1-flowsheet

(make-class 'penicillin-process-1-flowsheet
  '((type flowsheet)
    (structure pp1-ferm-1 pp1-filter-1 pp1-extract-1 pp1-strip-1 pp1-extract-2 pp1-strip-2 pp1-crystal-1 pp1-dry-1)
    (input spores)
    (output penicillin)
    (first pp1-ferm-1)
    (comment "This is the process structure for penicillin")))

(make-class 'pp1-ferm-1
  '((type fermentor)
    (inputs input-1)
    (outputs output-1)
    (input-1 spores)
    (input-1-connection feed)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp1-filter-1)
    (function fermentation)
    (comment "This is the fermentor to ferment the feed containing spores to make penicillin")))

(make-class 'pp1-filter-1
  '((type filter)
    (inputs input-1)
    (outputs output-1 output-2)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp1-ferm-1)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp1-extract-1)
    (output-2 biomass)
    (output-2-connection waste)
    (function filtration)
    (comment "This is a filter that removes the solid biomass in the output from fermentor")))

(make-class 'pp1-extract-1
  '((type extraction-column)
    (inputs input-1 input-2)
    (outputs output-1 output-2)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp1-filter-1)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp1-strip-1)
    (input-2 solvent)
    (input-2-connection feed)
    (output-2 spent-beer)
    (output-2-connection recovery)
    (function extraction)
    (comment "This is an extraction column to extract penicillin from the output coming from filtration")))

(make-class 'pp1-strip-1
  '((type stripping-column)
    (inputs input-1 input-2)
    (outputs output-1 output-2)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp1-extract-1)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp1-extract-2)
    (input-2 base)
    (input-2-connection feed)
    (output-2 waste-solvent)
    (output-2-connection recovery)
    (function stripping)
    (comment "This is a stripping column to strip penicillin from the output coming from extraction")))

(make-class 'pp1-extract-2
  '((type extraction-column)
    (inputs input-1 input-2)
    (outputs output-1 output-2)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp1-strip-1)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp1-strip-2)
    (input-2 solvent)
    (input-2-connection feed)
    (output-2 waste-water)
    (output-2-connection recovery)
    (function extraction)
    (comment "This is the second extraction column to extract penicillin from the output of stripping")))

(make-class 'pp1-strip-2
  '((type stripping-column)
    (inputs input-1 input-2)
    (outputs output-1 output-2)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp1-extract-2)
    (output-1 mixture penicillin solvent)
    (output-1-connection pp1-crystal-1)
    (input-2 base)
    (input-2-connection feed)
    (output-2 waste-solvent)
    (output-2-connection recovery)
    (function stripping)
    (comment "This is a stripping column to strip penicillin from the output coming from extraction")))

(make-class 'pp1-crystal-1
  '((type crystallization-tank)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture penicillin solvent)
    (input-1-connection pp1-strip-2)
    (output-1 mixture penicillin solvent)
    (output-1-connection pp1-dry-1)
    (function crystallization)
    (comment "This is the crystallization tank to make penicillin crystals from the output of stripping")))

(make-class 'pp1-dry-1
  '((type dryer)
    (inputs input-1)
    (outputs output-1 output-2)
    (input-1 mixture penicillin solvent)
    (input-1-connection pp1-crystal-1)
    (output-1 penicillin)
    (output-1-connection product)
    (output-2 waste-water)
    (output-2-connection recovery)
    (function drying)
    (comment "This is a dryer to make penicillin crystals from the output of crystallization")))

;;penicillin process 2

(make-class 'penicillin-process-2
  '((type case)
    (parent penicillin-cb)
    (to-make output from input)
    (children nil)
    (selection-constraints nil)
    (establish-rules nil)
    (reject-rules nil)
    (established? nil)
    (process-description penicillin-process-2-flowsheet)))

;;;This section contains the definitions of the process flowsheets

;;;Penicillin-process-2-flowsheet

(make-class 'penicillin-process-2-flowsheet
  '((type flowsheet)
    (structure pp2-ferm-1 pp2-filter-1 pp2-cool-1 pp2-tank-1 pp2-filter-1 pp2-extract-1 pp2-carbon-treat-1 pp2-filter-2 pp2-crystal-1 pp2-filter-3 pp2-wash-1)
    (input spores)
    (output penicillin)
    (first pp1-ferm-1)
    (comment "This is the process structure for penicillin")))

(make-class 'pp2-ferm-1
  '((type fermentor)
    (inputs input-1 input-2)
    (outputs output-1)
    (input-1 spores)
    (input-1-connection feed)
    (input-2 mixture acid base nutrients)
    (input-2-connection feed)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp2-filter-1)
    (function fermentation)
    (comment "This is the fermentor to ferment the feed containing spores to make penicillin")))

(make-class 'pp2-filter-1
  '((type filter)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp2-ferm-1)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp2-cool-1)
    (function filtration)
    (comment "This is the filter after fermentor")))

(make-class 'pp2-cool-1
  '((type cooler)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp2-filter-1)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp2-tank-1)
    (function cooling)
    (comment "This is the cooler before holding tank")))

(make-class 'pp2-tank-1
  '((type tank)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp2-cool-1)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp2-filter-2)
    (function holding-tank)
    (comment "This is the holding tank between upstream and downstream")))

(make-class 'pp2-filter-2
  '((type filter)
    (inputs input-1 input-2)
    (outputs output-1 output-2)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp2-tank-1)
    (input-2 water)
    (input-2-connection feed)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp2-extract-1)
    (output-2 mycelium)
    (output-2-connection recycle)
    (function filtration)
    (comment "This is the primary filtration in the purification section")))

(make-class 'pp2-extract-1
  '((type extraction)
    (inputs input-1 input-2 input-3)
    (outputs output-1)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp2-filter-2)
    (input-2 acid)
    (input-2-connection feed)
    (input-3 solvent)
    (input-3-connection feed)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp2-carbon-treat-1)
    (function filtration)
    (comment "This is the extraction after filtration")))

(make-class 'pp2-carbon-treat-1
  '((type carbon-treatment)
    (inputs input-1)
    (outputs output-1)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp2-extract-1)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp2-filter-3)
    (function carbon-treatment)
    (comment "This is the carbon treatment after extraction to remove solvent")))

(make-class 'pp2-filter-3
  '((type filter)
    (inputs input-1 input-2)
    (outputs output-1)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp2-carbon-treat-1)
    (input-2 solvent)
    (input-2-connection feed)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp2-crystal-1)
    (function filtration)
    (comment "This is the carbon filtration after carbon treatment")))

(make-class 'pp2-crystal-1
  '((type crystallization)
    (inputs input-1 input-2)
    (outputs output-1)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp2-filter-3)
    (input-2 solvent)
    (input-2-connection feed)
    (output-1 mixture penicillin biomass)
    (output-1-connection pp2-filter-4)
    (function crystallization)
    (comment "This is the crystallization process")))

(make-class 'pp2-filter-4
  '((type filter)
    (inputs input-1 input-2)
    (outputs output-1)
    (input-1 mixture penicillin biomass)
    (input-1-connection pp2-crystal-1)
    (input-2 solvent)
    (input-2-connection feed)
    (output-1 mixture penicillin solvent)
    (output-1-connection pp2-wash-1)
    (function filtration)
    (comment "This is the filtration after crystallization")))