

Mathematics in Chemical Engineering

Bruce A. Finlayson, Department of Chemical Engineering, University of Washington, Seattle, Washington, United States (Chap. 1, 2, 3, 4, 5, 6, 7, 8, 9, 11 and 12) Lorenz T. Biegler, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States (Chap. 10) Ignacio E. Grossmann, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States (Chap. 10)

1. Solution of Equations
   1.1. Matrix Properties
   1.2. Linear Algebraic Equations
   1.3. Nonlinear Algebraic Equations
   1.4. Linear Difference Equations
   1.5. Eigenvalues
2. Approximation and Integration
   2.1. Introduction
   2.2. Global Polynomial Approximation
   2.3. Piecewise Approximation
   2.4. Quadrature
   2.5. Least Squares
   2.6. Fourier Transforms of Discrete Data
   2.7. Two-Dimensional Interpolation and Quadrature
3. Complex Variables
   3.1. Introduction to the Complex Plane
   3.2. Elementary Functions
   3.3. Analytic Functions of a Complex Variable
   3.4. Integration in the Complex Plane
   3.5. Other Results
4. Integral Transforms
   4.1. Fourier Transforms
   4.2. Laplace Transforms
   4.3. Solution of Partial Differential Equations by Using Transforms
5. Vector Analysis
6. Ordinary Differential Equations as Initial Value Problems
   6.1. Solution by Quadrature
   6.2. Explicit Methods
   6.3. Implicit Methods
   6.4. Stiffness
   6.5. Differential – Algebraic Systems
   6.6. Computer Software
   6.7. Stability, Bifurcations, Limit Cycles
   6.8. Sensitivity Analysis
   6.9. Molecular Dynamics
7. Ordinary Differential Equations as Boundary Value Problems
   7.1. Solution by Quadrature
   7.2. Initial Value Methods
   7.3. Finite Difference Method
   7.4. Orthogonal Collocation
   7.5. Orthogonal Collocation on Finite Elements
   7.6. Galerkin Finite Element Method
   7.7. Cubic B-Splines
   7.8. Adaptive Mesh Strategies
   7.9. Comparison
   7.10. Singular Problems and Infinite Domains
8. Partial Differential Equations
   8.1. Classification of Equations
   8.2. Hyperbolic Equations
   8.3. Parabolic Equations in One Dimension
   8.4. Elliptic Equations
   8.5. Parabolic Equations in Two or Three Dimensions
   8.6. Special Methods for Fluid Mechanics
   8.7. Computer Software
9. Integral Equations
   9.1. Classification
   9.2. Numerical Methods for Volterra Equations of the Second Kind
   9.3. Numerical Methods for Fredholm, Urysohn, and Hammerstein Equations of the Second Kind
   9.4. Numerical Methods for Eigenvalue Problems
   9.5. Green's Functions
   9.6. Boundary Integral Equations and Boundary Element Method
10. Optimization
   10.1. Introduction
   10.2. Gradient-Based Nonlinear Programming
   10.3. Optimization Methods without Derivatives
   10.4. Global Optimization
   10.5. Mixed Integer Programming
   10.6. Dynamic Optimization
   10.7. Development of Optimization Models
11. Probability and Statistics
   11.1. Concepts
   11.2. Sampling and Statistical Decisions
   11.3. Error Analysis in Experiments
   11.4. Factorial Design of Experiments and Analysis of Variance
12. Multivariable Calculus Applied to Thermodynamics
   12.1. State Functions
   12.2. Applications to Thermodynamics
   12.3. Partial Derivatives of All Thermodynamic Functions
13. References

Ullmann's Modeling and Simulation. © 2007 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. ISBN: 978-3-527-31605-2

Symbols and Variables

a       1, 2, or 3 for planar, cylindrical, spherical geometry
ai      acceleration of i-th particle in molecular dynamics
A       cross-sectional area of reactor; Helmholtz free energy in thermodynamics
Bi      Biot number
Bim     Biot number for mass
c       concentration
Co      Courant number
Cp      heat capacity at constant pressure
Cs      heat capacity of solid
Cv      heat capacity at constant volume
D       diffusion coefficient
Da      Damköhler number
De      effective diffusivity in porous catalyst
E       efficiency of tray in distillation column
F       molar flow rate into a chemical reactor
G       Gibbs free energy in thermodynamics
hp      heat transfer coefficient
H       enthalpy in thermodynamics
J       mass flux
k       thermal conductivity; reaction rate constant
ke      effective thermal conductivity of porous catalyst
kg      mass transfer coefficient
K       chemical equilibrium constant
L       thickness of slab; liquid flow rate in distillation column; length of pipe for flow in pipe
mi      mass of i-th particle in molecular dynamics
M       holdup on tray of distillation column
n       power-law exponent in viscosity formula for polymers
p       pressure
Pe      Peclet number
q       flux
Q       volumetric flow rate; rate of heat generation for heat transfer problems
r       radial position in cylinder
ri      position of i-th particle in molecular dynamics
R       radius of reactor or catalyst pellet
Re      Reynolds number
S       entropy in thermodynamics
Sh      Sherwood number
t       time
T       temperature
u       velocity
U       internal energy in thermodynamics
vi      velocity of i-th particle in molecular dynamics
V       volume of chemical reactor; vapor flow rate in distillation column; potential energy in molecular dynamics; specific volume in thermodynamics
x       position
z       position from inlet of reactor

Greek symbols

α       thermal diffusivity
δ       Kronecker delta
∆       sampling rate
ε       porosity of catalyst pellet
η       viscosity in fluid flow; effectiveness factor for reaction in a catalyst pellet
η0      zero-shear rate viscosity
φ       Thiele modulus
ϕ       void fraction of packed bed
λ       time constant in polymer flow
µ       viscosity
ρ       density
ρs      density of solid
τ       shear stress

Special symbols

|       subject to
:       mapping. For example, h: Rn → Rm states that the functions h map n real numbers into m real numbers; there are m functions h written in terms of n variables
∈       member of
→       maps into

1. Solution of Equations

Mathematical models of chemical engineering systems can take many forms: they can be sets of algebraic equations, differential equations, and/or integral equations. Mass and energy balances of chemical processes typically lead to large sets of algebraic equations:

a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2

Mass balances of stirred tank reactors may lead to ordinary differential equations:

dy/dt = f [y(t)]

Radiative heat transfer may lead to integral equations:

y(x) = g(x) + λ ∫ from 0 to 1 of K(x, s) f(s) ds

Even when the model is a differential equation or integral equation, the most basic step in the algorithm is the solution of sets of algebraic equations. The solution of sets of algebraic equations is the focus of Chapter 1.

A single linear equation is easy to solve for either x or y:

y = ax + b

If the equation is nonlinear,

f(x) = 0

it may be more difficult to find the x satisfying this equation. These problems are compounded when there are more unknowns, leading to simultaneous equations. If the unknowns appear in a linear fashion, then an important consideration is the structure of the matrix representing the equations; special methods are presented here for special structures. They are useful because they increase the speed of solution. If the unknowns appear in a nonlinear fashion, the problem is much more difficult. Iterative techniques must be used (i.e., make a guess of the solution and try to improve the guess). An important question then is whether such an iterative scheme converges. Other important types of equations are linear difference equations and eigenvalue problems, which are also discussed.

1.1. Matrix Properties

A matrix is a set of real or complex numbers arranged in a rectangular array:

A = [ a11 a12 ... a1n ]
    [ a21 a22 ... a2n ]
    [ ...             ]
    [ am1 am2 ... amn ]

The numbers aij are the elements of the matrix A, or (A)ij = aij. The transpose of A is (A^T)ij = aji.

The determinant of a square matrix A is

|A| = | a11 a12 ... a1n |
      | a21 a22 ... a2n |
      | ...             |
      | an1 an2 ... ann |

If the i-th row and j-th column are deleted, a new determinant is formed having n − 1 rows and columns. This determinant is called the minor of aij, denoted Mij. The cofactor Aij of the element aij is the signed minor of aij, determined by

Aij = (−1)^(i+j) Mij

The value of |A| is given by

|A| = Σ_{j=1}^{n} aij Aij   or   |A| = Σ_{i=1}^{n} aij Aij

where the elements aij must be taken from a single row or column of A.

If all determinants formed by striking out whole rows or whole columns of order greater than r are zero, but there is at least one determinant of order r which is not zero, the matrix has rank r.

The value of a determinant is not changed if the rows and columns are interchanged. If the elements of one row (or column) of a determinant are all zero, the value of the determinant is zero. If the elements of one row or column are multiplied by the same constant, the determinant is the previous value times that constant. If two adjacent rows (or columns) are interchanged, the value of the new determinant is the negative of the value of the original determinant. If two rows (or columns) are identical, the determinant is zero. The value of a determinant is not changed if one row (or column) is multiplied by a constant and added to another row (or column).

A matrix is symmetric if

aij = aji

and it is positive definite if

x^T A x = Σ_{i=1}^{n} Σ_{j=1}^{n} aij xi xj ≥ 0

for all x, where the equality holds only if x = 0.

If the elements of A are complex numbers, A* denotes the complex conjugate, in which (A*)ij = a*ij. If A equals the transpose of its complex conjugate, the matrix is Hermitian.

The inverse of a matrix can also be used to solve sets of linear equations. The inverse is a matrix such that when A is multiplied by its inverse the result is the identity matrix, a matrix with 1.0 along the main diagonal and zero elsewhere:

A A^(−1) = I

If A^T = A^(−1) the matrix is orthogonal.

Matrices are added and subtracted element by element:

(A + B)ij = aij + bij

Two matrices A and B are equal if aij = bij. Special relations are

(AB)^(−1) = B^(−1) A^(−1),   (AB)^T = B^T A^T,
(A^(−1))^T = (A^T)^(−1),     (ABC)^(−1) = C^(−1) B^(−1) A^(−1)

A diagonal matrix is zero except for elements along the diagonal:

aij = aii for i = j,   aij = 0 for i ≠ j

A tridiagonal matrix is zero except for elements along the diagonal and one element to the right and left of the diagonal:

aij = 0 if j < i − 1 or j > i + 1

Block diagonal and pentadiagonal matrices also arise, especially when solving partial differential equations in two and three dimensions.

QR Factorization of a Matrix. If A is an m × n matrix with m ≥ n, there exists an m × m unitary matrix Q = [q1, q2, ..., qm] and an m × n right-triangular matrix R such that A = QR. The QR factorization is frequently used in the actual computations when the other transformations are unstable.

Singular Value Decomposition. If A is an m × n matrix with m ≥ n and rank k ≤ n, consider the two following matrices:

A A*   and   A* A

An m × m unitary matrix U is formed from the eigenvectors ui of the first matrix:

U = [u1, u2, ..., um]

An n × n unitary matrix V is formed from the eigenvectors vi of the second matrix:

V = [v1, v2, ..., vn]

Then the matrix A can be decomposed into

A = U Σ V*

where Σ is a k × k diagonal matrix with diagonal elements dii = σi > 0 for 1 ≤ i ≤ k. The eigenvalues of Σ*Σ are σi². The vectors ui for k + 1 ≤ i ≤ m and vi for k + 1 ≤ i ≤ n are eigenvectors associated with the eigenvalue zero; the eigenvalues for 1 ≤ i ≤ k are σi². The values of σi are called the singular values of the matrix A. If A is real, then U and V are real, and hence orthogonal matrices. The value of the singular value decomposition comes when a process is represented by a linear transformation and the elements of A, aij, are the contribution to output i from input variable j. The input may be the size of a disturbance, and the output is the gain [1]. If the rank is less than n, not all the variables are independent and they cannot all be controlled. Furthermore, if the singular values are widely separated, the process is sensitive to small changes in the elements of the matrix and the process will be difficult to control.

1.2. Linear Algebraic Equations
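As a concrete check of these definitions, the singular values, rank, and condition number can be computed with NumPy; the 3 × 2 "gain" matrix below is an arbitrary illustration, not an example from the text:

```python
import numpy as np

# A hypothetical process gain matrix: row i = output i, column j = input j
A = np.array([[2.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# Full SVD: A = U @ diag(sigma) @ Vt, singular values sorted descending
U, sigma, Vt = np.linalg.svd(A)

# Rank = number of nonzero singular values;
# condition number = ratio of largest to smallest singular value
rank = int(np.sum(sigma > 1e-12))
kappa = sigma[0] / sigma[-1]

print(sigma)   # [2. 1.]
print(rank)    # 2
print(kappa)   # 2.0
```

Widely separated singular values (large kappa) signal the sensitivity to small changes in A discussed above.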

Consider the n×n linear system

a11 x1 + a12 x2 + ... + a1n xn = f1
a21 x1 + a22 x2 + ... + a2n xn = f2
...
an1 x1 + an2 x2 + ... + ann xn = fn

In this equation a11, ..., ann are known parameters, f1, ..., fn are known, and the unknowns are x1, ..., xn. The values of all unknowns that satisfy every equation must be found. This set of equations can be represented as follows:

Σ_{j=1}^{n} aij xj = fi   or   A x = f

The most efficient method for solving a set of linear algebraic equations is to perform a lower – upper (LU) decomposition of the corresponding matrix A. This decomposition is essentially a Gaussian elimination, arranged for maximum efficiency [2, 3]. The LU decomposition writes the matrix as

A = L U

The U is upper triangular; it has zero elements below the main diagonal and possibly nonzero values along the main diagonal and above it (see Fig. 1). The L is lower triangular. It has the value 1 in each element of the main diagonal, nonzero values below the diagonal, and zero values above the diagonal (see Fig. 1).

Figure 1. Structure of L and U matrices

The original problem can be solved in two steps:

L y = f,   U x = y   solves   A x = L U x = f

Each of these steps is straightforward because the matrices are upper triangular or lower triangular. When f is changed, the last steps can be done without recomputing the LU decomposition. Thus, multiple right-hand sides can be computed efficiently. The number of multiplications and divisions necessary to solve for m right-hand sides is:

Operation count = (1/3) n³ − (1/3) n + m n²

The determinant is given by the product of the diagonal elements of U. This should be calculated as the LU decomposition is performed. If the value of the determinant is a very large or very small number, it can be divided or multiplied by 10 to retain accuracy in the computer; the scale factor is then accumulated separately.

The condition number κ can be defined in terms of the singular value decomposition as the ratio of the largest to the smallest dii (see above). It can also be expressed in terms of the norm of the matrix:

κ(A) = ||A|| ||A^(−1)||

where the norm is defined as

||A|| ≡ sup_{x≠0} ||A x|| / ||x|| = max_k Σ_{j=1}^{n} |ajk|

If this number is infinite, the set of equations is singular. If the number is too large, the matrix is said to be ill-conditioned. Calculation of the condition number can be lengthy, so another criterion is also useful: compute the ratio of the largest to the smallest pivot and make judgments on the ill-conditioning based on that.
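The LU factorization and the two-step solve just described can be sketched in pure Python. This is a Doolittle-style factorization without pivoting (it assumes no zero pivot arises), so it is an illustration rather than production code:

```python
def lu_decompose(A):
    """Doolittle LU: L has unit diagonal, U is upper triangular, A = L U.

    Assumes no pivoting is needed (no zero pivots are encountered).
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]          # elimination multiplier, stored in L
            L[i][k] = m
            for j in range(k, n):
                U[i][j] -= m * U[k][j]
    return L, U

def lu_solve(L, U, f):
    """Solve L y = f (forward sweep), then U x = y (backward sweep)."""
    n = len(f)
    y = [0.0] * n
    for i in range(n):
        y[i] = f[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

A = [[4.0, 1.0], [2.0, 3.0]]
L, U = lu_decompose(A)
x = lu_solve(L, U, [9.0, 8.0])   # L and U can be reused for any right-hand side
print(x)                          # x ≈ [1.9, 1.4]
```

The determinant follows for free as the product of the diagonal of U (here 4.0 × 2.5 = 10), as noted in the text.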
When a matrix is ill-conditioned the LU decomposition must be performed by using pivoting (or the singular value decomposition described above). With pivoting, the order of the elimination is rearranged. At each stage, one looks for the largest element (in magnitude); the next stages of the elimination are on the row and column containing that largest element. The largest element can be obtained from only the diagonal entries (partial pivoting) or from all the remaining entries. If the matrix is nonsingular, Gaussian elimination (or LU decomposition) could fail if a zero value were to occur along the diagonal and were to be a pivot. With full pivoting, however, the Gaussian elimination (or LU decomposition) cannot fail because the matrix is nonsingular.

The Cholesky decomposition can be used for real, symmetric, positive definite matrices. This algorithm saves on storage (divide by about 2) and reduces the number of multiplications (divide by 2), but adds n square roots.

The linear equations can also be solved by

x = A^(−1) f

Generally, the inverse is not used in this way because it requires three times more operations than solving with an LU decomposition. However, if an inverse is desired, it is calculated most efficiently by using the LU decomposition and then solving

A x^(i) = b^(i),   b_j^(i) = 0 for j ≠ i,  b_j^(i) = 1 for j = i

Then set

A^(−1) = [ x^(1) | x^(2) | x^(3) | ... | x^(n) ]

Solutions of Special Matrices. Special matrices can be handled even more efficiently. A tridiagonal matrix is one with nonzero entries along the main diagonal, and one diagonal above and below the main one (see Fig. 2). The corresponding set of equations can then be written as

ai xi−1 + bi xi + ci xi+1 = di

Figure 2. Structure of tridiagonal matrices

The LU decomposition algorithm for solving this set is

b1 = b1
for k = 2, n do
    ak = ak / bk−1
    bk = bk − ak ck−1
enddo
d1 = d1
for k = 2, n do
    dk = dk − ak dk−1
enddo
xn = dn / bn
for k = n − 1, 1 do
    xk = (dk − ck xk+1) / bk
enddo

The number of multiplications and divisions for a problem with n unknowns and m right-hand sides is

Operation count = 2 (n − 1) + m (3 n − 2)

If

|bi| > |ai| + |ci|

no pivoting is necessary. For solving two-point boundary value problems and partial differential equations this is often the case.

Sparse matrices are ones in which the majority of elements are zero. If the zero entries occur in special patterns, efficient techniques can be used to exploit the structure, as was done above for tridiagonal matrices, block tridiagonal matrices, arrow matrices, etc. These structures typically arise from numerical analysis applied to solve differential equations. Other problems, such as modeling chemical processes, lead to sparse matrices but without such a neatly defined structure, just a lot of zeros in the matrix. For matrices such as these, special techniques must be employed: efficient codes are available [4]. These codes usually employ a symbolic factorization, which must be repeated only once for each structure of the matrix. Then an LU factorization is performed, followed by a solution step using the triangular matrices. The symbolic factorization step has significant overhead, but this is rendered small and insignificant if matrices with exactly the same structure are to be used over and over [5].

The efficiency of a technique for solving sets of linear equations obviously depends greatly on the arrangement of the equations and unknowns, because an efficient arrangement can reduce the bandwidth, for example. Techniques for renumbering the equations and unknowns arising from elliptic partial differential equations are available for finite difference methods [6] and for finite element methods [7].

Solutions with Iterative Methods. Sets of linear equations can also be solved by using iterative methods; these methods have a rich historical background. Some of them are discussed in Chapter 8 and include Jacobi, Gauss – Seidel, and overrelaxation methods. As the speed of computers increases, direct methods become preferable for the general case, but for large three-dimensional problems iterative methods are often used.

The conjugate gradient method is an iterative method that can solve a set of n linear equations in n iterations. The method primarily requires multiplying a matrix by a vector, which can be done very efficiently on parallel computers: for sparse matrices this is a viable method. The original method was devised by Hestenes and Stiefel [8]; however, more recent implementations use a preconditioned conjugate gradient method because it converges faster, provided a good "preconditioner" can be found. The system of n linear equations

A x = f

where A is symmetric and positive definite, is to be solved. A preconditioning matrix M is defined in such a way that the problem

M t = r

is easy to solve exactly (M might be diagonal, for example). Then the preconditioned conjugate gradient method is

Guess x0
Calculate r0 = f − A x0
Solve M t0 = r0, and set p0 = t0
for k = 0, 1, ..., n (or until convergence)
    ak = (rk^T tk) / (pk^T A pk)
    xk+1 = xk + ak pk
    rk+1 = rk − ak A pk
    Solve M tk+1 = rk+1
    bk = (rk+1^T tk+1) / (rk^T tk)
    pk+1 = tk+1 + bk pk
    test for convergence
enddo

Note that the entire algorithm involves only matrix multiplications. The generalized minimal residual method (GMRES) is an iterative method that can be used for nonsymmetric systems and is based on a modified Gram – Schmidt orthonormalization. Additional information, including software for a variety of methods, is available [9 – 13].

In dimensional analysis, if the dimensions of each physical variable Pj (there are n of them) are expressed in terms of fundamental measurement units mj (such as time, length, mass; there are m of them):

[Pj] = m1^α1j m2^α2j ··· mm^αmj

then a matrix can be formed from the αij. If the rank of this matrix is r, then n − r independent dimensionless groups govern that phenomenon. In chemical reaction engineering the chemical reaction stoichiometry can be written as

Σ_{i=1}^{n} αij Ci = 0,   j = 1, 2, ..., m

where there are n species and m reactions. Then if a matrix is formed from the coefficients αij, which is an n × m matrix, and the rank of the matrix is r, there are r independent chemical reactions. The other m − r reactions can be deduced from those r reactions.
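The tridiagonal (Thomas) elimination described above can be written out directly. This sketch assumes the no-pivoting condition |bi| > |ai| + |ci| holds; the array names follow the text, with a, b, c holding the sub-, main, and superdiagonal:

```python
def thomas(a, b, c, d):
    """Solve a[i] x[i-1] + b[i] x[i] + c[i] x[i+1] = d[i], i = 0..n-1.

    a[0] and c[n-1] are unused. Assumes diagonal dominance
    (|b_i| > |a_i| + |c_i|), so no pivoting is needed. Inputs b, d are copied.
    """
    n = len(d)
    b, d = list(b), list(d)
    # Forward elimination: remove the subdiagonal
    for k in range(1, n):
        m = a[k] / b[k - 1]
        b[k] -= m * c[k - 1]
        d[k] -= m * d[k - 1]
    # Back substitution
    x = [0.0] * n
    x[-1] = d[-1] / b[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (d[k] - c[k] * x[k + 1]) / b[k]
    return x

# Diffusion-style test system: -x[i-1] + 4 x[i] - x[i+1] = 2 for all i
n = 5
x = thomas([-1.0] * n, [4.0] * n, [-1.0] * n, [2.0] * n)
print(x)
```

The cost is linear in n, matching the 2(n − 1) + m(3n − 2) operation count given above, versus O(n³) for a dense LU decomposition.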

1.3. Nonlinear Algebraic Equations

Consider a single nonlinear equation in one unknown,

f(x) = 0

In Microsoft Excel, roots are found by using Goal Seek or Solver. Assign one cell to be x, put the equation for f(x) in another cell, and let Goal Seek or Solver find the value of x making the equation cell zero. In MATLAB, the process is similar except that a function (m-file) is defined and the command fzero('f',x0) provides the solution x, starting from the initial guess x0.

Iterative methods applied to single equations include the successive substitution method

x^(k+1) = x^k + β f(x^k) ≡ g(x^k)

and the Newton – Raphson method

x^(k+1) = x^k − f(x^k) / (df/dx)(x^k)

The former method converges if the derivative of g(x) is bounded [3]:

|dg/dx (x)| ≤ µ < 1   for   |x − α| < h

Convergence of the Newton – Raphson method depends on the properties of the first and second derivative of f(x) [3, 14]: the first correction must be bounded, |x^1 − x^0| = |f(x^0) / (df/dx)(x^0)| ≤ b, and the second derivative must be bounded, |d²f/dx²| ≤ c. In practice the method may not converge unless the initial guess is good, or it may converge for some parameters and not others. Unfortunately, when the method is nonconvergent the results look as though a mistake occurred in the computer programming; distinguishing between these situations is difficult, so careful programming and testing are required. If the method converges, the difference between successive iterates is something like 0.1, 0.01, 0.0001, 10^(−8). The error (when it is known) goes the same way; the method is said to be quadratically convergent when it converges. If the derivative is difficult to calculate, a numerical approximation may be used:

(df/dx)|_(x^k) = [f(x^k + ε) − f(x^k)] / ε

In the secant method the same formula is used as for the Newton – Raphson method, except that the derivative is approximated by using the values from the last two iterates:

(df/dx)|_(x^k) = [f(x^k) − f(x^(k−1))] / (x^k − x^(k−1))

This is equivalent to drawing a straight line through the last two iterate values on a plot of f(x) versus x. The Newton – Raphson method is equivalent to drawing a straight line tangent to the curve at the last x. In the method of false position (or regula falsi), the secant method is used to obtain x^(k+1), but the previous value is taken as either x^(k−1) or x^k. The choice is made so that the function evaluated for that choice has the opposite sign to f(x^(k+1)). This method is slower than the secant method, but it is more robust and keeps the root bracketed between two points at all times. In the method of bisection, if a root lies between x1 and x2 because f(x1) and f(x2) have opposite signs, the function is evaluated at the center, xc = 0.5 (x1 + x2); the root lies in whichever half-interval has function values of opposite sign at its ends, and the process is then repeated. If f(xc) = 0, the root is xc. If f(x1) and f(x2) have the same sign, more than one root may exist between x1 and x2 (or no roots).

For systems of equations the Newton – Raphson method is widely used, especially for equations arising from the solution of differential equations:

fi ({xj}) = 0,   1 ≤ i, j ≤ n,   where {xj} = (x1, x2, ..., xn) = x

Then, an expansion in several variables occurs:

fi (x^(k+1)) = fi (x^k) + Σ_{j=1}^{n} (∂fi/∂xj)|_(x^k) (xj^(k+1) − xj^k) + ···

The Jacobian matrix is defined as

Jij^k = (∂fi/∂xj)|_(x^k)

and the Newton – Raphson method is

Σ_{j=1}^{n} Jij^k (xj^(k+1) − xj^k) = −fi (x^k)

For convergence, the norm of the inverse of the Jacobian must be bounded, the norm of the function evaluated at the initial guess must be bounded, and the second derivative must be bounded [14, p. 115], [3, p. 12].

A review of the usefulness of solution methods for nonlinear equations is available [16]. This review concludes that the Newton – Raphson method may not be the most efficient. Broyden's method approximates the inverse to the Jacobian and is a good all-purpose method, but a good initial approximation to the Jacobian matrix is required. Furthermore, the rate of convergence deteriorates for large problems, for which the Newton – Raphson method is better. Brown's method [16] is very attractive, whereas Brent's is not worth the extra storage and computation. For large systems of equations, efficient software is available [11 – 13].

Homotopy methods can be used to ensure finding the solution when the problem is especially complicated. Suppose an attempt is made to solve f(x) = 0, and it fails; however, g(x) = 0 can be solved easily, where g(x) is some function, perhaps a simplification of f(x). Then, the two functions can be embedded in a homotopy by taking

h(x, t) = t f(x) + (1 − t) g(x)

In this equation, h can be an n-vector of functions for problems involving n variables; then x is a vector of length n. Then h(x, t) = 0 can be solved for t = 0, and t gradually changes until at t = 1, h(x, 1) = f(x). If the Jacobian of h with respect to x is nonsingular on the homotopy path (as t varies), the method is guaranteed to work. In classical methods, the interval from t = 0 to t = 1 is broken up into N subdivisions. Set ∆t = 1/N and solve for t = 0, which is easy by the choice of g(x). Then set t = ∆t and use the Newton – Raphson method to solve for x. Since the initial guess is presumably pretty good, this has a high chance of being successful. That solution is then used as the initial guess for t = 2 ∆t, and the process is repeated by moving stepwise to t = 1. If the Newton – Raphson method does not converge, then ∆t must be reduced and a new solution attempted.

Another way of using homotopy is to create an ordinary differential equation by differentiating the homotopy equation along the path (where h = 0):

dh[x(t), t]/dt = (∂h/∂x) (dx/dt) + ∂h/∂t = 0

This can be expressed as an ordinary differential equation for x(t):

(∂h/∂x) (dx/dt) = − ∂h/∂t

If Euler's method is used to solve this equation, a value x^0 is used, and dx/dt from the above equation is solved for. Then

x^(1,0) = x^0 + ∆t (dx/dt)

is used as the initial guess and the homotopy equation is solved for x^1:

(∂h/∂x) (x^(1,k+1) − x^(1,k)) = −h (x^(1,k), t)

Then t is increased by ∆t and the process is repeated.

In arc-length parameterization, both x and t are considered parameterized by a parameter s, which is thought of as the arc length along a curve. Then the homotopy equation is written along with the arc-length equation:

(∂h/∂x) (dx/ds) + (∂h/∂t) (dt/ds) = 0

(dx/ds)^T (dx/ds) + (dt/ds)² = 1

The initial conditions are

x(0) = x^0
t(0) = 0

The advantage of this approach is that it works even when the Jacobian of h becomes singular, because the full matrix is rarely singular. Illustrations applied to chemical engineering are available [17]. Software to perform these computations is available (called LOCA) [18].
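The multidimensional Newton – Raphson iteration, with the Jacobian built column by column from the finite-difference formula given above for the one-dimensional case, can be sketched as follows. The circle–line example system and the parameter values are arbitrary illustrations:

```python
import numpy as np

def newton_system(f, x0, tol=1e-10, max_iter=50, eps=1e-7):
    """Solve f(x) = 0 by Newton-Raphson with a finite-difference Jacobian.

    Each step solves J (x_new - x) = -f(x); as discussed in the text,
    convergence requires a sufficiently good initial guess.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) < tol:
            break
        n = x.size
        J = np.empty((n, n))
        for j in range(n):               # J[i, j] ~ d f_i / d x_j
            xp = x.copy()
            xp[j] += eps
            J[:, j] = (f(xp) - fx) / eps
        x = x + np.linalg.solve(J, -fx)  # Newton step
    return x

# Example: intersection of a circle and a line
def f(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 2.0,   # circle of radius sqrt(2)
                     x[0] - x[1]])                   # line x = y

print(newton_system(f, [2.0, 0.5]))   # converges to approximately [1, 1]
```

Replacing the finite-difference Jacobian with a rank-one update of its inverse gives Broyden's method mentioned above.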

1.4. Linear Difference Equations This gives √ −b± b2−4ac Difference equations arise in chemical engineer- ϕ1,2 = 2a ing from staged operations, such as distillation or extraction, as well as from differential equations and the solution to the difference equation is modeling adsorption and chemical reactors. The x = Aϕn+Bϕn value of a variable in the n-th stage is noted by a n 1 2 subscript n. For example, if yn,i denotes the mole where A and B are constants that must be speci- fraction of the i-th species in the vapor phase on fied by boundary conditions of some kind. the n-th stage of a distillation column, xn,i is the When the equation is nonhomogeneous, the corresponding liquid mole fraction, R the reflux solution is represented by the sum of a particular ratio (ratio of liquid returned to the column to solution and a general solution to the homoge- product removed from the condenser), and Kn,i neous equation. the equilibrium constant, then the mass balances about the top of the column give xn = xn,P +xn,H R 1 The general solution is the one found for the yn+1,i = xn,i+ x0,i R+1 R+1 homogeneous equation, and the particular so- and the equilibrium equation gives lution is any solution to the nonhomogeneous difference equation. This can be found by meth- yn,i = Kn,ixn,i ods analogous to those used to solve differential If these are combined, equations: the method of undetermined coeffi- cients and the method of variation of parame- R 1 Kn+1,ixn+1,i = xn,i+ x0,i ters. The last method applies to equations with R+1 R+1 variable coefficients, too. For a problem such as is obtained, which is a linear difference equation. xn+1−fnxn =0 This particular problem is quite complicated, x = c and the interested reader is referred to [19, Chap. 0 6]. However, the form of the difference equa- the general solution is tion is clear. Several examples are given here for n solving difference equations. More complete in- xn=c fi−1 formation is available in [20]. 
i=1 An equation in the form This can then be used in the method of variation xn+1−xn=fn+1 of parameters to solve the equation can be solved by xn+1−fnxn = gn n xn = fi i=1 1.5. Eigenvalues Usually, difference equations are solved analyt- ically only for linear problems. When the coeffi- The n×n matrix A has n eigenvalues λi ,i=1,. cients are constant and the equation is linear and ..,n, which satisfy homogeneous, a trial solution of the form n xn = ϕ det (A−λiI)=0 is attempted; ϕ is raised to the power n. For ex- If this equation is expanded, it can be represented ample, the difference equation as

The difference equation

c x_{n−1} + b x_n + a x_{n+1} = 0

coupled with the trial solution x_n = ϕ^n leads to the characteristic equation

a ϕ² + b ϕ + c = 0

The characteristic polynomial is

P_n(λ) = (−λ)^n + a_1 (−λ)^{n−1} + a_2 (−λ)^{n−2} + ··· + a_{n−1} (−λ) + a_n = 0

If the matrix A has real entries, then the a_i are real numbers, and the eigenvalues either are real numbers or occur in pairs as complex numbers with their complex conjugates (for the definition of complex numbers, see Chap. 3). The Hamilton – Cayley theorem [19, p. 127] states that the matrix A satisfies its own characteristic equation:

P_n(A) = (−A)^n + a_1 (−A)^{n−1} + a_2 (−A)^{n−2} + ··· + a_{n−1} (−A) + a_n I = 0

A laborious way to find the eigenvalues of a matrix is to solve the n-th order polynomial for the λ_i — far too time consuming. Instead the matrix is transformed into another form whose eigenvalues are easier to find. In the Givens method and the Householder method the matrix is transformed into tridiagonal form; then, in a fixed number of calculations the eigenvalues can be found [15]. The Givens method requires 4 n³/3 operations to transform a real symmetric matrix to tridiagonal form, whereas the Householder method requires half that number [14]. Once the tridiagonal form is found, a Sturm sequence is applied to determine the eigenvalues. These methods are especially useful when only a few eigenvalues of the matrix are desired. If all the eigenvalues are needed, the QR algorithm is preferred [21].

The eigenvalues of a certain tridiagonal matrix can be found analytically. If A is a tridiagonal matrix with

a_{ii} = p,  a_{i,i+1} = q,  a_{i+1,i} = r,  q r > 0

then the eigenvalues of A are [22]

λ_i = p + 2 (q r)^{1/2} cos [i π/(n+1)],  i = 1, 2, . . ., n

This result is useful when finite difference methods are applied to the diffusion equation.

2. Approximation and Integration

2.1. Introduction

Two types of problems arise frequently:

1) A function is known exactly at a set of points and an interpolating function is desired. The interpolant may be exact at the set of points, or it may be a "best fit" in some sense. Alternatively it may be desired to represent a function in some other way.

2) Experimental data must be fit with a mathematical model. The data have experimental error, so some uncertainty exists. The parameters in the model, as well as the uncertainty in the determination of those parameters, are desired.

These problems are addressed in this chapter. Section 2.2 gives the properties of polynomials defined over the whole domain and Section 2.3 those of polynomials defined on segments of the domain. In Section 2.4, quadrature methods are given for evaluating an integral. Least-squares methods for parameter estimation for both linear and nonlinear models are given in Section 2.5. Fourier transforms to represent discrete data are described in Section 2.6. The chapter closes with extensions to two-dimensional representations.

2.2. Global Polynomial Approximation

A global polynomial P_m(x) is defined over the entire region of space:

P_m(x) = Σ_{j=0}^{m} c_j x^j

This polynomial is of degree m (highest power is x^m) and order m + 1 (m + 1 parameters {c_j}). If a set of m + 1 points is given,

y_1 = f(x_1), y_2 = f(x_2), . . ., y_{m+1} = f(x_{m+1})

then Lagrange's formula yields a polynomial of degree m that goes through the m + 1 points:

P_m(x) = [(x−x_2)(x−x_3)···(x−x_{m+1})] / [(x_1−x_2)(x_1−x_3)···(x_1−x_{m+1})] y_1
  + [(x−x_1)(x−x_3)···(x−x_{m+1})] / [(x_2−x_1)(x_2−x_3)···(x_2−x_{m+1})] y_2
  + ··· + [(x−x_1)(x−x_2)···(x−x_m)] / [(x_{m+1}−x_1)(x_{m+1}−x_2)···(x_{m+1}−x_m)] y_{m+1}

Note that each coefficient of y_j is a polynomial of degree m that vanishes at the points {x_i} (except for one value of j) and takes the value 1.0 at that point, i.e.,

P_m(x_j) = y_j,  j = 1, 2, . . ., m+1

If the function f(x) is known, the error in the approximation is [23]

|error(x)| ≤ [|x_{m+1} − x_1|^{m+1} / (m+1)!] · max_{x_1 ≤ x ≤ x_{m+1}} |f^{(m+1)}(x)|
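As a concrete illustration (a sketch, not part of the article), Lagrange's formula can be evaluated directly; the function name `lagrange` and the sample points below are hypothetical:

```python
# Direct evaluation of Lagrange's interpolation formula: each term is
# y_j times a polynomial that equals 1 at x_j and 0 at the other points.

def lagrange(xs, ys, x):
    """Evaluate the degree-m interpolant through (xs[j], ys[j]) at x."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        term = yj
        for k, xk in enumerate(xs):
            if k != j:
                term *= (x - xk) / (xj - xk)
        total += term
    return total

# Three points reproduce a quadratic exactly:
xs = [0.0, 1.0, 2.0]
ys = [xi**2 for xi in xs]          # f(x) = x^2
print(lagrange(xs, ys, 1.5))       # 2.25
```

Because the interpolant of degree m is unique, the routine reproduces any polynomial of degree ≤ m exactly; for other functions the error bound above applies.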

The evaluation of P_m(x) at a point other than the defining points can be made with Neville's algorithm [15]. Let P_1 be the value at x of the unique function passing through the point (x_1, y_1); i.e., P_1 = y_1. Let P_{12} be the value at x of the unique polynomial passing through the points x_1 and x_2. Likewise, P_{ijk...r} is the unique polynomial passing through the points x_i, x_j, x_k, . . ., x_r. The following scheme is used:

x_1:  P_1
          P_12
x_2:  P_2      P_123
          P_23          P_1234
x_3:  P_3      P_234
          P_34
x_4:  P_4

These entries are defined by using

P_{i(i+1)···(i+m)} = [(x − x_{i+m}) P_{i(i+1)···(i+m−1)} + (x_i − x) P_{(i+1)(i+2)···(i+m)}] / (x_i − x_{i+m})

Consider P_{1234}: the terms on the right-hand side of the equation involve P_{123} and P_{234}. The "parents," P_{123} and P_{234}, already agree at points 2 and 3. Here i = 1, m = 3; thus, the parents agree at x_{i+1}, . . ., x_{i+m−1} already. The formula makes P_{i(i+1)...(i+m)} agree with the function at the additional points x_{i+m} and x_i. Thus, P_{i(i+1)...(i+m)} agrees with the function at all the points {x_i, x_{i+1}, . . ., x_{i+m}}.

Orthogonal Polynomials. Another form of the polynomials is obtained by defining them so that they are orthogonal. It is required that P_m(x) be orthogonal to P_k(x) for all k = 0, . . ., m − 1:

∫_a^b W(x) P_k(x) P_m(x) dx = 0,  k = 0, 1, 2, . . ., m−1

The orthogonality includes a nonnegative weight function, W(x) ≥ 0 for all a ≤ x ≤ b. This procedure specifies the set of polynomials to within multiplicative constants, which can be set either by requiring the leading coefficient to be one or by requiring the norm to be one:

∫_a^b W(x) P_m²(x) dx = 1

The polynomial P_m(x) has m roots in the closed interval a to b. The polynomial

p(x) = c_0 P_0(x) + c_1 P_1(x) + ··· + c_m P_m(x)

minimizes

I = ∫_a^b W(x) [f(x) − p(x)]² dx

when

c_j = [∫_a^b W(x) f(x) P_j(x) dx] / W_j,  W_j = ∫_a^b W(x) P_j²(x) dx

Note that each c_j is independent of m, the number of terms retained in the series. The minimum value of I is

I_min = ∫_a^b W(x) f²(x) dx − Σ_{j=0}^{m} W_j c_j²

Such functions are useful for continuous data, i.e., when f(x) is known for all x. Typical orthogonal polynomials are given in Table 1. Chebyshev polynomials are used in spectral methods (see Chap. 8). The last two rows of Table 1 are widely used in the orthogonal collocation method in chemical engineering.
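A minimal sketch (not from the article) of these least-squares coefficients, using the Legendre recursion from Table 1 below with W(x) = 1 on [−1, 1]; the helper names `legendre`, `trapz`, and `coeff` are hypothetical, and a simple trapezoid quadrature stands in for exact integration:

```python
# Expansion coefficients c_j = (integral of f*P_j) / W_j for Legendre
# polynomials, which satisfy (i+1) P_{i+1} = (2i+1) x P_i - i P_{i-1}.

def legendre(i, x):
    p0, p1 = 1.0, x
    if i == 0:
        return p0
    for k in range(1, i):
        p0, p1 = p1, ((2*k + 1)*x*p1 - k*p0)/(k + 1)
    return p1

def trapz(g, a, b, n=2000):
    h = (b - a)/n
    return h*(0.5*g(a) + sum(g(a + k*h) for k in range(1, n)) + 0.5*g(b))

def coeff(f, j):
    num = trapz(lambda x: f(x)*legendre(j, x), -1.0, 1.0)
    wj  = trapz(lambda x: legendre(j, x)**2, -1.0, 1.0)   # = 2/(2j+1)
    return num/wj

f = lambda x: x**3          # exactly 0.6*P_1(x) + 0.4*P_3(x)
print([round(coeff(f, j), 4) for j in range(4)])
```

Note that each c_j comes out the same no matter how many terms are kept, as the text states.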

Table 1. Orthogonal polynomials [15, 23]

a    b    W(x)                  Name                              Recursion relation
−1   1    1                     Legendre                          (i+1) P_{i+1} = (2i+1) x P_i − i P_{i−1}
−1   1    1/√(1−x²)             Chebyshev                         T_{i+1} = 2x T_i − T_{i−1}
0    1    x^{q−1} (1−x)^{p−q}   Jacobi (p, q)
−∞   ∞    e^{−x²}               Hermite                           H_{i+1} = 2x H_i − 2i H_{i−1}
0    ∞    x^c e^{−x}            Laguerre (c)                      (i+1) L^c_{i+1} = (−x+2i+c+1) L^c_i − (i+c) L^c_{i−1}
0    1    1                     shifted Legendre
0    1    1                     shifted Legendre, function of x²

The last entry (the shifted Legendre polynomial as a function of x²) is defined by

∫_0^1 W(x²) P_k(x²) P_m(x²) x^{a−1} dx = 0,  k = 0, 1, . . ., m−1

where a = 1 is for planar, a = 2 for cylindrical, and a = 3 for spherical geometry. These functions are useful if the solution can be proved to be an even function of x.

Rational Polynomials. Rational polynomials are ratios of polynomials. A rational polynomial R_{i(i+1)...(i+m)} passing through m + 1 points

y_i = f(x_i),  i = 1, . . ., m+1

is

R_{i(i+1)···(i+m)} = P_μ(x)/Q_ν(x) = (p_0 + p_1 x + ··· + p_μ x^μ) / (q_0 + q_1 x + ··· + q_ν x^ν),  m + 1 = μ + ν + 1

An alternative condition is to make the rational polynomial agree with the first m + 1 terms in the power series, giving a Padé approximation, i.e.,

d^k R_{i(i+1)···(i+m)} / dx^k = d^k f(x)/dx^k,  k = 0, . . ., m

The Bulirsch – Stoer recursion algorithm can be used to evaluate the polynomial:

R_{i(i+1)···(i+m)} = R_{(i+1)···(i+m)} + [R_{(i+1)···(i+m)} − R_{i(i+1)···(i+m−1)}] / Den

Den = (x−x_i)/(x−x_{i+m}) · [1 − (R_{(i+1)···(i+m)} − R_{i(i+1)···(i+m−1)}) / (R_{(i+1)···(i+m)} − R_{(i+1)···(i+m−1)})] − 1

Rational polynomials are useful for approximating functions with poles and singularities, which occur in Laplace transforms (see Section 4.2). Fourier series are discussed in Section 4.1. Representation by sums of exponentials is also possible [24].

In summary, for discrete data, Legendre polynomials and rational polynomials are used. For continuous data a variety of orthogonal polynomials and rational polynomials are used. When the number of conditions (discrete data points) exceeds the number of parameters, then see Section 2.5.

2.3. Piecewise Approximation

Piecewise approximations can be developed from difference formulas [3]. Consider a case in which the data points are equally spaced:

x_{n+1} − x_n = Δx,  y_n = y(x_n)

Forward differences are defined by

Δy_n = y_{n+1} − y_n
Δ²y_n = Δy_{n+1} − Δy_n = y_{n+2} − 2 y_{n+1} + y_n

Then a new variable is defined,

α = (x_α − x_0)/Δx

and the finite interpolation formula through the points y_0, y_1, . . ., y_n is written as follows:

y_α = y_0 + α Δy_0 + [α(α−1)/2!] Δ²y_0 + ··· + [α(α−1)···(α−n+1)/n!] Δ^n y_0   (1)

Keeping only the first two terms gives a straight line through (x_0, y_0) − (x_1, y_1); keeping the first three terms gives a quadratic function of position going through those points plus (x_2, y_2). The value α = 0 gives x = x_0; α = 1 gives x = x_1, etc.

Backward differences are defined by

∇y_n = y_n − y_{n−1}
∇²y_n = ∇y_n − ∇y_{n−1} = y_n − 2 y_{n−1} + y_{n−2}

The interpolation polynomial of order n through the points y_0, y_{−1}, y_{−2}, . . . is

y_α = y_0 + α ∇y_0 + [α(α+1)/2!] ∇²y_0 + ··· + [α(α+1)···(α+n−1)/n!] ∇^n y_0

The value α = 0 gives x = x_0; α = −1 gives x = x_{−1}. Alternatively, the interpolation polynomial of order n through the points y_1, y_0, y_{−1}, . . . is

y_α = y_1 + (α−1) ∇y_1 + [(α−1)α/2!] ∇²y_1 + ··· + [(α−1)α(α+1)···(α+n−2)/n!] ∇^n y_1

Now α = 1 gives x = x_1; α = 0 gives x = x_0.

The finite element method can be used for piecewise approximations [3]. In the finite element method the domain a ≤ x ≤ b is divided

into elements as shown in Figure 3. Each function N_i(x) is zero at all nodes except x_i; N_i(x_i) = 1. Thus, the approximation is

y(x) = Σ_{i=1}^{NT} c_i N_i(x) = Σ_{i=1}^{NT} y(x_i) N_i(x)

where c_i = y(x_i). For convenience, the trial functions are defined within an element by using new coordinates:

u = (x − x_i)/Δx_i

The Δx_i need not be the same from element to element. The trial functions are defined as N_i(x) (Fig. 3 A) in the global coordinate system and N_I(u) (Fig. 3 B) in the local coordinate system (which also requires specification of the element).

Figure 3. Galerkin finite element method – linear functions. A) Global numbering system; B) Local numbering system

For x_i < x < x_{i+1},

y(x) = Σ_{i=1}^{NT} c_i N_i(x) = c_i N_i(x) + c_{i+1} N_{i+1}(x)

because all the other trial functions are zero there. Thus

y(x) = c_i N_{I=1}(u) + c_{i+1} N_{I=2}(u),  x_i < x < x_{i+1}   (2)

with x in the e-th element and c_i = c_I within the element e. Thus, given a set of points (x_i, y_i), a finite element approximation can be made to go through them.

Quadratic approximations can also be used within the element (see Fig. 4). Now the trial functions are

N_{I=1} = 2 (u − 1)(u − 1/2)
N_{I=2} = 4 u (1 − u)                    (3)
N_{I=3} = 2 u (u − 1/2)

Figure 4. Finite element approximation – quadratic elements. A) Global numbering system; B) Local numbering system

The approximation going through an odd number of points (x_i, y_i) is then

y(x) = Σ_{I=1}^{3} c_I^e N_I(u),  x in e-th element

with

c_I^e = y(x_i),  i = (e − 1) · 2 + I

Hermite cubic polynomials can also be used; these are continuous and have continuous first derivatives at the element boundaries [3].
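A minimal sketch (not from the article) of evaluating the linear interpolant of Eq. (2); the function name `fe_linear` is hypothetical, and the standard linear shape functions N_{I=1} = 1 − u, N_{I=2} = u (implied by Fig. 3) are assumed:

```python
import bisect

def fe_linear(xs, ys, x):
    """Piecewise linear FE interpolant; xs are increasing node positions."""
    # locate the element containing x (clamped to the first/last element)
    i = max(0, min(bisect.bisect_right(xs, x) - 1, len(xs) - 2))
    u = (x - xs[i]) / (xs[i+1] - xs[i])    # local coordinate in element i
    return ys[i]*(1.0 - u) + ys[i+1]*u     # c_i N_1(u) + c_{i+1} N_2(u)

xs = [0.0, 0.5, 2.0]          # the element sizes need not be equal
ys = [1.0, 2.0, 0.0]
print(fe_linear(xs, ys, 1.25))   # halfway through the second element -> 1.0
```

Only the two trial functions of the containing element contribute, which is what makes finite element evaluation cheap.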

Cubic Splines. Consider the points shown in Figure 5 A. The notation for each interval is shown in Figure 5 B.

Figure 5. Finite elements for cubic splines. A) Notation for spline knots; B) Notation for one element

Within each interval the function is represented as a cubic polynomial:

C_i(x) = a_{0i} + a_{1i} x + a_{2i} x² + a_{3i} x³

The interpolating function takes on specified values at the knots:

C_{i−1}(x_i) = C_i(x_i) = f(x_i)

Given the set of values {x_i, f(x_i)}, the objective is to pass a smooth curve through those points, and the curve should have continuous first and second derivatives at the knots:

C'_{i−1}(x_i) = C'_i(x_i)
C''_{i−1}(x_i) = C''_i(x_i)

The formulas for the cubic spline are derived as follows for one region. Since the function is a cubic function, the third derivative is constant and the second derivative is linear in x. This is written as

C''_i(x) = C''_i(x_i) + [C''_i(x_{i+1}) − C''_i(x_i)] (x − x_i)/Δx_i

and integrated once to give

C'_i(x) = C'_i(x_i) + C''_i(x_i)(x − x_i) + [C''_i(x_{i+1}) − C''_i(x_i)] (x − x_i)²/(2 Δx_i)

and once more to give

C_i(x) = C_i(x_i) + C'_i(x_i)(x − x_i) + C''_i(x_i)(x − x_i)²/2 + [C''_i(x_{i+1}) − C''_i(x_i)] (x − x_i)³/(6 Δx_i)

Now

y_i = C_i(x_i),  y'_i = C'_i(x_i),  y''_i = C''_i(x_i)

is defined so that

C_i(x) = y_i + y'_i (x − x_i) + (1/2) y''_i (x − x_i)² + [1/(6 Δx_i)] (y''_{i+1} − y''_i)(x − x_i)³

A number of algebraic steps make the interpolation easy. These formulas are written for the i-th element as well as the (i−1)-th element. Then the continuity conditions are applied for the first and second derivatives, and the values y'_i and y'_{i−1} are eliminated [15]. The result is

y''_{i−1} Δx_{i−1} + y''_i 2(Δx_{i−1} + Δx_i) + y''_{i+1} Δx_i = −6 [(y_i − y_{i−1})/Δx_{i−1} − (y_{i+1} − y_i)/Δx_i]

This is a tridiagonal system for the set of {y''_i} in terms of the set of {y_i}. Since the continuity conditions apply only for i = 2, . . ., NT − 1, only NT − 2 conditions exist for the NT values of y''_i. Two additional conditions are needed, and these are usually taken as the value of the second derivative at each end of the domain, y''_1 and y''_NT. If these values are zero, the natural cubic splines are obtained; they can also be set to achieve some other purpose, such as making the first derivative match some desired condition at the two ends. With these values taken as zero, in the natural cubic spline, an NT − 2 system of tridiagonal equations exists, which is easily solved. Once the second derivatives are known at each of the knots, the first derivatives are given by

y'_i = (y_{i+1} − y_i)/Δx_i − y''_i Δx_i/3 − y''_{i+1} Δx_i/6

The function itself is then known within each element.

Orthogonal Collocation on Finite Elements. In the method of orthogonal collocation on finite elements the solution is expanded in a polynomial of order NP = NCOL + 2 within each element [3]. The choice NCOL = 1 corresponds to using quadratic polynomials, whereas NCOL = 2 gives cubic polynomials. The notation is shown in Figure 6. Set the function to a known value at the two endpoints,

y_1 = y(x_1)
y_NT = y(x_NT)

and then at the NCOL interior points of each element

Figure 6. Notation for orthogonal collocation on finite elements. • Residual condition; ⊗ Boundary conditions; | Element boundary, continuity. NE = total number of elements; NT = (NCOL + 1) NE + 1

y_I^e = y_i = y(x_i),  i = (NCOL + 1) e + I

The actual points x_i are taken as the roots of the orthogonal polynomial:

P_NCOL(u) = 0 gives u_1, u_2, . . ., u_NCOL

and then

x_i = x(e) + Δx_e u_I ≡ x_I^e

The first derivatives must be continuous at the element boundaries:

(dy/dx)|_{x = x(2)^−} = (dy/dx)|_{x = x(2)^+}

Within each element the interpolation is a polynomial of degree NCOL + 1. Overall the function is continuous with continuous first derivatives. With the choice NCOL = 2, the same approximation is achieved as with Hermite cubic polynomials.

2.4. Quadrature

To calculate the value of an integral, the function can be approximated by using each of the methods described in Section 2.3. Using the first three terms in Equation (1) gives

∫_{x_0}^{x_0+h} y(x) dx = ∫_0^1 y_α h dα = (h/2)(y_0 + y_1) − (1/12) h³ y''(ξ),  x_0 ≤ ξ ≤ x_0 + h

This corresponds to passing a straight line through the points (x_0, y_0), (x_1, y_1) and integrating under the interpolant. For equally spaced points at a = x_0, a + Δx = x_1, a + 2Δx = x_2, . . ., a + N Δx = x_N, a + (N+1) Δx = b = x_{N+1}, the trapezoid rule is obtained.

Trapezoid Rule.

∫_a^b y(x) dx = (h/2)(y_0 + 2 y_1 + 2 y_2 + ··· + 2 y_N + y_{N+1}) + O(h³)

The first five terms in Equation (1) are retained and integrated over two intervals:

∫_{x_0}^{x_0+2h} y(x) dx = ∫_0^2 y_α h dα = (h/3)(y_0 + 4 y_1 + y_2) − (h⁵/90) y^{(IV)}(ξ),  x_0 ≤ ξ ≤ x_0 + 2h

This corresponds to passing a quadratic function through three points and integrating. For an even number of intervals and an odd number of points, 2N + 1, with a = x_0, a + Δx = x_1, a + 2Δx = x_2, . . ., a + 2N Δx = b, Simpson's rule is obtained.

Simpson's Rule.

∫_a^b y(x) dx = (h/3)(y_0 + 4 y_1 + 2 y_2 + 4 y_3 + 2 y_4 + ··· + 2 y_{2N−2} + 4 y_{2N−1} + y_{2N}) + O(h⁵)
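A minimal sketch (not from the article) of the composite trapezoid and Simpson rules above; the function names are hypothetical:

```python
from math import sin, pi

def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal intervals; error O(h^2)."""
    h = (b - a)/n
    return h*(0.5*f(a) + sum(f(a + k*h) for k in range(1, n)) + 0.5*f(b))

def simpson(f, a, b, n):
    """Composite Simpson rule; n must be even; error O(h^4)."""
    h = (b - a)/n
    s = f(a) + f(b)
    s += 4*sum(f(a + k*h) for k in range(1, n, 2))   # odd points weighted 4
    s += 2*sum(f(a + k*h) for k in range(2, n, 2))   # even interior, weight 2
    return h*s/3.0

# Integral of sin on [0, pi] is exactly 2:
print(trapezoid(sin, 0.0, pi, 10), simpson(sin, 0.0, pi, 10))
```

With only ten intervals Simpson's rule is already accurate to a few parts in 10⁵, while the trapezoid rule is off in the second decimal, reflecting the O(h³) versus O(h⁵) single-panel error terms above.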

Within each pair of intervals the interpolant is continuous with continuous derivatives, but only the function is continuous from one pair to another.

If the finite element representation is used (Eq. 2), the integral is

∫_{x_i}^{x_{i+1}} y(x) dx = ∫_0^1 Σ_{I=1}^{2} c_I^e N_I(u) (x_{i+1} − x_i) du = Δx_i Σ_{I=1}^{2} c_I^e ∫_0^1 N_I(u) du = Δx_i (c_1^e · 1/2 + c_2^e · 1/2) = (Δx_i/2)(y_i + y_{i+1})

Since c_1^e = y_i and c_2^e = y_{i+1}, the result is the same as the trapezoid rule. These formulas can be added together to give linear elements:

∫_a^b y(x) dx = Σ_e (Δx_e/2)(y_1^e + y_2^e)

If the quadratic expansion is used (Eq. 3), the endpoints of the element are x_i and x_{i+2}, and x_{i+1} is the midpoint, here assumed to be equally spaced between the ends of the element:

∫_{x_i}^{x_{i+2}} y(x) dx = ∫_0^1 Σ_{I=1}^{3} c_I^e N_I(u) (x_{i+2} − x_i) du = Δx_i Σ_{I=1}^{3} c_I^e ∫_0^1 N_I(u) du = Δx_e (c_1^e · 1/6 + c_2^e · 2/3 + c_3^e · 1/6)

For many elements, with different Δx_e, quadratic elements give

∫_a^b y(x) dx = Σ_e (Δx_e/6)(y_1^e + 4 y_2^e + y_3^e)

If the element sizes are all the same this gives Simpson's rule.

For cubic splines the quadrature rule within one element is

∫_{x_i}^{x_{i+1}} C_i(x) dx = (1/2) Δx_i (y_i + y_{i+1}) − (1/24) Δx_i³ (y''_i + y''_{i+1})

For the entire interval the quadrature formula is

∫_{x_1}^{x_NT} y(x) dx = Σ_{i=1}^{NT−1} (1/2) Δx_i (y_i + y_{i+1}) − Σ_{i=1}^{NT−1} (1/24) Δx_i³ (y''_i + y''_{i+1})

with y''_1 = y''_NT = 0 for natural cubic splines.

When orthogonal polynomials are used, the m roots of P_m(x) = 0 are chosen as quadrature points {x_j}. Then the quadrature is Gaussian:

∫_0^1 y(x) dx = Σ_{j=1}^{m} W_j y(x_j)

The quadrature is exact when y is a polynomial of degree 2m − 1 in x. The m weights and m Gauss points result in 2m parameters, chosen to exactly represent a polynomial of degree 2m − 1, which has 2m parameters. The Gauss points and weights are given in Table 2. The weights can be defined with W(x) in the integrand as well.

Table 2. Gaussian quadrature points and weights *

N    x_i             W_i
1    0.5000000000    0.6666666667
2    0.2113248654    0.5000000000
     0.7886751346    0.5000000000
3    0.1127016654    0.2777777778
     0.5000000000    0.4444444445
     0.8872983346    0.2777777778
4    0.0694318442    0.1739274226
     0.3300094783    0.3260725774
     0.6699905218    0.3260725774
     0.9305681558    0.1739274226
5    0.0469100771    0.1184634425
     0.2307653450    0.2393143353
     0.5000000000    0.2844444444
     0.7692346551    0.2393143353
     0.9530899230    0.1184634425

* For a given N the quadrature points x_2, x_3, . . ., x_{NP−1} are given above; x_1 = 0, x_NP = 1. For N = 1, W_1 = W_3 = 1/6 and for N ≥ 2, W_1 = W_NP = 0.

For orthogonal collocation on finite elements the quadrature formula is

∫_0^1 y(x) dx = Σ_e Σ_{j=1}^{NP} Δx_e W_j y(x_j^e)

Each special polynomial has its own quadrature formula. For example, Gauss – Laguerre polynomials give the quadrature formula

∫_0^∞ e^{−x} y(x) dx = Σ_{i=1}^{n} W_i y(x_i)

(points and weights are available in mathematical tables) [23].

For Gauss – Hermite polynomials the quadrature formula is

∫_{−∞}^{∞} e^{−x²} y(x) dx = Σ_{i=1}^{n} W_i y(x_i)

(points and weights are available in mathematical tables) [23].

Romberg's method uses extrapolation techniques to improve the answer [15]. If I_1 is the value of the integral obtained by using interval size h = Δx, I_2 the value of I obtained by using interval size h/2, and I_0 the true value of I, then the error in a method is approximately h^m, or

I_1 ≈ I_0 + c h^m
I_2 ≈ I_0 + c (h/2)^m

Replacing the ≈ by an equality (an approximation) and solving for c and I_0 give

I_0 = (2^m I_2 − I_1)/(2^m − 1)

This process can also be used to obtain I_1, I_2, . . ., by halving h each time, calculating new estimates from each pair, and calling them J_1, J_2, . . . (i.e., in the formula above, I_0 is replaced with J_1). The formulas are reapplied for each pair of J's to obtain K_1, K_2, . . . . The process continues until the required tolerance is obtained.

I_1   I_2   I_3   I_4
   J_1   J_2   J_3
      K_1   K_2
         L_1

Romberg's method is most useful for a low-order method (small m) because significant improvement is then possible.

When the integrand has singularities, a variety of techniques can be tried. The integral may be divided into one part that can be integrated analytically near the singularity and another part that is integrated numerically. Sometimes a change of argument allows analytical integration. Series expansion might be helpful, too. When the domain is infinite, Gauss – Laguerre or Gauss – Hermite quadrature can be used. Also a transformation can be made [15]. For example, let u = 1/x and then

∫_a^b f(x) dx = ∫_{1/b}^{1/a} (1/u²) f(1/u) du,  a, b > 0

2.5. Least Squares

When fitting experimental data to a mathematical model, it is necessary to recognize that the experimental measurements contain error; the goal is to find the set of parameters in the model that best represents the experimental data. Reference [23] gives a complete treatment relating the least-squares procedure to maximum likelihood.

In a least-squares parameter estimation, it is desired to find parameters that minimize the sum of squares of the deviation between the experimental data and the theoretical equation:

χ² = Σ_{i=1}^{N} [(y_i − y(x_i; a_1, a_2, . . ., a_M))/σ_i]²

where y_i is the i-th experimental data point for the value x_i, y(x_i; a_1, a_2, . . ., a_M) the theoretical equation at x_i, σ_i the standard deviation of the i-th measurement, and the parameters {a_1, a_2, . . ., a_M} are to be determined to minimize χ². The simplification is made here that the standard deviations are all the same. Thus, we minimize the variance of the curve fit:

σ² = Σ_{i=1}^{N} [y_i − y(x_i; a_1, a_2, . . ., a_M)]² / N

Linear Least Squares. When the model is a straight line, one is minimizing

χ² = Σ_{i=1}^{N} [y_i − a − b x_i]²

The linear correlation coefficient r is defined by

r = Σ_{i=1}^{N} (x_i − x̄)(y_i − ȳ) / √[Σ_{i=1}^{N} (x_i − x̄)² · Σ_{i=1}^{N} (y_i − ȳ)²]

and

χ² = (1 − r²) Σ_{i=1}^{N} [y_i − ȳ]²

where ȳ is the average of the y_i values. Values of r near 1 indicate a positive correlation, r near −1 a negative correlation, and r near zero no correlation. These parameters are easily found by using standard programs, such as Microsoft Excel.
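A minimal sketch (not from the article) of the straight-line fit and correlation coefficient r; the function name `linear_fit` and the sample data are hypothetical:

```python
from math import sqrt

def linear_fit(xs, ys):
    """Return intercept a, slope b, and correlation r for y = a + b*x."""
    n = len(xs)
    xbar = sum(xs)/n
    ybar = sum(ys)/n
    sxy = sum((x - xbar)*(y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar)**2 for x in xs)
    syy = sum((y - ybar)**2 for y in ys)
    b = sxy/sxx                  # slope minimizing chi^2
    a = ybar - b*xbar            # intercept
    r = sxy/sqrt(sxx*syy)        # linear correlation coefficient
    return a, b, r

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.0]        # roughly y = 1 + 2x with small errors
print(linear_fit(xs, ys))
```

These are the closed-form solutions of the normal equations for the two-parameter model; r close to ±1 signals a strong linear trend, as stated above.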

Polynomial Regression. In polynomial regression, one expands the function in a polynomial in x:

y(x) = Σ_{j=1}^{M} a_j x^{j−1}

The parameters are easily determined using computer software. In Microsoft Excel, the data are put into columns A and B and the graph is created as for a linear curve fit. Then add a trendline and choose the degree of polynomial desired.

Multiple Regression. In multiple regression, any set of functions can be used, not just polynomials:

y(x) = Σ_{j=1}^{M} a_j f_j(x)

where the set of functions {f_j(x)} is known and specified. Note that the unknown parameters {a_j} enter the equation linearly. In this case, the spreadsheet can be expanded to have a column for x and then successive columns for the f_j(x). In Microsoft Excel, choose Regression under Tools/Data Analysis, and complete the form. In addition to the actual correlation, one gets the expected variance of the unknowns, which allows one to assess how accurately they were determined.

Nonlinear Regression. In nonlinear regression, the same procedure is followed except that an optimization routine must be used to find the minimum χ². See Chapter 10.

2.6. Fourier Transforms of Discrete Data [15]

Suppose a signal y(t) is sampled at equal intervals:

y_n = y(n Δ),  n = . . ., −2, −1, 0, 1, 2, . . .
Δ = sampling rate (e.g., number of samples per second)

The Fourier transform and inverse transform are

Y(ω) = ∫_{−∞}^{∞} y(t) e^{iωt} dt
y(t) = (1/2π) ∫_{−∞}^{∞} Y(ω) e^{−iωt} dω

(For the definition of i, see Chap. 3.) The Nyquist critical frequency or critical angular frequency is

f_c = 1/(2Δ),  ω_c = π/Δ

If a function y(t) is bandwidth limited to frequencies smaller than f_c, i.e.,

Y(ω) = 0 for ω > ω_c

then the function is completely determined by its samples y_n. Thus, the entire information content of a signal can be recorded by sampling at a rate Δ⁻¹ = 2 f_c. If the function is not bandwidth limited, then aliasing occurs: once a sample rate Δ is chosen, information corresponding to frequencies greater than f_c is simply aliased into that range. The way to detect this in a Fourier transform is to see if the transform approaches zero at ± f_c; if not, aliasing has occurred and a higher sampling rate is needed.

Next, for N samples, where N is even,

y_k = y(t_k),  t_k = k Δ,  k = 0, 1, 2, . . ., N−1

and the sampling rate is Δ; with only N values {y_k} the complete Fourier transform Y(ω) cannot be determined. Calculate the value Y(ω_n) at the discrete points

ω_n = 2πn/(N Δ),  n = −N/2, . . ., 0, . . ., N/2

Y_n = Σ_{k=0}^{N−1} y_k e^{2πikn/N},  Y(ω_n) = Δ Y_n

The discrete inverse Fourier transform is

y_k = (1/N) Σ_{n=0}^{N−1} Y_n e^{−2πikn/N}

The fast Fourier transform (FFT) is used to calculate the Fourier transform as well as the inverse Fourier transform. A discrete Fourier transform of length N can be written as the sum of two discrete Fourier transforms, each of length N/2, and each of these transforms is separated into two halves, each half as long. This continues until only one component is left. For this reason, N is taken as a power of 2, N = 2^p. The vector {y_j} is filled with zeroes, if need be, to make N = 2^p for some p. For the computer program, see [15, p. 381]. The standard

Fourier transform takes N² operations to calculate, whereas the fast Fourier transform takes only N log₂ N. For large N the difference is significant; at N = 100 it is a factor of 15, but for N = 1000 it is a factor of 100.

The discrete Fourier transform can also be used for differentiating a function; this is used in the spectral method for solving differential equations. Consider a grid of equidistant points:

x_n = n Δx,  n = 0, 1, 2, . . ., 2N − 1,  Δx = L/(2N)

The solution is known at each of these grid points {Y(x_n)}. First, the Fourier transform is taken:

y_k = (1/2N) Σ_{n=0}^{2N−1} Y(x_n) e^{−2ikπx_n/L}

The inverse transformation is

Y(x) = (1/L) Σ_{k=−N}^{N} y_k e^{2ikπx/L}

which is differentiated to obtain

dY/dx = (1/L) Σ_{k=−N}^{N} y_k (2πik/L) e^{2ikπx/L}

The process works as follows. From the solution at all grid points the Fourier transform is obtained by using the FFT, {y_k}. This is multiplied by 2πik/L to obtain the Fourier transform of the derivative:

y'_k = y_k (2πik/L)

The inverse Fourier transform is then taken by using the FFT, to give the value of the derivative at each of the grid points:

(dY/dx)|_n = (1/L) Σ_{k=−N}^{N} y'_k e^{2ikπx_n/L}

Any nonlinear term can be treated in the same way: evaluate it in real space at N points and take the Fourier transform. After processing using this transform to get the transform of a new function, take the inverse transform to obtain the new function at N points. This is what is done in direct numerical simulation of turbulence (DNS).

2.7. Two-Dimensional Interpolation and Quadrature

Bicubic splines can be used to interpolate a set of values on a regular array, f(x_i, y_j). Suppose NX points occur in the x direction and NY points occur in the y direction. Press et al. [15] suggest computing NY different cubic splines of size NX along lines of constant y, for example, and storing the derivative information. To obtain the value of f at some point x, y, evaluate each of these splines for that x. Then do one spline of size NY in the y direction, doing both the determination and the evaluation.

Multidimensional integrals can also be broken down into one-dimensional integrals. For example,

∫_a^b ∫_{f_1(x)}^{f_2(x)} z(x, y) dy dx = ∫_a^b G(x) dx,  G(x) = ∫_{f_1(x)}^{f_2(x)} z(x, y) dy

3. Complex Variables [25 – 31]

3.1. Introduction to the Complex Plane

A complex number is an ordered pair of real numbers, x and y, that is written as

z = x + i y

The variable i is the imaginary unit, which has the property

i² = −1

The real and imaginary parts of a complex number are often referred to:

Re(z) = x,  Im(z) = y

A complex number can also be represented graphically in the complex plane, where the real part of the complex number is the abscissa and the imaginary part of the complex number is the ordinate (see Fig. 7). Another representation of a complex number is the polar form, where r is the magnitude and θ is the argument:

r = |x + i y| = √(x² + y²),  θ = arg(x + i y)
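A minimal sketch (not from the article): Python's built-in complex type and the `cmath` module implement exactly these quantities — Re(z), Im(z), the magnitude r = |z|, and the argument θ:

```python
import cmath

z = complex(3.0, 4.0)        # z = x + iy = 3 + 4i
print(z.real, z.imag)        # Re(z) = 3.0, Im(z) = 4.0

r, theta = cmath.polar(z)    # r = |z| = 5.0, theta = arg z
print(r, theta)

# rect converts the polar form back to the Cartesian form, up to rounding:
print(cmath.rect(r, theta))
```

`cmath.polar` returns the principal argument in (−π, π], the determination discussed in Section 3.2.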

Figure 7. The complex plane

Write

z = x + i y = r (cos θ + i sin θ)

so that

x = r cos θ,  y = r sin θ

and

θ = arctan(y/x)

Since the arctangent repeats itself in multiples of π rather than 2π, the argument must be defined carefully. For example, the θ given above could also be the argument of −(x + i y). The function z = cos θ + i sin θ obeys |z| = |cos θ + i sin θ| = 1.

The rules of equality, addition, and multiplication are, for z_1 = x_1 + i y_1 and z_2 = x_2 + i y_2:

Equality: z_1 = z_2 if and only if x_1 = x_2 and y_1 = y_2
Addition: z_1 + z_2 = (x_1 + x_2) + i (y_1 + y_2)
Multiplication: z_1 z_2 = (x_1 x_2 − y_1 y_2) + i (x_1 y_2 + x_2 y_1)

The last rule can be remembered by using the standard rules for multiplication, keeping the imaginary parts separate, and using i² = −1. In the complex plane, addition is illustrated in Figure 8. In polar form, multiplication is

z_1 z_2 = r_1 r_2 [cos(θ_1 + θ_2) + i sin(θ_1 + θ_2)]

Figure 8. Addition in the complex plane

The magnitude of z_1 + z_2 is bounded by

|z_1 ± z_2| ≤ |z_1| + |z_2| and ||z_1| − |z_2|| ≤ |z_1 ± z_2|

as can be seen in Figure 8. The magnitude and argument in multiplication obey

|z_1 z_2| = |z_1| |z_2|,  arg(z_1 z_2) = arg z_1 + arg z_2

The complex conjugate is z* = x − i y when z = x + i y, and

|z*| = |z|,  arg z* = −arg z

For complex conjugates then

z* z = |z|²

The reciprocal is

1/z = z*/|z|² = (1/r)(cos θ − i sin θ),  arg(1/z) = −arg z

Then

z_1/z_2 = (x_1 + i y_1)/(x_2 + i y_2) = (x_1 + i y_1)(x_2 − i y_2)/(x_2² + y_2²)
        = (x_1 x_2 + y_1 y_2)/(x_2² + y_2²) + i (x_2 y_1 − x_1 y_2)/(x_2² + y_2²)

and

z_1/z_2 = (r_1/r_2)[cos(θ_1 − θ_2) + i sin(θ_1 − θ_2)]

3.2. Elementary Functions

Properties of elementary functions of complex variables are discussed here [32]. When the polar form is used, the argument must be specified because the same physical angle can be achieved with arguments differing by 2π. A complex number taken to a real power obeys

u = z^n,  |z^n| = |z|^n,  arg(z^n) = n arg z (mod 2π)
u = z^n = r^n (cos nθ + i sin nθ)
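A minimal sketch (not from the article) checking the polar multiplication rule numerically with `cmath`; the sample values are hypothetical:

```python
import cmath

z1 = complex(1.0, 1.0)            # r1 = sqrt(2), theta1 = pi/4
z2 = complex(0.0, 2.0)            # r2 = 2,       theta2 = pi/2
r1, t1 = cmath.polar(z1)
r2, t2 = cmath.polar(z2)

# multiply in polar form: magnitudes multiply, arguments add
prod = cmath.rect(r1*r2, t1 + t2)
print(prod, z1*z2)                # both give -2 + 2i
```

The same comparison works for division, with magnitudes dividing and arguments subtracting.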

Roots of a complex number require careful accounting of the argument. If

z = w^{1/n} with w = R (cos Θ + i sin Θ),  0 ≤ Θ ≤ 2π

then

z_k = R^{1/n} {cos[Θ/n + (k−1) 2π/n] + i sin[Θ/n + (k−1) 2π/n]},  k = 1, 2, . . ., n

such that

(z_k)^n = w for every k

That is, writing z = r (cos θ + i sin θ), the roots satisfy

r^n = R,  nθ = Θ (mod 2π)

The exponential function is

e^z = e^x (cos y + i sin y)

Thus,

z = r (cos θ + i sin θ)

can be written

z = r e^{iθ}

and

|e^z| = e^x,  arg e^z = y (mod 2π)

The exponential obeys

e^z ≠ 0 for every finite z

and is periodic with period 2πi:

e^{z+2πi} = e^z

Trigonometric functions can be defined by using

e^{iy} = cos y + i sin y and e^{−iy} = cos y − i sin y

Thus,

cos y = (e^{iy} + e^{−iy})/2 = cosh iy
sin y = (e^{iy} − e^{−iy})/(2i) = −i sinh iy

The second equation follows from the definitions

cosh z ≡ (e^z + e^{−z})/2,  sinh z ≡ (e^z − e^{−z})/2

The remaining hyperbolic functions are

tanh z ≡ sinh z/cosh z,  coth z ≡ 1/tanh z
sech z ≡ 1/cosh z,  csch z ≡ 1/sinh z

The circular functions with complex arguments are defined by

cos z = (e^{iz} + e^{−iz})/2,  sin z = (e^{iz} − e^{−iz})/(2i),  tan z = sin z/cos z

and satisfy

sin(−z) = −sin z,  cos(−z) = cos z
sin(iz) = i sinh z,  cos(iz) = cosh z

All trigonometric identities for real, circular functions with real arguments can be extended without change to complex functions of complex arguments. For example,

sin²z + cos²z = 1
sin(z_1 + z_2) = sin z_1 cos z_2 + cos z_1 sin z_2

The same is true of hyperbolic functions. The absolute values of sin z and cos z, however, are not bounded for all z.

Trigonometric identities can be derived by using

e^{iθ} = cos θ + i sin θ

For example,

e^{i(α+β)} = cos(α+β) + i sin(α+β)
          = e^{iα} e^{iβ} = (cos α + i sin α)(cos β + i sin β)
          = cos α cos β − sin α sin β + i (cos α sin β + cos β sin α)

Equating real and imaginary parts gives

cos(α+β) = cos α cos β − sin α sin β
sin(α+β) = cos α sin β + cos β sin α

The logarithm is defined as

ln z = ln|z| + i arg z

and the various determinations differ by multiples of 2πi. Then,

e^{ln z} = z
ln(e^z) − z ≡ 0 (mod 2πi)

Also,

ln(z_1 z_2) − ln z_1 − ln z_2 ≡ 0 (mod 2πi)

is always true, but

ln(z_1 z_2) = ln z_1 + ln z_2

holds only for some determinations of the logarithms. The principal determination of the argument is −π < arg z ≤ π.
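A minimal numerical check (not from the article) of the identities above with `cmath`; the sample argument is hypothetical:

```python
import cmath

z = complex(0.7, -1.3)
checks = [
    cmath.sin(1j*z) - 1j*cmath.sinh(z),      # sin(iz) = i sinh z
    cmath.cos(1j*z) - cmath.cosh(z),         # cos(iz) = cosh z
    cmath.sin(z)**2 + cmath.cos(z)**2 - 1,   # sin^2 z + cos^2 z = 1
]
print(all(abs(c) < 1e-12 for c in checks))   # True
```

That the identities hold for arbitrary complex arguments, not just real ones, is the point of the paragraph above.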

3.3. Analytic Functions of a Complex Variable

Let f(z) be a single-valued continuous function of z in a domain D. The function f(z) is differentiable at the point z_0 in D if

lim_{h→0} [f(z_0 + h) − f(z_0)]/h

exists as a finite (complex) number and is independent of the direction in which h tends to zero. The limit is called the derivative, f'(z_0). The derivative can be calculated with h approaching zero anywhere in the complex plane, i.e., anywhere in a circular region about z_0.

Analyticity implies continuity, but the converse is not true: z* = x − i y is continuous but, because the Cauchy – Riemann conditions are not satisfied, it is not analytic. An entire function is one that is analytic for all finite values of z. Every polynomial is an entire function. Because the polynomials are analytic, a ratio of polynomials is analytic except when the denominator vanishes. The function f(z) = |z|² is continuous for all z but satisfies the Cauchy – Riemann conditions only at z = 0. Hence, f'(z) exists only at the origin, and |z|² is nowhere analytic. The function f(z) = 1/z is analytic except at z = 0; its derivative is −1/z² for z ≠ 0. The function ln z = ln|z| + i arg z is analytic in the cut domain −π < arg z ≤ π. If w = f(z) has an inverse function g(w), then

g(w) = g[f(z)] = z
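A minimal sketch (not from the article) of the direction-independence criterion: the difference quotient of an analytic function tends to the same limit along every ray, while that of z* (not analytic) depends on the direction of h. The helper name `quotient` is hypothetical:

```python
import cmath

def quotient(f, z0, direction, h=1e-6):
    """Difference quotient of f at z0 with step h along a unit direction."""
    step = h*direction
    return (f(z0 + step) - f(z0))/step

dirs = [1, 1j, cmath.exp(0.7j)]              # h -> 0 along different rays
q_sq   = [quotient(lambda z: z*z, 1+2j, d) for d in dirs]
q_conj = [quotient(lambda z: z.conjugate(), 1+2j, d) for d in dirs]

print(q_sq)      # all approximately 2*z0 = 2 + 4i: z^2 is analytic
print(q_conj)    # 1, -1, e^{-1.4i}, ...: direction-dependent, not analytic
```

For z* the quotient equals conj(d)/d, which sweeps the unit circle as the direction d rotates — exactly the failure of the Cauchy – Riemann conditions.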

Laplace Equation. If

f (z) = u (x, y) + i v (x, y)

is analytic, the Cauchy – Riemann equations are satisfied. Differentiating the Cauchy – Riemann equations gives the Laplace equation:

∂²u/∂x² = ∂²v/∂x∂y = ∂²v/∂y∂x = − ∂²u/∂y², or ∂²u/∂x² + ∂²u/∂y² = 0

Similarly,

∂²v/∂x² + ∂²v/∂y² = 0

Thus, general solutions to the Laplace equation can be obtained from analytic functions [30, p. 60]. For example,

ln |z − z0|

is analytic, so that a solution to the Laplace equation is

ln [(x − a)² + (y − b)²]^(−1/2)

A solution to the Laplace equation is called a harmonic function. A function is harmonic if, and only if, it is the real part of an analytic function. The imaginary part is also harmonic. Given any harmonic function u, a conjugate harmonic function v can be constructed such that f = u + i v is locally analytic [30, p. 290].

Maximum Principle. If f (z) is analytic in a domain D and continuous in the set consisting of D and its boundary C, and if |f (z)| ≤ M on C, then |f (z)| < M in D unless f (z) is a constant [30, p. 134].

3.4. Integration in the Complex Plane

Let C be a rectifiable curve in the complex plane,

C: z = z (t), 0 ≤ t ≤ 1

where z (t) is a continuous function of bounded variation; C is oriented such that z1 = z (t1) precedes the point z2 = z (t2) on C if and only if t1 < t2. Define

∫_C f (z) dz = ∫_0^1 f [z (t)] dz (t)

The integral is linear with respect to the integrand:

∫_C [α1 f1 (z) + α2 f2 (z)] dz = α1 ∫_C f1 (z) dz + α2 ∫_C f2 (z) dz

The integral is additive with respect to the path. Let curve C2 begin where curve C1 ends and C1 + C2 be the path of C1 followed by C2. Then

∫_{C1+C2} f (z) dz = ∫_{C1} f (z) dz + ∫_{C2} f (z) dz

Reversing the orientation of the path replaces the integral by its negative:

∫_{−C} f (z) dz = − ∫_C f (z) dz

If the path of integration consists of a finite number of arcs along which z (t) has a continuous derivative, then

∫_C f (z) dz = ∫_0^1 f [z (t)] z′ (t) dt

Also, if s (t) is the arc length on C and l (C) is the length of C,

|∫_C f (z) dz| ≤ max_{z∈C} |f (z)| · l (C)

and

|∫_C f (z) dz| ≤ ∫_C |f (z)| |dz| = ∫_0^1 |f [z (t)]| ds (t)

Cauchy's Theorem [25], [30, p. 111]. Suppose f (z) is an analytic function in a domain D and C is a simple, closed, rectifiable curve in D such that f (z) is analytic inside and on C. Then

∮_C f (z) dz = 0   (4)

If D is simply connected, then Equation 4 holds for every simple, closed, rectifiable curve C in D. If D is simply connected and if a and b are any two points in D, then

∫_a^b f (z) dz

is independent of the rectifiable path joining a and b in D.
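Cauchy's theorem lends itself to a quick numerical illustration. The sketch below (plain Python; the helper name `contour_integral` is ours, not from the text) approximates a closed contour integral on a circle with the trapezoidal rule, which is spectrally accurate for smooth periodic integrands; for an entire function the result should vanish to roundoff.

```python
import cmath
import math

def contour_integral(f, center=0j, radius=1.0, n=400):
    # Trapezoidal rule on the closed circle |z - center| = radius,
    # parameterized as z(t) = center + radius*exp(i*t), 0 <= t < 2*pi.
    total = 0j
    for k in range(n):
        t = 2.0 * math.pi * k / n
        z = center + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t) * (2.0 * math.pi / n)
        total += f(z) * dz
    return total

# Cauchy's theorem: exp(z) is analytic inside and on the contour,
# so the closed integral vanishes (up to roundoff).
print(abs(contour_integral(cmath.exp)))
```

The same helper verifies the theorem for any polynomial, since polynomials are entire as well.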

Cauchy's Integral. If C is a closed contour such that f (z) is analytic inside and on C, z0 is a point inside C, and z traverses C in the counterclockwise direction, then

f (z0) = (1/2πi) ∮_C f (z)/(z − z0) dz

f′ (z0) = (1/2πi) ∮_C f (z)/(z − z0)² dz

Under further restriction on the domain [30, p. 127],

f^(m) (z0) = (m!/2πi) ∮_C f (z)/(z − z0)^(m+1) dz

Power Series. If f (z) is analytic interior to a circle |z − z0| < r0, then at each point inside the circle the series

f (z) = f (z0) + Σ_{n=1}^∞ [f^(n) (z0)/n!] (z − z0)^n

converges to f (z). This result follows from Cauchy's integral. As an example, e^z is an entire function (analytic everywhere), so that the MacLaurin series

e^z = 1 + Σ_{n=1}^∞ z^n/n!

represents the function for all z.

Another result of Cauchy's integral formula is that if f (z) is analytic in an annulus R, r1 < |z − z0| < r2, it is represented in R by the Laurent series

f (z) = Σ_{n=−∞}^∞ A_n (z − z0)^n, r1 < |z − z0| < r2   (5)

where

A_n = (1/2πi) ∮_C f (z)/(z − z0)^(n+1) dz, n = 0, ±1, ±2, . . .

and C is a closed curve counterclockwise in R.

Singular Points and Residues [33, p. 159], [30, p. 180]. If a function is analytic in every neighborhood of z0, but not at z0 itself, then z0 is called an isolated singular point, and the Laurent series (Eq. 5) holds with r1 → 0. In particular,

A_{−1} = (1/2πi) ∮_C f (z) dz

where the curve C is a closed, counterclockwise curve containing z0 and is within the neighborhood where f (z) is analytic. The complex number A_{−1} is the residue of f (z) at the isolated singular point z0; 2πi A_{−1} is the value of the integral in the positive direction around a path containing no other singular points.

If f (z) is defined and analytic in the exterior |z − z0| > R of a circle, and if

v (ζ) = f (z0 + 1/ζ), obtained by ζ = 1/(z − z0)

has a removable singularity at ζ = 0, f (z) is analytic at infinity. It can then be represented by a Laurent series with nonpositive powers of z − z0.

If C is a closed curve within which and on which f (z) is analytic except for a finite number of singular points z1, z2, . . ., zn interior to the region bounded by C, then the residue theorem states

∮_C f (z) dz = 2πi (ϱ1 + ϱ2 + · · · + ϱn)

where ϱn denotes the residue of f (z) at zn.

The series of negative powers in Equation 5 is called the principal part of f (z). If the principal part has an infinite number of nonvanishing terms, the point z0 is an essential singularity. If A_{−m} ≠ 0 and A_{−n} = 0 for all n > m, then z0 is called a pole of order m. It is a simple pole if m = 1. In such a case,

f (z) = A_{−1}/(z − z0) + Σ_{n=0}^∞ A_n (z − z0)^n

If a function is not analytic at z0 but can be made so by assigning a suitable value, then z0 is a removable singular point.

When f (z) has a pole of order m at z0,

ϕ (z) = (z − z0)^m f (z), 0 < |z − z0| < r2

is analytic there, and the residue is A_{−1} = ϕ^(m−1) (z0)/(m − 1)!.
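The coefficient formula A_n = (1/2πi) ∮ f (z)/(z − z0)^(n+1) dz can be evaluated numerically. The sketch below (plain Python; the helper name `laurent_coeff` is ours) recovers the MacLaurin coefficients of e^z, for which A_n = 1/n! and the principal part vanishes.

```python
import cmath
import math

def laurent_coeff(f, n, z0=0j, radius=1.0, m=400):
    # A_n = (1/2*pi*i) * closed integral of f(z)/(z - z0)**(n+1),
    # computed by the trapezoidal rule on a circle around z0.
    total = 0j
    for k in range(m):
        t = 2.0 * math.pi * k / m
        z = z0 + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t) * (2.0 * math.pi / m)
        total += f(z) / (z - z0) ** (n + 1) * dz
    return total / (2j * math.pi)

# e**z is entire: A_n = 1/n! for n >= 0, and A_(-1) = 0 (no principal part).
for n in range(4):
    print(n, laurent_coeff(cmath.exp, n))
print(-1, laurent_coeff(cmath.exp, -1))
```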

Also |f (z)| → ∞ as z → z0 when z0 is a pole. Let the functions p (z) and q (z) be analytic at z0, where p (z0) ≠ 0. Then

f (z) = p (z)/q (z)

has a simple pole at z0 if, and only if, q (z0) = 0 and q′ (z0) ≠ 0. The residue of f (z) at the simple pole is

A_{−1} = p (z0)/q′ (z0)

If q^(i−1) (z0) = 0, i = 1, . . ., m, then z0 is a pole of f (z) of order m.

(The function 1/(1 − z) treated under Analytic Continuation below is an example of a function analytic on the entire z plane except for the single point z = 1.)

An extension of the Cauchy integral formula is useful with Laplace transforms. Let the curve C be a straight line parallel to the imaginary axis and z0 be any point to the right of it (see Fig. 9). A function f (z) is of order z^(−k) as |z| → ∞ if positive numbers M and r0 exist such that

|z^k f (z)| < M for |z| > r0, i.e., |f (z)| < M |z|^(−k)
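The residue rule A_{−1} = p (z0)/q′ (z0) for a simple pole can be cross-checked against the defining contour integral. In this sketch (plain Python; the helper name `residue_numeric` is ours) the small circle encloses only the pole at z0 = 0 of f (z) = e^z/[z (z − 2)], so the numerical residue should match p (0)/q′ (0) = −1/2.

```python
import cmath
import math

def residue_numeric(f, z0, radius=0.5, m=400):
    # (1/2*pi*i) * closed integral of f around a small circle
    # enclosing only the singular point z0 (trapezoidal rule).
    total = 0j
    for k in range(m):
        t = 2.0 * math.pi * k / m
        z = z0 + radius * cmath.exp(1j * t)
        dz = 1j * radius * cmath.exp(1j * t) * (2.0 * math.pi / m)
        total += f(z) * dz
    return total / (2j * math.pi)

p = lambda z: cmath.exp(z)        # analytic, p(0) != 0
q = lambda z: z * (z - 2.0)       # simple zero at z0 = 0
qprime = lambda z: 2.0 * z - 2.0
f = lambda z: p(z) / q(z)

print(residue_numeric(f, 0.0))    # compare with p(0)/q'(0) = -1/2
```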

Branch [33, p. 163]. A branch of a multiple-valued function f (z) is a single-valued function that is analytic in some region and whose value at each point there coincides with the value of f (z) at the point. A branch cut is a boundary that is needed to define the branch in the greatest possible region. The function f (z) is singular along a branch cut, and the singularity is not isolated. For example,

z^(1/2) = f (z) = √r [cos (θ/2) + i sin (θ/2)], −π < θ < π, r > 0

is double valued along the negative real axis. The function tends to √r i when θ → π and to −√r i when θ → −π; the function has no limit as z → −r (r > 0). The ray θ = π is a branch cut.

Figure 9. Integration in the complex plane

Analytic Continuation [33, p. 165]. If f1 (z) is analytic in a domain D1 and domain D contains D1, then an analytic function f (z) may exist that equals f1 (z) in D1. This function is the analytic continuation of f1 (z) onto D, and it is unique. For example,

f1 (z) = Σ_{n=0}^∞ z^n, |z| < 1

is analytic in the domain D1: |z| < 1. The series diverges for other z. Yet the function

f (z) = 1/(1 − z)

is analytic everywhere except z = 1, and f1 (z) is its MacLaurin series in the domain |z| < 1. Thus, f (z) = 1/(1 − z) is the analytic continuation of f1 (z) onto the entire z plane except for the point z = 1.

Theorem [33, p. 167]. Let f (z) be analytic when ℜ (z) ≥ γ and O (z^(−k)) as |z| → ∞ in that half-plane, where γ and k are real constants and k > 0. Then for any z0 such that ℜ (z0) > γ,

f (z0) = − (1/2πi) lim_{β→∞} ∫_{γ−iβ}^{γ+iβ} f (z)/(z − z0) dz

i.e., integration takes place along the line x = γ.

3.5. Other Results

Theorem [32, p. 84]. Let P (z) be a polynomial of degree n having the zeroes z1, z2, . . ., zn and let π be the least convex polygon containing the zeroes. Then P′ (z) cannot vanish anywhere in the exterior of π.

If a polynomial has real coefficients, the roots are either real or form pairs of complex conjugate numbers.

The radius of convergence R of the Taylor series of f (z) about z0 is equal to the distance from z0 to the nearest singularity of f (z).

Conformal Mapping. Let u (x, y) be a harmonic function. Introduce the coordinate transformation

x = x̂ (ξ, η), y = ŷ (ξ, η)

It is desired that

U (ξ, η) = u [x̂ (ξ, η), ŷ (ξ, η)]

be a harmonic function of ξ and η.

Theorem [30, p. 284]. The transformation

z = f (ζ)   (6)

takes all harmonic functions of x and y into harmonic functions of ξ and η if and only if either f (ζ) or f* (ζ) is an analytic function of ζ = ξ + i η. Equation 6 is a restriction on the transformation which ensures that

if ∂²u/∂x² + ∂²u/∂y² = 0 then ∂²U/∂ξ² + ∂²U/∂η² = 0

Such a mapping with f (ζ) analytic and f′ (ζ) ≠ 0 is a conformal mapping.

If Laplace's equation is to be solved in the region exterior to a closed curve, then the point at infinity is in the domain D. For flow in a long channel (governed by the Laplace equation) the inlet and outlet are at infinity. In both cases the transformation

ζ = (a z + b)/(z − z0)

takes z0 into infinity and hence maps D into a bounded domain D*.

4. Integral Transforms [34 – 39]

4.1. Fourier Transforms

Fourier Series [40]. Let f (x) be a function that is periodic on −π < x < π. It can be expanded in the Fourier series

f (x) = a0/2 + Σ_{n=1}^∞ (a_n cos nx + b_n sin nx)

where

a0 = (1/π) ∫_{−π}^{π} f (x) dx, a_n = (1/π) ∫_{−π}^{π} f (x) cos nx dx, b_n = (1/π) ∫_{−π}^{π} f (x) sin nx dx

The values {a_n} and {b_n} are called the finite cosine and sine transform of f, respectively. Because

cos nx = (1/2)(e^{inx} + e^{−inx}), sin nx = (1/2i)(e^{inx} − e^{−inx})

the Fourier series can be written as

f (x) = Σ_{n=−∞}^∞ c_n e^{−inx}

where

c_n = (1/2)(a_n + i b_n) for n ≥ 0, c_n = (1/2)(a_{−n} − i b_{−n}) for n < 0

and

c_n = (1/2π) ∫_{−π}^{π} f (x) e^{inx} dx

If f is real,

c_{−n} = c_n*

If f is continuous and piecewise continuously differentiable,

f′ (x) = Σ_{n=−∞}^∞ (−i n) c_n e^{−inx}

If f is twice continuously differentiable,

f″ (x) = Σ_{n=−∞}^∞ (−n²) c_n e^{−inx}

Inversion. The Fourier series can be used to solve linear partial differential equations with constant coefficients. For example, in the problem

∂T/∂t = ∂²T/∂x², T (x, 0) = f (x)

let

T = Σ_{n=−∞}^∞ c_n (t) e^{−inx}

Then

Σ_{n=−∞}^∞ (dc_n/dt) e^{−inx} = Σ_{n=−∞}^∞ c_n (t)(−n²) e^{−inx}

Thus, c_n (t) satisfies

dc_n/dt = −n² c_n, or c_n = c_n (0) e^{−n²t}

Let c_n (0) be the Fourier coefficients of the initial condition:

f (x) = Σ_{n=−∞}^∞ c_n (0) e^{−inx}

The formal solution to the problem is

T = Σ_{n=−∞}^∞ c_n (0) e^{−n²t} e^{−inx}

Fourier Transform [40]. When the function f (x) is defined on the entire real line, the Fourier transform is defined as

F [f] ≡ f̂ (ω) = ∫_{−∞}^∞ f (x) e^{iωx} dx

This integral converges if

∫_{−∞}^∞ |f (x)| dx

does. The inverse transformation is

f (x) = (1/2π) ∫_{−∞}^∞ f̂ (ω) e^{−iωx} dω

If f (x) is continuous and piecewise continuously differentiable,

∫_{−∞}^∞ f′ (x) e^{iωx} dx

converges for each ω, and

lim_{x→±∞} f (x) = 0

then

F [df/dx] = −i ω F [f]

A function f (x) is absolutely integrable if the improper integral

∫_{−∞}^∞ |f (x)| dx

has a finite value. Then the improper integral

∫_{−∞}^∞ f (x) dx

converges. The function is square integrable if

∫_{−∞}^∞ |f (x)|² dx

has a finite value. If f (x) and g (x) are square integrable, the product f (x) g (x) is absolutely integrable and satisfies the Schwarz inequality:

|∫_{−∞}^∞ f (x) g (x) dx|² ≤ ∫_{−∞}^∞ |f (x)|² dx · ∫_{−∞}^∞ |g (x)|² dx

The triangle inequality is also satisfied:

[∫_{−∞}^∞ |f + g|² dx]^(1/2) ≤ [∫_{−∞}^∞ |f|² dx]^(1/2) + [∫_{−∞}^∞ |g|² dx]^(1/2)

A sequence of square integrable functions f_n (x) converges in the mean to a square integrable function f (x) if

lim_{n→∞} ∫_{−∞}^∞ |f (x) − f_n (x)|² dx = 0

The sequence also satisfies the Cauchy criterion

lim_{n→∞, m→∞} ∫_{−∞}^∞ |f_n − f_m|² dx = 0

Theorem [40, p. 307]. If a sequence of square integrable functions f_n (x) converges to a function f (x) uniformly on every finite interval a ≤ x ≤ b, and if it satisfies Cauchy's criterion, then f (x) is square integrable and f_n (x) converges to f (x) in the mean.
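The transform definition f̂ (ω) = ∫ f (x) e^{iωx} dx can be checked by direct quadrature. The following sketch (plain Python; the helper name `fourier_transform` and the truncation length `L` are our choices) computes the transform of the Gaussian e^{−x²}, whose exact transform under this convention is √π e^{−ω²/4}.

```python
import math

def fourier_transform(f, omega, L=10.0, n=2000):
    # f_hat(omega) = integral of f(x)*exp(i*omega*x) over [-L, L],
    # trapezoidal rule; L is chosen so the tail of f is negligible.
    h = 2.0 * L / n
    re = im = 0.0
    for k in range(n + 1):
        x = -L + k * h
        w = 0.5 if k in (0, n) else 1.0
        re += w * f(x) * math.cos(omega * x) * h
        im += w * f(x) * math.sin(omega * x) * h
    return re, im

f = lambda x: math.exp(-x * x)
for omega in (0.0, 1.0, 2.0):
    re, im = fourier_transform(f, omega)
    exact = math.sqrt(math.pi) * math.exp(-omega * omega / 4.0)
    print(omega, re, exact)   # re matches exact; im is ~0 (f is real and even)
```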

Theorem (Riesz – Fischer) [40, p. 308]. To every sequence of square integrable functions f_n (x) that satisfy Cauchy's criterion, there corresponds a square integrable function f (x) such that f_n (x) converges to f (x) in the mean. Thus, the limit in the mean of a sequence of functions is defined to within a null function.

Square integrable functions satisfy the Parseval equation:

∫_{−∞}^∞ |f̂ (ω)|² dω = 2π ∫_{−∞}^∞ |f (x)|² dx

This is also the total power in a signal, which can be computed in either the time or the frequency domain. Also

∫_{−∞}^∞ f̂ (ω) ĝ* (ω) dω = 2π ∫_{−∞}^∞ f (x) g* (x) dx

Fourier transforms can be used to solve differential equations, too. Then it is necessary to find the inverse transformation. If f (x) is square integrable, the Fourier transform of its Fourier transform is 2π f (−x):

F [F [f]] (x) = 2π f (−x), or f (x) = (1/2π) F [F [f]] (−x)

Properties of Fourier Transforms [40, p. 324], [15].

F [df/dx] = −i ω f̂ (ω)
F [i x f (x)] = d f̂/dω
F [f (a x − b)] = (1/|a|) e^{iωb/a} f̂ (ω/a)
F [e^{icx} f (x)] = f̂ (ω + c)
F [cos ω0x f (x)] = (1/2)[f̂ (ω + ω0) + f̂ (ω − ω0)]
F [sin ω0x f (x)] = (1/2i)[f̂ (ω + ω0) − f̂ (ω − ω0)]
F [e^{−iω0x} f (x)] = f̂ (ω − ω0)

If f (x) is real, then f̂ (−ω) = f̂* (ω). If f (x) is imaginary, then f̂ (−ω) = −f̂* (ω). If f (x) is even, then f̂ (ω) is even. If f (x) is odd, then f̂ (ω) is odd. If f (x) is real and even, then f̂ (ω) is real and even. If f (x) is real and odd, then f̂ (ω) is imaginary and odd. If f (x) is imaginary and even, then f̂ (ω) is imaginary and even. If f (x) is imaginary and odd, then f̂ (ω) is real and odd.

Convolution [40, p. 326].

f*h (x0) ≡ ∫_{−∞}^∞ f (x0 − x) h (x) dx = (1/2π) ∫_{−∞}^∞ e^{−iωx0} f̂ (ω) ĥ (ω) dω

Theorem. The product

f̂ (ω) ĥ (ω)

is the Fourier transform of the convolution product f*h. The convolution permits finding inverse transformations when solving differential equations. To solve

∂T/∂t = ∂²T/∂x², T (x, 0) = f (x), −∞ < x < ∞

take the Fourier transform:

dT̂/dt + ω² T̂ = 0, T̂ (ω, 0) = f̂ (ω)

The solution is

T̂ (ω, t) = f̂ (ω) e^{−ω²t}

The inverse transformation is

T (x, t) = (1/2π) ∫_{−∞}^∞ e^{−iωx} f̂ (ω) e^{−ω²t} dω

Because

e^{−ω²t} = F [(1/√(4πt)) e^{−x²/4t}]

the convolution integral can be used to write the solution as

T (x, t) = (1/√(4πt)) ∫_{−∞}^∞ f (y) e^{−(x−y)²/4t} dy
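The convolution form of the heat-equation solution can be evaluated directly. The sketch below (plain Python; the helper name `heat_solution` and the truncation length `L` are ours) uses a Gaussian initial condition f (y) = e^{−y²}, for which the exact solution is known in closed form, T (x, t) = e^{−x²/(1+4t)}/√(1 + 4t), since the convolution of two Gaussians is again a Gaussian.

```python
import math

def heat_solution(f, x, t, L=12.0, n=4000):
    # T(x,t) = (1/sqrt(4*pi*t)) * integral of f(y)*exp(-(x-y)^2/(4t)) dy,
    # trapezoidal rule on [-L, L].
    h = 2.0 * L / n
    s = 0.0
    for k in range(n + 1):
        y = -L + k * h
        w = 0.5 if k in (0, n) else 1.0
        s += w * f(y) * math.exp(-(x - y) ** 2 / (4.0 * t)) * h
    return s / math.sqrt(4.0 * math.pi * t)

f = lambda y: math.exp(-y * y)     # Gaussian initial temperature profile
x, t = 0.7, 0.5
exact = math.exp(-x * x / (1.0 + 4.0 * t)) / math.sqrt(1.0 + 4.0 * t)
print(heat_solution(f, x, t), exact)
```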

Finite Fourier Sine and Cosine Transforms [41]. For a function F (x) defined on 0 < x < π, the finite Fourier sine and cosine transforms are

f_s (n) = F_s^n [F] = (2/π) ∫_0^π F (x) sin nx dx (n = 1, 2, . . .)

f_c (n) = F_c^n [F] = (2/π) ∫_0^π F (x) cos nx dx (n = 0, 1, . . .)

They obey the operational properties

F_s^n [d²F/dx²] = −n² F_s^n [F] + (2n/π) [F (0) − (−1)^n F (π)]

where F, F′ are continuous and F″ is piecewise continuous on 0 ≤ x ≤ π. Also,

F_s^n [dF/dx] = −n F_c^n [F]

F_c^n [dF/dx] = n F_s^n [F] − (2/π) F (0) + (−1)^n (2/π) F (π)

When two functions F (x) and G (x) are defined on the interval −2π < x < 2π, convolution properties analogous to those for the infinite transforms can be derived [41].

Table 3. Finite sine transforms [41] (here f_s (n) = ∫_0^π F (x) sin nx dx)

f_s (n) — F (x) (0 < x < π)
n/(n² + c²) — sinh c (π − x)/sinh cπ
n/(n² − k²) (|k| ≠ 0, 1, 2, . . .) — sin k (π − x)/sin kπ
0 (n ≠ m); f_s (m) = π/2 — sin mx (m = 1, 2, . . .)
n [1 − (−1)^n cos kπ]/(n² − k²) (|k| ≠ 1, 2, . . .) — cos kx
n [1 − (−1)^{n+m}]/(n² − m²) (n ≠ m); f_s (m) = 0 — cos mx (m = 1, 2, . . .)

Here k is a constant. The material is reproduced with permission of McGraw-Hill, Inc.

Table 4. Finite cosine transforms [41] (here f_c (n) = ∫_0^π F (x) cos nx dx)

f_c (n) — F (x) (0 < x < π)
(−1)^n f_c (n) — F (π − x)
0 (n = 1, 2, . . .); f_c (0) = π — 1
1/(n² + c²) — cosh c (π − x)/(c sinh cπ)
[(−1)^n cos kπ − 1]/(n² − k²) (|k| ≠ 0, 1, . . .) — (1/k) sin kx
1/(n² − k²) (|k| ≠ 0, 1, . . .) — −cos k (π − x)/(k sin kπ)
0 (n ≠ m); f_c (m) = π/2 (m = 1, 2, . . .) — cos mx

Here k is a constant. The material is reproduced with permission of McGraw-Hill, Inc.

On the semi-infinite domain, 0 < x < ∞, the Fourier sine and cosine transforms are

F_s^ω [f] ≡ ∫_0^∞ f (x) sin ωx dx, F_c^ω [f] ≡ ∫_0^∞ f (x) cos ωx dx

and

f (x) = (2/π) F_s^ω [F_s^ω [f]], f (x) = (2/π) F_c^ω [F_c^ω [f]]

The sine transform is an odd function of ω, whereas the cosine transform is an even function of ω. Also,

F_s^ω [d²f/dx²] = f (0) ω − ω² F_s^ω [f]

F_c^ω [d²f/dx²] = − (df/dx)(0) − ω² F_c^ω [f]

provided f (x) and f′ (x) → 0 as x → ∞. Thus, the sine transform is useful when f (0) is known and the cosine transform is useful when f′ (0) is known. Hsu and Dranoff [42] solved a chemical engineering problem by applying finite Fourier transforms.

4.2. Laplace Transforms

Consider a function F (t) defined for t > 0. The Laplace transform of F (t) is [41]

L [F] = f (s) = ∫_0^∞ e^{−st} F (t) dt

The transform exists when F (t) is piecewise continuous and e^{−st} F (t) is bounded for all t > T, for some finite T.

The unit step function is

S_k (t) = 0, 0 ≤ t < k; 1, t > k

and its Laplace transform is

L [S_k (t)] = e^{−ks}/s

In particular, if k = 0 then

L [1] = 1/s

The Laplace transforms of the first and second derivatives of F (t) are

L [dF/dt] = s f (s) − F (0)

L [d²F/dt²] = s² f (s) − s F (0) − (dF/dt)(0)

More generally,

L [d^n F/dt^n] = s^n f (s) − s^{n−1} F (0) − s^{n−2} (dF/dt)(0) − · · · − (d^{n−1}F/dt^{n−1})(0)

The inverse Laplace transformation is

F (t) = L^{−1} [f (s)], where f (s) = L [F]

The inverse Laplace transformation is not unique because functions that are identical except for isolated points have the same Laplace transform. They are unique to within a null function. Thus, if

L [F1] = f (s) and L [F2] = f (s)

it must be that

F2 = F1 + N (t), where ∫_0^T N (t) dt = 0 for every T

Laplace transforms can be inverted by using Table 5, but knowledge of several rules is helpful.

Substitution.

f (s − a) = L [e^{at} F (t)]

This can be used with polynomials. Suppose

f (s) = 1/s + 1/(s + 3) = (2s + 3)/[s (s + 3)]

Because L [1] = 1/s and, by substitution, L [e^{−3t}] = 1/(s + 3),

F (t) = 1 + e^{−3t}, t ≥ 0
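The definition of the transform and the derivative rule L [dF/dt] = s f (s) − F (0) can be spot-checked by quadrature. In this sketch (plain Python; the helper name `laplace` and the truncation length `T` are ours) F (t) = e^{at} with a = −1/2, so f (s) = 1/(s − a).

```python
import math

def laplace(F, s, T=60.0, n=60000):
    # f(s) = integral of exp(-s*t)*F(t) over [0, T], trapezoidal rule;
    # T is chosen so the neglected tail is negligible.
    h = T / n
    total = 0.0
    for k in range(n + 1):
        t = k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.exp(-s * t) * F(t) * h
    return total

a, s = -0.5, 1.0
F = lambda t: math.exp(a * t)
print(laplace(F, s), 1.0 / (s - a))          # both ~ 2/3
# derivative rule: L[F'] = s*f(s) - F(0)
Fprime = lambda t: a * math.exp(a * t)
print(laplace(Fprime, s), s * laplace(F, s) - F(0.0))
```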

More generally, translation gives the following.

Translation.

f (as − b) = f (a (s − b/a)) = L [(1/a) e^{bt/a} F (t/a)], a > 0

The staircase step function

S (t) = 0, 0 ≤ t < h; 1, h ≤ t < 2h; 2, 2h ≤ t < 3h; . . .

has the Laplace transform

L [S (t)] = (1/s) · 1/(e^{hs} − 1)

The Dirac delta function δ (t − t0) (see Equation 116 in → Mathematical Modeling) has the property

∫_0^∞ δ (t − t0) F (t) dt = F (t0)

Its Laplace transform is

L [δ (t − t0)] = e^{−st0}, t0 ≥ 0, s > 0

The square wave function illustrated in Figure 10 has Laplace transform

L [F_c (t)] = (1/s) tanh (cs/2)

The triangular wave function illustrated in Figure 11 has Laplace transform

L [T_c (t)] = (1/s²) tanh (cs/2)

Other Laplace transforms are listed in Table 5.

Table 5. Laplace transforms (see [23] for a more complete list)

L [F] — F (t)
1/s — 1
1/s² — t
1/s^n (n = 1, 2, . . .) — t^{n−1}/(n − 1)!
1/√s — 1/√(πt)
s^{−3/2} — 2 √(t/π)
Γ (k)/s^k (k > 0) — t^{k−1}
1/(s − a) — e^{at}
1/(s − a)^n (n = 1, 2, . . .) — t^{n−1} e^{at}/(n − 1)!
Γ (k)/(s − a)^k (k > 0) — t^{k−1} e^{at}
1/[(s − a)(s − b)] — (e^{at} − e^{bt})/(a − b)
s/[(s − a)(s − b)] — (a e^{at} − b e^{bt})/(a − b)
1/(s² + a²) — (1/a) sin at
s/(s² + a²) — cos at
1/(s² − a²) — (1/a) sinh at
s/(s² − a²) — cosh at
s/(s² + a²)² — t sin at/(2a)
(s² − a²)/(s² + a²)² — t cos at
1/[(s − a)² + b²] — (1/b) e^{at} sin bt
(s − a)/[(s − a)² + b²] — e^{at} cos bt
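The square-wave transform (1/s) tanh (cs/2) can be spot-checked numerically. Figure 10 is not reproduced here, so the code below assumes the usual alternating wave of amplitude ±1 with half-period c; the helper names are ours.

```python
import math

def square_wave(t, c):
    # assumed Figure-10 wave: +1 on (0, c), -1 on (c, 2c), period 2c
    return 1.0 if (t // c) % 2 == 0 else -1.0

def laplace(F, s, T=80.0, n=80000):
    # f(s) = integral of exp(-s*t)*F(t) over [0, T], trapezoidal rule
    h = T / n
    total = 0.0
    for k in range(n + 1):
        t = k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.exp(-s * t) * F(t) * h
    return total

s, c = 1.0, 2.0
print(laplace(lambda t: square_wave(t, c), s))
print(math.tanh(c * s / 2.0) / s)    # tabulated value (1/s)*tanh(cs/2)
```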

Convolution properties are also satisfied:

F (t) * G (t) = ∫_0^t F (τ) G (t − τ) dτ

and

f (s) g (s) = L [F (t) * G (t)]

Figure 10. Square wave function

Figure 11. Triangular wave function

Partial Fractions [43]. Suppose q (s) has m factors

q (s) = (s − a1)(s − a2) · · · (s − am)

All the factors are linear, none are repeated, and the an are all distinct. If p (s) has a smaller degree than q (s), the Heaviside expansion can be used to evaluate the inverse transformation:

L^{−1} [p (s)/q (s)] = Σ_{i=1}^m [p (ai)/q′ (ai)] e^{ai t}

If the factor (s − a) is repeated m times, then

f (s) = p (s)/q (s) = A_m/(s − a)^m + A_{m−1}/(s − a)^{m−1} + · · · + A_1/(s − a) + h (s)

where

ϕ (s) ≡ (s − a)^m p (s)/q (s)

A_m = ϕ (a), A_k = [1/(m − k)!] d^{m−k}ϕ/ds^{m−k} |_{s=a}, k = 1, . . ., m − 1

The term h (s) denotes the sum of partial fractions not under consideration. The inverse transformation is then

F (t) = e^{at} [A_m t^{m−1}/(m − 1)! + A_{m−1} t^{m−2}/(m − 2)! + · · · + A_2 t/1! + A_1] + H (t)

The term in F (t) corresponding to

(s − a) in q (s) is ϕ (a) e^{at}

(s − a)² in q (s) is [ϕ′ (a) + ϕ (a) t] e^{at}

(s − a)³ in q (s) is (1/2) [ϕ″ (a) + 2 ϕ′ (a) t + ϕ (a) t²] e^{at}

For example, let

f (s) = 1/[(s − 2)(s − 1)²]

For the factor s − 2,

ϕ (s) = 1/(s − 1)², ϕ (2) = 1

For the factor (s − 1)²,

ϕ (s) = 1/(s − 2), ϕ′ (s) = −1/(s − 2)², ϕ (1) = −1, ϕ′ (1) = −1

The inverse Laplace transform is then

F (t) = e^{2t} + [−1 − t] e^{t}

Derivatives of Laplace Transforms. The Laplace integrals L [F (t)], L [t F (t)], L [t² F (t)], . . . are uniformly convergent for s ≥ s1 > x0, where F is O [exp (x0 t)], and

lim_{s→∞} f (s) = 0, lim_{s→∞} L [t^n F (t)] = 0, n = 1, 2, . . .

and

d^n f/ds^n = L [(−t)^n F (t)]

Integration of Laplace Transforms.

∫_s^∞ f (ξ) dξ = L [F (t)/t]

If F (t) is a periodic function, F (t) = F (t + a), then

f (s) = [1/(1 − e^{−as})] ∫_0^a e^{−st} F (t) dt, where F (t) = F (t + a)
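The repeated-factor example just worked, F (t) = e^{2t} − (1 + t) e^{t} with f (s) = 1/[(s − 2)(s − 1)²], can be confirmed by numerically transforming F (t) back (plain Python; the helper name `laplace` and truncation length are ours). Note s > 2 is required for convergence.

```python
import math

def laplace(F, s, T=60.0, n=60000):
    # f(s) = integral of exp(-s*t)*F(t) over [0, T], trapezoidal rule
    h = T / n
    total = 0.0
    for k in range(n + 1):
        t = k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.exp(-s * t) * F(t) * h
    return total

F = lambda t: math.exp(2.0 * t) - (1.0 + t) * math.exp(t)
for s in (3.0, 4.0, 5.0):
    print(s, laplace(F, s), 1.0 / ((s - 2.0) * (s - 1.0) ** 2))
```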

Quadratic Factors. Let p (s) and q (s) have real coefficients, and q (s) have the factor

(s − a)² + b², b > 0

where a and b are real numbers. Then define ϕ (s) and h (s) and real constants A and B such that

f (s) = p (s)/q (s) = ϕ (s)/[(s − a)² + b²] = (As + B)/[(s − a)² + b²] + h (s)

Let ϕ1 and ϕ2 be the real and imaginary parts of the complex number ϕ (a + i b):

ϕ (a + i b) ≡ ϕ1 + i ϕ2

Then the corresponding term of the inverse transform is (a standard result for this expansion [41])

F (t) = (e^{at}/b) [ϕ1 sin bt + ϕ2 cos bt] + H (t)

To solve ordinary differential equations by using these results:

Y″ (t) − 2 Y′ (t) + Y (t) = e^{2t}, Y (0) = 0, Y′ (0) = 0

Taking Laplace transforms,

L [Y″ (t)] − 2 L [Y′ (t)] + L [Y (t)] = 1/(s − 2)

s² y (s) − s Y (0) − Y′ (0) − 2 [s y (s) − Y (0)] + y (s) = 1/(s − 2)

and combining terms

(s² − 2s + 1) y (s) = 1/(s − 2), y (s) = 1/[(s − 2)(s − 1)²]

The partial-fraction expansion worked out above then gives

Y (t) = e^{2t} − (1 + t) e^{t}

To solve an integral equation:

Y (t) = a + 2 ∫_0^t Y (τ) cos (t − τ) dτ

it is written as

Y (t) = a + Y (t) * cos t

Then the Laplace transform is used to obtain

y (s) = a/s + 2 y (s) s/(s² + 1)

or

y (s) = a (s² + 1)/[s (s − 1)²]

Taking the inverse transformation gives

Y (t) = a (1 + 2 t e^t)

Next, let the variable s in the Laplace transform be complex. F (t) is still a real-valued function of the positive real variable t. The properties given above are still valid for s complex, but additional techniques are available for evaluating the integrals. Let the real-valued function be O [exp (x0 t)]:

|F (t)| < M e^{x0 t}

Then f (s) is analytic for x = ℜ (s) > x0 and the transform integral is absolutely convergent there; it is uniformly convergent on x ≥ x1 > x0. Also

d^n f/ds^n = L [(−t)^n F (t)], n = 1, 2, . . ., x > x0

and

f* (s) = f (s*)

The functions |f (s)| and |y f (s)| are bounded in the half-plane x ≥ x1 > x0 and f (s) → 0 as |y| → ∞ for each fixed x. Thus,

lim_{y→±∞} f (x + i y) = 0, x > x0

If F (t) is continuous, F′ (t) is piecewise continuous, and both functions are O [exp (x0 t)], then |f (s)| is O (1/s) in each half-plane x ≥ x1 > x0:

|s f (s)| < M

The additional constraint F (0) = 0 is necessary and sufficient for |f (s)| to be O (1/s²).

Inversion Integral [41]. Cauchy's integral formula for f (s) analytic and O (s^{−k}) in a half-plane x ≥ γ, k > 0, is

f (s) = (1/2πi) lim_{β→∞} ∫_{γ−iβ}^{γ+iβ} f (z)/(s − z) dz, ℜ (s) > γ

Applying the inverse Laplace transformation on either side of this equation gives

F (t) = (1/2πi) lim_{β→∞} ∫_{γ−iβ}^{γ+iβ} e^{zt} f (z) dz

If F (t) is of order O [exp (x0 t)] and F (t) and F′ (t) are piecewise continuous, the inversion integral exists. At any point t0 where F (t) is discontinuous, the inversion integral represents the mean value

F (t0) = lim_{ε→0} (1/2) [F (t0 + ε) + F (t0 − ε)]

When t = 0 the inversion integral represents 0.5 F (0+) and when t < 0, it has the value zero.

If f (s) is a function of the complex variable s that is analytic and of order O (s^{−k−m}) on ℜ (s) ≥ x0, where k > 1 and m is a positive integer, then the inversion integral converges to F (t) and

d^n F/dt^n = (1/2πi) lim_{β→∞} ∫_{γ−iβ}^{γ+iβ} e^{zt} z^n f (z) dz, n = 1, 2, . . ., m

Also F (t) and its n derivatives are continuous functions of t of order O [exp (x0 t)] and they vanish at t = 0:

F (0) = F′ (0) = · · · = F^{(m)} (0) = 0

An important inversion integral is when

f (s) = (1/s) exp (−s^{1/2})

The inverse transform is

F (t) = 1 − erf [1/(2 √t)] = erfc [1/(2 √t)]

where erf is the error function and erfc the complementary error function.

Series of Residues [41]. Let f (s) be an analytic function except for a set of isolated singular points. An isolated singular point is one for which f (z) is analytic for 0 < |z − z0| < ϱ but z0 is a singularity of f (z). An isolated singular point is either a pole, a removable singularity, or an essential singularity. If f (z) is not defined in the neighborhood of z0 but can be made analytic at z0 simply by defining it at some additional points, then z0 is a removable singularity. The function f (z) has a pole of order k ≥ 1 at z0 if (z − z0)^k f (z) has a removable singularity at z0 whereas (z − z0)^{k−1} f (z) has an unremovable isolated singularity at z0. Any isolated singularity that is not a pole or a removable singularity is an essential singularity.

Let the function f (z) be analytic except for the isolated singular points s1, s2, . . ., sn. Let ϱn (t) be the residue of e^{zt} f (z) at z = sn (for definition of residue, see Section 3.4). Then

F (t) = Σ_{n=1}^∞ ϱn (t)

When sn is a simple pole,

ϱn (t) = lim_{z→sn} (z − sn) e^{zt} f (z) = e^{sn t} lim_{z→sn} (z − sn) f (z)

When

f (z) = p (z)/q (z)

where p (z) and q (z) are analytic at z = sn, p (sn) ≠ 0, then

ϱn (t) = [p (sn)/q′ (sn)] e^{sn t}

If sn is a removable pole of f (s), of order m, then

ϕn (z) = (z − sn)^m f (z)

is analytic at sn and the residue is

ϱn (t) = [1/(m − 1)!] ∂^{m−1} [ϕn (z) e^{zt}]/∂z^{m−1} |_{z=sn}

4.3. Solution of Partial Differential Equations by Using Transforms

A common problem facing chemical engineers is to solve the heat conduction equation or diffusion equation

ϱ Cp ∂T/∂t = k ∂²T/∂x² or ∂c/∂t = D ∂²c/∂x²

The equations can be solved on an infinite domain −∞ < x < ∞, a semi-infinite domain 0 ≤ x < ∞, or a finite domain 0 ≤ x ≤ L. At a boundary, the conditions can be a fixed temperature T (0, t) = T0 (boundary condition of the first kind, or Dirichlet condition), or a fixed flux −k ∂T/∂x (0, t) = q0 (boundary condition of the second kind, or Neumann condition), or a

combination −k ∂T/∂x (0, t) = h [T (0, t) − T0] (boundary condition of the third kind, or Robin condition).

The functions T0 and q0 can be functions of time. All properties are constant (ϱ, Cp, k, D, h), so that the problem remains linear. Solutions are presented on all domains with various boundary conditions for the heat conduction problem

∂T/∂t = α ∂²T/∂x², α = k/(ϱ Cp)

Problem 1. Infinite domain, on −∞ < x < ∞.

T (x, 0) = f (x), initial conditions; T (x, t) bounded

Solution is via Fourier transforms:

T̂ (ω, t) = ∫_{−∞}^∞ T (x, t) e^{iωx} dx

Applied to the differential equation,

F [α ∂²T/∂x²] = −ω² α F [T]

∂T̂/∂t + ω² α T̂ = 0, T̂ (ω, 0) = f̂ (ω)

By solving,

T̂ (ω, t) = f̂ (ω) e^{−ω²αt}

the inverse transformation gives [40, p. 328], [44, p. 58]

T (x, t) = (1/2π) lim_{L→∞} ∫_{−L}^{L} e^{−iωx} f̂ (ω) e^{−ω²αt} dω

Another solution is via Laplace transforms; take the Laplace transform of the original differential equation:

s t (s, x) − f (x) = α ∂²t/∂x²

This equation can be solved with Fourier transforms [40, p. 355]:

t (s, x) = [1/(2 √(sα))] ∫_{−∞}^∞ e^{−√(s/α) |x−y|} f (y) dy

The inverse transformation is [40, p. 357], [44, p. 53]

T (x, t) = [1/√(4παt)] ∫_{−∞}^∞ e^{−(x−y)²/4αt} f (y) dy

Problem 2. Semi-infinite domain, boundary condition of the first kind, on 0 ≤ x < ∞

T (x, 0) = T0 = constant, T (0, t) = T1 = constant

The solution is

T (x, t) = T0 + (T1 − T0) {1 − erf [x/√(4αt)]}, or T (x, t) = T0 + (T1 − T0) erfc [x/√(4αt)]

Problem 3. Semi-infinite domain, boundary condition of the first kind, on 0 ≤ x < ∞

T (x, 0) = f (x), T (0, t) = g (t)

The solution is written as

T (x, t) = T1 (x, t) + T2 (x, t)

where

T1 (x, 0) = f (x), T2 (x, 0) = 0

T1 (0, t) = 0, T2 (0, t) = g (t)

Then T1 is solved by taking the sine transform

U1 = F_s^ω [T1]

∂U1/∂t = −ω² α U1, U1 (ω, 0) = F_s^ω [f]

Thus,

U1 (ω, t) = F_s^ω [f] e^{−ω²αt}

and [40, p. 322]

T1 (x, t) = (2/π) ∫_0^∞ F_s^ω [f] e^{−ω²αt} sin ωx dω

Solve for T2 by taking the sine transform

U2 = F_s^ω [T2]

∂U2/∂t = −ω² α U2 + α g (t) ω, U2 (ω, 0) = 0

Thus,

U2 (ω, t) = α ω ∫_0^t e^{−ω²α(t−τ)} g (τ) dτ

and [40, p. 435]

T2 (x, t) = (2α/π) ∫_0^∞ ω sin ωx ∫_0^t e^{−ω²α(t−τ)} g (τ) dτ dω

The solution for T1 can also be obtained by Laplace transforms:

t1 = L [T1]

Applying this to the differential equation,

s t1 − f (x) = α ∂²t1/∂x², t1 (0, s) = 0

and solving gives

t1 = [1/(2 √(sα))] ∫_0^∞ [e^{−√(s/α) |x−ξ|} − e^{−√(s/α) (x+ξ)}] f (ξ) dξ

One form of the inverse transformation is [41, p. 228]

T1 (x, t) = ∫_0^∞ [1/√(4παt)] [e^{−(x−ξ)²/4αt} − e^{−(x+ξ)²/4αt}] f (ξ) dξ   (7)

Problem 4. Semi-infinite domain, boundary condition of the second kind, on 0 ≤ x < ∞.

T (x, 0) = 0, −k ∂T/∂x (0, t) = q0 = constant

Take the Laplace transform:

t (x, s) = L [T (x, t)]

s t = α ∂²t/∂x², −k ∂t/∂x (0, s) = q0/s

The solution is

t (x, s) = (q0/k) (√α/s^{3/2}) e^{−x √(s/α)}

The inverse transformation is [41, p. 131], [44, p. 75]

T (x, t) = (q0/k) [2 √(αt/π) e^{−x²/4αt} − x erfc (x/√(4αt))]

Problem 5. Semi-infinite domain, boundary conditions of the third kind, on 0 ≤ x < ∞

T (x, 0) = f (x), k ∂T/∂x (0, t) = h T (0, t)

Take the Laplace transform:

s t − f (x) = α ∂²t/∂x², k ∂t/∂x (0, s) = h t (0, s)

The solution is

t (x, s) = ∫_0^∞ f (ξ) g (x, ξ, s) dξ

where [41, p. 227]

g (x, ξ, s) = [1/(2 √(sα))] { exp [−|x − ξ| √(s/α)] + [(√(s/α) − h/k)/(√(s/α) + h/k)] exp [−(x + ξ) √(s/α)] }

and the inverse transformation is [40, p. 437], [44, p. 59]

T (x, t) = (2/π) ∫_0^∞ ∫_0^∞ e^{−β²αt} cos [βx − µ (β)] cos [βξ − µ (β)] f (ξ) dξ dβ, µ (β) = arg (β + i h/k)

Another form of the inverse transformation when f = T0 = constant is [41, p. 231], [44, p. 71]

T (x, t) = T0 { erf [x/√(4αt)] + e^{hx/k} e^{h²αt/k²} erfc [h √(αt)/k + x/√(4αt)] }

Problem 6. Finite domain, boundary condition of the first kind

T (x, 0) = T0 = constant, T (0, t) = T (L, t) = 0

Take the Laplace transform:

s t (x, s) − T0 = α ∂²t/∂x², t (0, s) = t (L, s) = 0

The solution is

t (x, s) = − (T0/s) sinh [√(s/α) x]/sinh [√(s/α) L] − (T0/s) sinh [√(s/α) (L − x)]/sinh [√(s/α) L] + T0/s
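The error-function solution of Problem 2 can be checked numerically with the standard library alone. The sketch below (our helper names; T0 = 0, T1 = 1, α = 1 assumed for simplicity) confirms the boundary value, the far-field initial value, and, via central differences, that the PDE residual ∂T/∂t − α ∂²T/∂x² is negligible.

```python
import math

def T(x, t, T0=0.0, T1=1.0, alpha=1.0):
    # Problem 2: T = T0 + (T1 - T0)*erfc(x / sqrt(4*alpha*t))
    return T0 + (T1 - T0) * math.erfc(x / math.sqrt(4.0 * alpha * t))

x, t, d = 0.8, 0.3, 1e-4
# PDE residual dT/dt - d2T/dx2 by central differences (alpha = 1)
dTdt = (T(x, t + d) - T(x, t - d)) / (2.0 * d)
d2Tdx2 = (T(x + d, t) - 2.0 * T(x, t) + T(x - d, t)) / (d * d)
print(dTdt - d2Tdx2)     # ~0
print(T(0.0, t))         # boundary value T1 = 1
print(T(8.0, 1e-3))      # far from the wall at small t: ~T0 = 0
```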

The inverse transformation is [41, p. 220], [44, p. 96]

T (x, t) = (4 T0/π) Σ_{n=1,3,5,...} (1/n) e^{−n²π²αt/L²} sin (nπx/L)

or (depending on the inversion technique) [40, pp. 362, 438]

T (x, t) = [T0/√(4παt)] Σ_{n=−∞}^∞ ∫_0^L [e^{−[(x−ξ)+2nL]²/4αt} − e^{−[(x+ξ)+2nL]²/4αt}] dξ

Problem 7. Finite domain, boundary condition of the first kind

T (x, 0) = 0, T (0, t) = 0, T (L, t) = T0 = constant

Take the Laplace transform:

s t (x, s) = α ∂²t/∂x², t (0, s) = 0, t (L, s) = T0/s

The solution is

t (x, s) = (T0/s) sinh [√(s/α) x]/sinh [√(s/α) L]

and the inverse transformation is [41, p. 201], [44, p. 313]

T (x, t) = T0 [x/L + (2/π) Σ_{n=1}^∞ ((−1)^n/n) e^{−n²π²αt/L²} sin (nπx/L)]

An alternate transformation is [41, p. 139], [44, p. 310]

T (x, t) = T0 Σ_{n=0}^∞ { erf [((2n+1) L + x)/√(4αt)] − erf [((2n+1) L − x)/√(4αt)] }

Problem 8. Finite domain, boundary condition of the second kind

T (x, 0) = T0, ∂T/∂x (0, t) = 0, T (L, t) = 0

Take the Laplace transform:

s t (x, s) − T0 = α ∂²t/∂x², ∂t/∂x (0, s) = 0, t (L, s) = 0

The solution is

t (x, s) = (T0/s) {1 − cosh [x √(s/α)]/cosh [L √(s/α)]}

Its inverse is [41, p. 138]

T (x, t) = T0 − T0 Σ_{n=0}^∞ (−1)^n { erfc [((2n+1) L − x)/√(4αt)] + erfc [((2n+1) L + x)/√(4αt)] }

5. Vector Analysis

Notation. A scalar is a quantity having magnitude but no direction (e.g., mass, length, time, temperature, and concentration). A vector is a quantity having both magnitude and direction (e.g., displacement, velocity, acceleration, force). A second-order dyadic has magnitude and two directions associated with it, as defined precisely below. The most common examples are the stress dyadic (or tensor) and the velocity gradient (in fluid flow). Vectors are printed in boldface type and identified in this chapter by lower-case Latin letters. Second-order dyadics are printed in boldface type and are identified in this chapter by capital or Greek letters. Higher order dyadics are not discussed here. Dyadics can be formed from the components of tensors including their directions, and some of the identities for dyadics are more easily proved by using tensor analysis, which is not presented here (see also, → Transport Phenomena, Chap. 1.1.4). Vectors are also first-order dyadics.

Vectors. Two vectors u and v are equal if they have the same magnitude and direction. If they have the same magnitude but the opposite direction, then u = −v. The sum of two vectors is identified geometrically by placing the vector v at the end of the vector u, as shown in Figure 12. The product of a scalar m and a vector u is a vector m u in the same direction as u but with a magnitude that equals the magnitude of u times the scalar m.

Vectors obey commutative and associative laws:

u + v = v + u — Commutative law for addition
u + (v + w) = (u + v) + w — Associative law for addition
m u = u m — Commutative law for scalar multiplication
m (n u) = (m n) u — Associative law for scalar multiplication
(m + n) u = m u + n u — Distributive law
m (u + v) = m u + m v — Distributive law

The same laws are obeyed by dyadics, as well.

Figure 13. Cartesian coordinate system

Figure 12. Addition of vectors

A unit vector is a vector with magnitude 1.0 and some direction. If a vector has some magnitude (i.e., not zero magnitude), a unit vector e_u can be formed by

e_u = u/|u|

The original vector can be represented by the product of the magnitude and the unit vector:

u = |u| e_u

In a cartesian coordinate system the three principal, orthogonal directions are customarily represented by unit vectors, such as {e_x, e_y, e_z} or {i, j, k}. Here, the first notation is used (see Fig. 13). The coordinate system is right handed; that is, if a right-threaded screw rotated from the x to the y direction, it would advance in the z direction. A vector can be represented in terms of its components in these directions, as illustrated in Figure 14. The vector is then written as

u = u_x e_x + u_y e_y + u_z e_z

The magnitude is

|u| = √(u_x² + u_y² + u_z²)

The position vector is

r = x e_x + y e_y + z e_z

with magnitude

|r| = √(x² + y² + z²)

Figure 14. Vector components

Dyadics. The dyadic A is written in component form in cartesian coordinates as

A = A_xx e_x e_x + A_xy e_x e_y + A_xz e_x e_z + A_yx e_y e_x + A_yy e_y e_y + A_yz e_y e_z + A_zx e_z e_x + A_zy e_z e_y + A_zz e_z e_z

Quantities such as e_x e_y are called unit dyadics. They are second-order dyadics and have two directions associated with them, e_x and e_y; the order of the pair is important. The components A_xx, . . ., A_zz are the components of the tensor A_ij, which here is a 3×3 matrix of numbers that are transformed in a certain way when variables undergo a linear transformation. The yx momentum flux can be defined as the flux of x momentum across an area with unit normal in the y direction. Since two directions are involved, a second-order dyadic (or tensor) is needed to represent it, and because the y momentum across an area with unit normal in the x direction may not be the same thing, the order of the indices must be kept straight. The dyadic A is said to be symmetric if

A_ij = A_ji

Here, the indices i and j can take the values x, y, or z; sometimes (x, 1), (y, 2), (z, 3) are identified and the indices take the values 1, 2, or 3. The dyadic A is said to be antisymmetric if

A_ij = −A_ji

The transpose of A is

A^T_ij = A_ji

Any dyadic can be represented as the sum of a symmetric portion and an antisymmetric portion:

A_ij = B_ij + C_ij,  B_ij ≡ ½ (A_ij + A_ji),  C_ij ≡ ½ (A_ij − A_ji)

An ordered pair of vectors is a second-order dyadic:

uv = Σ_i Σ_j e_i e_j u_i v_j

The transpose of this is

(uv)^T = vu

but

uv ≠ vu

The Kronecker delta is defined as

δ_ij = 1 if i = j, 0 if i ≠ j

and the unit dyadic is defined as

δ = Σ_i Σ_j e_i e_j δ_ij

Operations. The dot or scalar product of two vectors is defined as

u·v = |u| |v| cos θ, 0 ≤ θ ≤ π

where θ is the angle between u and v. The scalar product of two vectors is a scalar, not a vector. It is the magnitude of u multiplied by the projection of v on u, or vice versa. The scalar product of u with itself is just the square of the magnitude of u:

u·u = |u|² = u_x² + u_y² + u_z²

The following laws are valid for scalar products:

u·v = v·u (commutative law for scalar products)
u·(v + w) = u·v + u·w (distributive law for scalar products)
e_x·e_x = e_y·e_y = e_z·e_z = 1
e_x·e_y = e_x·e_z = e_y·e_z = 0

If the two vectors u and v are written in component notation, the scalar product is

u·v = u_x v_x + u_y v_y + u_z v_z

If u·v = 0 and u and v are not null vectors, then u and v are perpendicular to each other and θ = π/2.

The single dot product of two dyadics is

A·B = Σ_i Σ_j e_i e_j (Σ_k A_ik B_kj)

The double dot product of two dyadics is

A:B = Σ_i Σ_j A_ij B_ji

Because the dyadics may not be symmetric, the order of indices and which indices are summed are important. The order is made clearer when the dyadics are made from vectors:

(uv)·(wx) = u (v·w) x = ux (v·w)
(uv):(wx) = (u·x) (v·w)

The dot product of a dyadic and a vector is

A·u = Σ_i e_i (Σ_j A_ij u_j)

The cross or vector product is defined by

c = u×v = a |u| |v| sin θ, 0 ≤ θ ≤ π

where a is a unit vector in the direction of u×v. The direction of c is perpendicular to the plane of u and v such that u, v, and c form a right-handed system. If u = v, or u is parallel to v, then θ = 0 and u×v = 0. The following laws are valid for cross products:

u×v = −v×u (commutative law fails for vector product)
u×(v×w) ≠ (u×v)×w (associative law fails for vector product)
u×(v + w) = u×v + u×w (distributive law for vector product)
e_x×e_x = e_y×e_y = e_z×e_z = 0
e_x×e_y = e_z, e_y×e_z = e_x, e_z×e_x = e_y

In component form,

u×v = det [ e_x e_y e_z ; u_x u_y u_z ; v_x v_y v_z ]
    = e_x (u_y v_z − u_z v_y) + e_y (u_z v_x − u_x v_z) + e_z (u_x v_y − u_y v_x)
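These vector and dyadic operations map directly onto array operations; the following is an illustrative numerical check (Python with NumPy; the vectors u and v are arbitrary examples):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 0.0, -1.0])

# Scalar product: u.v = |u||v|cos(theta) = sum of u_i v_i
dot = u @ v

# Cross product: perpendicular to both u and v, right-handed
c = np.cross(u, v)

# Dyad (ordered pair of vectors): the outer product; (uv)^T = vu but uv != vu
uv = np.outer(u, v)
vu = np.outer(v, u)
```

The checks confirm the properties stated above: c·u = c·v = 0, the transpose of the dyad uv equals vu, and uv itself differs from vu because the order of the pair matters.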

This can also be written as

u×v = Σ_i Σ_j ε_kij u_i v_j e_k

where

ε_ijk = 1 if i,j,k is an even permutation of 123, −1 if i,j,k is an odd permutation of 123, 0 if any two of i,j,k are equal

Thus ε_123 = 1, ε_132 = −1, ε_312 = 1, ε_112 = 0, for example.

The magnitude of u×v is the same as the area of a parallelogram with sides u and v. If u×v = 0 and u and v are not null vectors, then u and v are parallel. Certain triple products are useful:

(u·v) w ≠ u (v·w)
u·(v×w) = v·(w×u) = w·(u×v)
u×(v×w) = (u·w) v − (u·v) w
(u×v)×w = (u·w) v − (v·w) u

The cross product of a dyadic and a vector is defined as

A×u = Σ_i Σ_j e_i e_j (Σ_k Σ_l ε_klj A_ik u_l)

The magnitude of a dyadic is

|A| = √(½ (A:A^T)) = √(½ Σ_i Σ_j A_ij²)

There are three invariants of a dyadic. They are called invariants because they take the same value in any coordinate system and are thus an intrinsic property of the dyadic. They are the traces of A, A², A³ [45]:

I = tr A = Σ_i A_ii
II = tr A² = Σ_i Σ_j A_ij A_ji
III = tr A³ = Σ_i Σ_j Σ_k A_ij A_jk A_ki

The invariants can also be expressed as

I_1 = I
I_2 = ½ (I² − II)
I_3 = (1/6) (I³ − 3 I·II + 2 III) = det A

Invariants of two dyadics are available [46]. Because a second-order dyadic has nine components, the characteristic equation

det (λδ − A) = 0

can be formed, where λ is an eigenvalue. This expression is

λ³ − I_1 λ² + I_2 λ − I_3 = 0

An important theorem of Hamilton and Cayley [47] is that a second-order dyadic satisfies its own characteristic equation:

A³ − I_1 A² + I_2 A − I_3 δ = 0 (7)

Thus A³ can be expressed in terms of δ, A, and A². Similarly, higher powers of A can be expressed in terms of δ, A, and A². Decomposition of a dyadic into a symmetric and an antisymmetric part was shown above. The antisymmetric part has zero trace. The symmetric part can be decomposed into a part with a trace (the isotropic part) and a part with zero trace (the deviatoric part):

A = ⅓ δ (δ:A) + ½ [A + A^T − ⅔ δ (δ:A)] + ½ [A − A^T]
    (isotropic)   (deviatoric)              (antisymmetric)

Differentiation. The derivative of a vector is defined in the same way as the derivative of a scalar. Suppose the vector u depends on t. Then

du/dt = lim_{Δt→0} [u(t+Δt) − u(t)] / Δt

If the vector is the position vector r(t), then the difference expression is a vector in the direction of Δr (see Fig. 15). The derivative

dr/dt = lim_{Δt→0} Δr/Δt = lim_{Δt→0} [r(t+Δt) − r(t)] / Δt

is the velocity. The derivative operation obeys the following laws:

d(u+v)/dt = du/dt + dv/dt
d(u·v)/dt = (du/dt)·v + u·(dv/dt)
d(u×v)/dt = (du/dt)×v + u×(dv/dt)
d(ϕu)/dt = (dϕ/dt) u + ϕ du/dt

If the vector u depends on more than one variable, such as x, y, and z, partial derivatives are defined in the usual way. For example,

if u(x, y, z), then ∂u/∂x = lim_{Δx→0} [u(x+Δx, y, z) − u(x, y, z)] / Δx
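The invariant expressions and the Hamilton–Cayley theorem above can be verified numerically for any 3×3 dyadic; a sketch (the matrix A is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.5, 3.0, 1.0],
              [1.0, 0.0, 1.0]])

# Trace invariants: I = tr A, II = tr A^2, III = tr A^3
I   = np.trace(A)
II  = np.trace(A @ A)
III = np.trace(A @ A @ A)

# The alternative forms from the text
I1 = I
I2 = (I**2 - II) / 2
I3 = (I**3 - 3*I*II + 2*III) / 6   # should equal det A

# Hamilton-Cayley: A^3 - I1 A^2 + I2 A - I3 (unit dyadic) = 0
R = A @ A @ A - I1 * (A @ A) + I2 * A - I3 * np.eye(3)
```

The residual R vanishes to rounding error, and I_3 reproduces the determinant, as the theorem requires.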

Rules for differentiation of scalar and vector products are

∂(u·v)/∂x = (∂u/∂x)·v + u·(∂v/∂x)
∂(u×v)/∂x = (∂u/∂x)×v + u×(∂v/∂x)

Differentials of vectors are

du = du_x e_x + du_y e_y + du_z e_z
d(u·v) = du·v + u·dv
d(u×v) = du×v + u×dv
du = (∂u/∂x) dx + (∂u/∂y) dy + (∂u/∂z) dz

Figure 15. Vector differentiation

If a curve is given by r(t), the length of the curve is [43]

L = ∫_a^b √(dr/dt · dr/dt) dt

The arc-length function can also be defined:

s(t) = ∫_a^t √(dr/dt* · dr/dt*) dt*

This gives

(ds/dt)² = dr/dt · dr/dt = (dx/dt)² + (dy/dt)² + (dz/dt)²

Because

dr = dx e_x + dy e_y + dz e_z

then

ds² = dr·dr = dx² + dy² + dz²

The derivative dr/dt is tangent to the curve in the direction of motion:

u = (dr/dt) / |dr/dt|

Also,

u = dr/ds

Differential Operators. The vector differential operator (del operator) ∇ is defined in cartesian coordinates by

∇ = e_x ∂/∂x + e_y ∂/∂y + e_z ∂/∂z

The gradient of a scalar function is defined by

∇ϕ = e_x ∂ϕ/∂x + e_y ∂ϕ/∂y + e_z ∂ϕ/∂z

and is a vector. If ϕ is height or elevation, the gradient is a vector pointing in the uphill direction. The steeper the hill, the larger is the magnitude of the gradient.

The divergence of a vector is defined by

∇·u = ∂u_x/∂x + ∂u_y/∂y + ∂u_z/∂z

and is a scalar. For a volume element ΔV, the net outflow of a vector u over the surface ΔS of the element is

∫_{ΔS} u·n dS

This is related to the divergence by [48, p. 411]

∇·u = lim_{ΔV→0} (1/ΔV) ∫_{ΔS} u·n dS

Thus, the divergence is the net outflow per unit volume.

The curl of a vector is defined by

∇×u = (e_x ∂/∂x + e_y ∂/∂y + e_z ∂/∂z) × (e_x u_x + e_y u_y + e_z u_z)
     = e_x (∂u_z/∂y − ∂u_y/∂z) + e_y (∂u_x/∂z − ∂u_z/∂x) + e_z (∂u_y/∂x − ∂u_x/∂y)

and is a vector. It is related to the integral

∮_C u·ds = ∮_C u_s ds

which is called the circulation of u around path C. This integral depends on the vector and the contour C, in general. If the circulation does not depend on the contour C, the vector is said to be irrotational; if it does, it is rotational. The relationship with the curl is [48, p. 419]

n·(∇×u) = lim_{ΔS→0} (1/ΔS) ∮_C u·ds

Thus, the normal component of the curl equals the net circulation per unit area enclosed by the contour C.

The gradient, divergence, and curl obey a distributive law but not a commutative or associative law:

∇(ϕ + ψ) = ∇ϕ + ∇ψ
∇·(u + v) = ∇·u + ∇·v
∇×(u + v) = ∇×u + ∇×v
∇ϕ ≠ ϕ∇
∇·u ≠ u·∇

Useful formulas are [49]

∇·(ϕu) = ∇ϕ·u + ϕ∇·u
∇×(ϕu) = ∇ϕ×u + ϕ∇×u
∇·(u×v) = v·(∇×u) − u·(∇×v)
∇×(u×v) = v·∇u − v(∇·u) − u·∇v + u(∇·v)
∇×(∇×u) = ∇(∇·u) − ∇²u
∇·(∇ϕ) = ∇²ϕ = ∂²ϕ/∂x² + ∂²ϕ/∂y² + ∂²ϕ/∂z², where ∇² is called the Laplacian operator
∇×(∇ϕ) = 0 (the curl of the gradient of ϕ is zero)
∇·(∇×u) = 0 (the divergence of the curl of u is zero)

Formulas useful in fluid mechanics are

∇·(∇v)^T = ∇(∇·v)
∇·(τ·v) = v·(∇·τ) + τ:∇v
v·∇v = ½ ∇(v·v) − v×(∇×v)

If a coordinate system is transformed by a rotation and translation, the coordinates in the new system (denoted by primes) are given by

[x'; y'; z'] = [l_11 l_12 l_13; l_21 l_22 l_23; l_31 l_32 l_33] [x; y; z] + [a_1; a_2; a_3]

Any function that has the same value in all coordinate systems is an invariant. The gradient of an invariant scalar field is invariant; the same is true for the divergence and curl of invariant vector fields.

The gradient of a vector field is required in fluid mechanics because the velocity gradient is used. It is defined as

∇v = Σ_i Σ_j e_i e_j ∂v_j/∂x_i  and  (∇v)^T = Σ_i Σ_j e_i e_j ∂v_i/∂x_j

The divergence of dyadics is defined by

∇·τ = Σ_i e_i (Σ_j ∂τ_ji/∂x_j)  and  ∇·(ϕuv) = Σ_i e_i (Σ_j ∂(ϕ u_j v_i)/∂x_j)

where τ is any second-order dyadic. Useful relations involving dyadics are

(ϕδ):∇v = ϕ (∇·v)
∇·(ϕδ) = ∇ϕ
∇·(ϕτ) = ∇ϕ·τ + ϕ∇·τ
nt:τ = t·τ·n = τ:nt

A surface can be represented in the form

f(x, y, z) = c = constant

The normal to the surface is given by

n = ∇f / |∇f|

provided the gradient is not zero. Operations can be performed entirely within the surface. Define

δ_II ≡ δ − nn, ∇_II ≡ δ_II·∇, ∂/∂n ≡ n·∇, v_II ≡ δ_II·v, v_n ≡ n·v

Then a vector and the del operator can be decomposed into

v = v_II + n v_n,  ∇ = ∇_II + n ∂/∂n

The velocity gradient can be decomposed into

∇v = ∇_II v_II + (∇_II n) v_n + n ∇_II v_n + n ∂v_II/∂n + nn ∂v_n/∂n

The surface gradient of the normal is the negative of the curvature dyadic of the surface:

∇_II n = −B

The surface divergence is then

∇_II·v = δ_II:∇v = ∇_II·v_II − 2 H v_n

where H is the mean curvature:

H = ½ δ_II:B

The surface curl can be a scalar

∇_II×v = −ε_II:∇v = −ε_II:∇_II v_II = −n·(∇×v),  ε_II = n·ε

or a vector

∇_II×v ≡ ∇_II×v_II = n (∇_II×v) = n n·(∇×v)

Vector Integration [48, pp. 206 – 212]. If u is a vector, then its integral is also a vector:

∫ u(t) dt = e_x ∫ u_x(t) dt + e_y ∫ u_y(t) dt + e_z ∫ u_z(t) dt

If the vector u is the derivative of another vector, then

u = dv/dt,  ∫ u(t) dt = ∫ (dv/dt) dt = v + constant

If r(t) is a position vector that defines a curve C, the line integral is defined by

∫_C u·dr = ∫_C (u_x dx + u_y dy + u_z dz)

Theorems about this line integral can be written in various forms.

Theorem [43]. If the functions appearing in the line integral are continuous in a domain D, then the line integral is independent of the path C if and only if the line integral is zero on every simple closed path in D.

Theorem [43]. If u = ∇ϕ where ϕ is single-valued and has continuous derivatives in D, then the line integral is independent of the path C and the line integral is zero for any closed curve in D.

Theorem [43]. If f, g, and h are continuous functions of x, y, and z, and have continuous first derivatives in a simply connected domain D, then the line integral

∫_C (f dx + g dy + h dz)

is independent of the path if and only if

∂h/∂y = ∂g/∂z, ∂f/∂z = ∂h/∂x, ∂g/∂x = ∂f/∂y

or, if f, g, and h are regarded as the x, y, and z components of a vector v,

∇×v = 0

Consequently, the line integral is independent of the path (and the value is zero for a closed contour) if the three components in it are regarded as the three components of a vector and the vector is derivable from a potential (or has zero curl). The conditions for a vector to be derivable from a potential are just those in the third theorem. In two dimensions this reduces to the more usual theorem.

Theorem [48, p. 207]. If M and N are continuous functions of x and y that have continuous first partial derivatives in a simply connected domain D, then the necessary and sufficient condition for the line integral

∮_C (M dx + N dy)

to be zero around every closed curve C in D is

∂M/∂y = ∂N/∂x

If a vector is integrated over a surface with incremental area dS and normal to the surface n, then the surface integral can be written as

∫∫_S u·dS = ∫∫_S u·n dS

If u is the velocity, then this integral represents the flow rate past the surface S.

Divergence Theorem [48, 49]. If V is a volume bounded by a closed surface S and u is a vector function of position with continuous derivatives, then

∫_V ∇·u dV = ∫∫_S n·u dS = ∫∫_S u·n dS = ∫∫_S u·dS

where n is the normal pointing outward to S. The normal can be written as

n = e_x cos(x, n) + e_y cos(y, n) + e_z cos(z, n)

where, for example, cos(x, n) is the cosine of the angle between the normal n and the x axis. Then the divergence theorem in component form is

∫∫∫_V (∂u_x/∂x + ∂u_y/∂y + ∂u_z/∂z) dx dy dz = ∫∫_S [u_x cos(x,n) + u_y cos(y,n) + u_z cos(z,n)] dS

If the divergence theorem is written for an incremental volume,

∇·u = lim_{ΔV→0} (1/ΔV) ∫_{ΔS} u_n dS

the divergence of a vector can be interpreted as the integral of that quantity over the surface of a closed volume, divided by the volume. If the vector represents the flow of energy and the divergence is positive at a point P, then either a source of energy is present at P or energy is leaving the region around P so that its temperature is decreasing. If the vector represents the flow of mass and the divergence is positive at a point P, then either a source of mass exists at P or the density is decreasing at the point P. For an incompressible fluid the divergence is zero and the rate at which fluid is introduced into a volume must equal the rate at which it is removed.

Various theorems follow from the divergence theorem.

Theorem. If ϕ is a solution to Laplace's equation

∇²ϕ = 0

in a domain D, and the second partial derivatives of ϕ are continuous in D, then the integral of the normal derivative of ϕ over any piecewise smooth closed orientable surface S in D is zero. Suppose u = ϕ∇ψ satisfies the conditions of the divergence theorem; then Green's theorem results from use of the divergence theorem [49]:

∫_V (ϕ∇²ψ + ∇ϕ·∇ψ) dV = ∫∫_S ϕ (∂ψ/∂n) dS

and

∫_V (ϕ∇²ψ − ψ∇²ϕ) dV = ∫∫_S (ϕ ∂ψ/∂n − ψ ∂ϕ/∂n) dS

Also, if ϕ satisfies the conditions of the theorem and is zero on S, then ϕ is zero throughout D. If two functions ϕ and ψ both satisfy the Laplace equation in domain D, and both take the same values on the bounding curve C, then ϕ = ψ; i.e., the solution to the Laplace equation is unique.

The divergence theorem for dyadics is

∫_V ∇·τ dV = ∫∫_S n·τ dS

Stokes Theorem [48, 49]. Stokes theorem says that if S is a surface bounded by a closed, nonintersecting curve C, and if u has continuous derivatives, then

∮_C u·dr = ∫∫_S (∇×u)·n dS = ∫∫_S (∇×u)·dS

The integral around the curve is followed in the counterclockwise direction. In component notation, this is

∮_C [u_x cos(x,s) + u_y cos(y,s) + u_z cos(z,s)] ds =
∫∫_S [(∂u_z/∂y − ∂u_y/∂z) cos(x,n) + (∂u_x/∂z − ∂u_z/∂x) cos(y,n) + (∂u_y/∂x − ∂u_x/∂y) cos(z,n)] dS

Applied in two dimensions, this results in Green's theorem in the plane:

∮_C (M dx + N dy) = ∫∫_S (∂N/∂x − ∂M/∂y) dx dy

The formula for dyadics is

∫∫_S n·(∇×τ) dS = ∮_C τ^T·dr

Representation. Two theorems give information about how to represent vectors that obey certain properties.

Theorem [48, p. 422]. The necessary and sufficient condition that the curl of a vector vanish identically is that the vector be the gradient of some function.

Theorem [48, p. 423]. The necessary and sufficient condition that the divergence of a vector vanish identically is that the vector is the curl of some other vector.

Leibniz Formula. In fluid mechanics and transport phenomena, an important result is the derivative of an integral whose limits of integration are moving. Suppose the region V(t) is moving with velocity v_s. Then Leibniz's rule holds:

d/dt ∫_{V(t)} ϕ dV = ∫_{V(t)} (∂ϕ/∂t) dV + ∫∫_S ϕ v_s·n dS
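The divergence theorem can be checked numerically; the sketch below uses a midpoint rule on the unit cube with u = (x², y², z²), for which both the volume integral of ∇·u = 2x + 2y + 2z and the outward surface flux equal 3 (plain Python, illustrative only):

```python
# Midpoint-rule check of the divergence theorem on the unit cube [0,1]^3
# for u = (x^2, y^2, z^2): div u = 2x + 2y + 2z.
n = 20
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]

# Volume integral of the divergence
vol_integral = sum(2 * (x + y + z) * h**3
                   for x in pts for y in pts for z in pts)

# Outward flux: only the x=1, y=1, z=1 faces contribute (u.n = 1 there);
# on the coordinate planes the relevant component of u vanishes.
flux = 3 * sum(1.0 * h**2 for _ in pts for _ in pts)
```

Both sums agree with the exact value 3 to rounding error, as the theorem demands.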


Curvilinear Coordinates. Many of the relations given above are proved most easily by using tensor analysis rather than dyadics. Once proven, however, the relations are perfectly general in any coordinate system. Displayed here are the specific results for cylindrical and spherical geometries. Results are available for a few other geometries: parabolic cylindrical, paraboloidal, elliptic cylindrical, prolate spheroidal, oblate spheroidal, ellipsoidal, and bipolar coordinates [45, 50].

For cylindrical coordinates, the geometry is shown in Figure 16 (cylindrical coordinate system). The coordinates are related to cartesian coordinates by

x = r cos θ   r = √(x² + y²)
y = r sin θ   θ = arctan (y/x)
z = z         z = z

The unit vectors are related by

e_r = cos θ e_x + sin θ e_y    e_x = cos θ e_r − sin θ e_θ
e_θ = −sin θ e_x + cos θ e_y   e_y = sin θ e_r + cos θ e_θ
e_z = e_z                      e_z = e_z

Derivatives of the unit vectors are

de_θ = −e_r dθ, de_r = e_θ dθ, de_z = 0

Differential operators are given by [45]

∇ = e_r ∂/∂r + e_θ (1/r) ∂/∂θ + e_z ∂/∂z
∇ϕ = e_r ∂ϕ/∂r + e_θ (1/r) ∂ϕ/∂θ + e_z ∂ϕ/∂z
∇²ϕ = (1/r) ∂/∂r (r ∂ϕ/∂r) + (1/r²) ∂²ϕ/∂θ² + ∂²ϕ/∂z² = ∂²ϕ/∂r² + (1/r) ∂ϕ/∂r + (1/r²) ∂²ϕ/∂θ² + ∂²ϕ/∂z²
∇·v = (1/r) ∂(r v_r)/∂r + (1/r) ∂v_θ/∂θ + ∂v_z/∂z
∇×v = e_r [(1/r) ∂v_z/∂θ − ∂v_θ/∂z] + e_θ [∂v_r/∂z − ∂v_z/∂r] + e_z [(1/r) ∂(r v_θ)/∂r − (1/r) ∂v_r/∂θ]
∇·τ = e_r [(1/r) ∂(r τ_rr)/∂r + (1/r) ∂τ_θr/∂θ + ∂τ_zr/∂z − τ_θθ/r]
    + e_θ [(1/r²) ∂(r² τ_rθ)/∂r + (1/r) ∂τ_θθ/∂θ + ∂τ_zθ/∂z + (τ_θr − τ_rθ)/r]
    + e_z [(1/r) ∂(r τ_rz)/∂r + (1/r) ∂τ_θz/∂θ + ∂τ_zz/∂z]
∇v = e_r e_r ∂v_r/∂r + e_r e_θ ∂v_θ/∂r + e_r e_z ∂v_z/∂r
   + e_θ e_r [(1/r) ∂v_r/∂θ − v_θ/r] + e_θ e_θ [(1/r) ∂v_θ/∂θ + v_r/r] + e_θ e_z (1/r) ∂v_z/∂θ
   + e_z e_r ∂v_r/∂z + e_z e_θ ∂v_θ/∂z + e_z e_z ∂v_z/∂z
∇²v = e_r [∂/∂r ((1/r) ∂(r v_r)/∂r) + (1/r²) ∂²v_r/∂θ² + ∂²v_r/∂z² − (2/r²) ∂v_θ/∂θ]
    + e_θ [∂/∂r ((1/r) ∂(r v_θ)/∂r) + (1/r²) ∂²v_θ/∂θ² + ∂²v_θ/∂z² + (2/r²) ∂v_r/∂θ]
    + e_z [(1/r) ∂/∂r (r ∂v_z/∂r) + (1/r²) ∂²v_z/∂θ² + ∂²v_z/∂z²]

For spherical coordinates, the geometry is shown in Figure 17 (spherical coordinate system). The coordinates are related to cartesian coordinates by

x = r sin θ cos ϕ   r = √(x² + y² + z²)
y = r sin θ sin ϕ   θ = arctan (√(x² + y²)/z)
z = r cos θ         ϕ = arctan (y/x)

The unit vectors are related by

e_r = sin θ cos ϕ e_x + sin θ sin ϕ e_y + cos θ e_z
e_θ = cos θ cos ϕ e_x + cos θ sin ϕ e_y − sin θ e_z
e_ϕ = −sin ϕ e_x + cos ϕ e_y
e_x = sin θ cos ϕ e_r + cos θ cos ϕ e_θ − sin ϕ e_ϕ
e_y = sin θ sin ϕ e_r + cos θ sin ϕ e_θ + cos ϕ e_ϕ
e_z = cos θ e_r − sin θ e_θ

Derivatives of the unit vectors are

∂e_r/∂θ = e_θ, ∂e_θ/∂θ = −e_r
∂e_r/∂ϕ = e_ϕ sin θ, ∂e_θ/∂ϕ = e_ϕ cos θ, ∂e_ϕ/∂ϕ = −e_r sin θ − e_θ cos θ

Others are zero. Differential operators are given by [45]

∇ = e_r ∂/∂r + e_θ (1/r) ∂/∂θ + e_ϕ (1/(r sin θ)) ∂/∂ϕ
∇ψ = e_r ∂ψ/∂r + e_θ (1/r) ∂ψ/∂θ + e_ϕ (1/(r sin θ)) ∂ψ/∂ϕ
∇²ψ = (1/r²) ∂/∂r (r² ∂ψ/∂r) + (1/(r² sin θ)) ∂/∂θ (sin θ ∂ψ/∂θ) + (1/(r² sin² θ)) ∂²ψ/∂ϕ²
∇·v = (1/r²) ∂(r² v_r)/∂r + (1/(r sin θ)) ∂(v_θ sin θ)/∂θ + (1/(r sin θ)) ∂v_ϕ/∂ϕ
∇×v = e_r (1/(r sin θ)) [∂(v_ϕ sin θ)/∂θ − ∂v_θ/∂ϕ] + e_θ [(1/(r sin θ)) ∂v_r/∂ϕ − (1/r) ∂(r v_ϕ)/∂r] + e_ϕ (1/r) [∂(r v_θ)/∂r − ∂v_r/∂θ]
∇·τ = e_r [(1/r²) ∂(r² τ_rr)/∂r + (1/(r sin θ)) ∂(τ_θr sin θ)/∂θ + (1/(r sin θ)) ∂τ_ϕr/∂ϕ − (τ_θθ + τ_ϕϕ)/r]
    + e_θ [(1/r³) ∂(r³ τ_rθ)/∂r + (1/(r sin θ)) ∂(τ_θθ sin θ)/∂θ + (1/(r sin θ)) ∂τ_ϕθ/∂ϕ + (τ_θr − τ_rθ − τ_ϕϕ cot θ)/r]
    + e_ϕ [(1/r³) ∂(r³ τ_rϕ)/∂r + (1/(r sin θ)) ∂(τ_θϕ sin θ)/∂θ + (1/(r sin θ)) ∂τ_ϕϕ/∂ϕ + (τ_ϕr − τ_rϕ + τ_ϕθ cot θ)/r]
∇v = e_r e_r ∂v_r/∂r + e_r e_θ ∂v_θ/∂r + e_r e_ϕ ∂v_ϕ/∂r
   + e_θ e_r [(1/r) ∂v_r/∂θ − v_θ/r] + e_θ e_θ [(1/r) ∂v_θ/∂θ + v_r/r] + e_θ e_ϕ (1/r) ∂v_ϕ/∂θ
   + e_ϕ e_r [(1/(r sin θ)) ∂v_r/∂ϕ − v_ϕ/r] + e_ϕ e_θ [(1/(r sin θ)) ∂v_θ/∂ϕ − (v_ϕ/r) cot θ]
   + e_ϕ e_ϕ [(1/(r sin θ)) ∂v_ϕ/∂ϕ + v_r/r + (v_θ/r) cot θ]
∇²v = e_r [∂/∂r ((1/r²) ∂(r² v_r)/∂r) + (1/(r² sin θ)) ∂/∂θ (sin θ ∂v_r/∂θ) + (1/(r² sin² θ)) ∂²v_r/∂ϕ² − (2/(r² sin θ)) ∂(v_θ sin θ)/∂θ − (2/(r² sin θ)) ∂v_ϕ/∂ϕ]
    + e_θ [(1/r²) ∂/∂r (r² ∂v_θ/∂r) + (1/r²) ∂/∂θ ((1/sin θ) ∂(v_θ sin θ)/∂θ) + (1/(r² sin² θ)) ∂²v_θ/∂ϕ² + (2/r²) ∂v_r/∂θ − (2 cot θ/(r² sin θ)) ∂v_ϕ/∂ϕ]
    + e_ϕ [(1/r²) ∂/∂r (r² ∂v_ϕ/∂r) + (1/r²) ∂/∂θ ((1/sin θ) ∂(v_ϕ sin θ)/∂θ) + (1/(r² sin² θ)) ∂²v_ϕ/∂ϕ² + (2/(r² sin θ)) ∂v_r/∂ϕ + (2 cot θ/(r² sin θ)) ∂v_θ/∂ϕ]

6. Ordinary Differential Equations as Initial Value Problems

A differential equation for a function that depends on only one variable (often time) is called an ordinary differential equation. The general solution to the differential equation includes many possibilities; the boundary or initial conditions are required to specify which of those are desired. If all conditions are at one point, the problem is an initial value problem and can be integrated from that point on. If some of the conditions are available at one point and others at another point, the ordinary differential equations become two-point boundary value problems, which are treated in Chapter 7. Initial value problems as ordinary differential equations arise in control of lumped-parameter models, transient models of stirred tank reactors, polymerization reactions and plug-flow reactors, and generally in models where no spatial gradients occur in the unknowns.

6.1. Solution by Quadrature

When only one equation exists, even if it is nonlinear, solving it by quadrature may be possible. For

dy/dt = f(y), y(0) = y_0

the variables in the problem can be separated

dy/f(y) = dt

and integrated:

∫_{y_0}^{y} dy'/f(y') = ∫_0^t dt' = t

If the quadrature can be performed analytically, then the exact solution has been found. For example, consider the kinetics problem with a second-order reaction:

dc/dt = −k c², c(0) = c_0

To find the concentration as a function of time, the variables can be separated and integrated:

dc/c² = −k dt
−1/c = −k t + D

Application of the initial condition gives the solution:

1/c = k t + 1/c_0

For other ordinary differential equations an integrating factor is useful. Consider the problem governing a stirred tank with entering fluid having concentration c_in and flow rate F, as shown in Figure 18 (stirred tank). The flow rate out is also F and the volume of the tank is V. If the tank is completely mixed, the concentration in the tank is c and the concentration of the fluid leaving the tank is also c. The differential equation is then

V dc/dt = F (c_in − c), c(0) = c_0

Upon rearrangement,

dc/dt + (F/V) c = (F/V) c_in

is obtained. An integrating factor is used to solve this equation. The integrating factor is a function that can be used to turn the left-hand side into an exact differential and can be found by using Fréchet differentials [51]. In this case,

exp(Ft/V) [dc/dt + (F/V) c] = d/dt [exp(Ft/V) c]

Thus, the differential equation can be written as

d/dt [exp(Ft/V) c] = exp(Ft/V) (F/V) c_in

This can be integrated once to give

exp(Ft/V) c = c(0) + (F/V) ∫_0^t exp(Ft'/V) c_in(t') dt'

or

c(t) = exp(−Ft/V) c_0 + (F/V) ∫_0^t exp(−F(t − t')/V) c_in(t') dt'

If the integral on the right-hand side can be calculated, the solution can be obtained analytically. If not, the numerical methods described in the next sections can be used. Laplace transforms can also be attempted. However, an analytic solution is so useful that quadrature and an integrating factor should be tried before resorting to numerical solution.

6.2. Explicit Methods

Consider the ordinary differential equation

dy/dt = f(y)

Multiple equations that are still initial value problems can be handled by using the same techniques discussed here. A higher order differential equation

y^(n) + F(y^(n−1), y^(n−2), . . ., y', y) = 0

with initial conditions

G_i(y^(n−1)(0), y^(n−2)(0), . . ., y'(0), y(0)) = 0,  i = 1, . . ., n

can be converted into a set of first-order equations. By using

y_i ≡ y^(i−1) = d^(i−1)y/dt^(i−1) = d(y^(i−2))/dt = dy_{i−1}/dt

the higher order equation can be written as a set of first-order equations:

dy_1/dt = y_2
dy_2/dt = y_3
dy_3/dt = y_4
. . .
dy_n/dt = −F(y_{n−1}, y_{n−2}, . . ., y_2, y_1)

The initial conditions would have to be specified for variables y_1(0), . . ., y_n(0), or equivalently y(0), . . ., y^(n−1)(0). The set of equations is then written as

dy/dt = f(y, t)

All the methods in this chapter are described for a single equation; the methods apply to the multiple equations as well. Taking the single equation in the form

dy/dt = f(y)

multiplying by dt, and integrating once yields

∫_{t^n}^{t^{n+1}} (dy/dt) dt = ∫_{t^n}^{t^{n+1}} f(y(t)) dt

This is

y^{n+1} = y^n + ∫_{t^n}^{t^{n+1}} (dy/dt) dt

The last substitution gives a basis for the various methods. Different interpolation schemes for y(t) provide different integration schemes; using low-order interpolation gives low-order integration schemes [3].

Euler's method is first order:

y^{n+1} = y^n + Δt f(y^n)

Adams – Bashforth Methods. The second-order Adams – Bashforth method is

y^{n+1} = y^n + (Δt/2) [3 f(y^n) − f(y^{n−1})]

The fourth-order Adams – Bashforth method is

y^{n+1} = y^n + (Δt/24) [55 f(y^n) − 59 f(y^{n−1}) + 37 f(y^{n−2}) − 9 f(y^{n−3})]

Notice that the higher order explicit methods require knowing the solution (or the right-hand side) evaluated at times in the past. Because these were calculated to get to the current time, this presents no problem except for starting the evaluation. Then, Euler's method may have to be used with a very small step size for several steps to generate starting values at a succession of time points. The error terms, order of the method, function evaluations per step, and stability limitations are listed in Table 6. The advantage of the fourth-order Adams – Bashforth method is that it uses only one function evaluation per step and yet achieves high-order accuracy. The disadvantage is the necessity of using another method to start.

Runge – Kutta Methods. Runge – Kutta methods are explicit methods that use several function evaluations for each time step. The general form of the methods is

y^{n+1} = y^n + Σ_{i=1}^{v} w_i k_i

with

k_i = Δt f(t^n + c_i Δt, y^n + Σ_{j=1}^{i−1} a_ij k_j)

Runge – Kutta methods traditionally have been written for f(t, y), and that is done here, too. If these equations are expanded and compared with a Taylor series, restrictions can be placed on the parameters of the method to make it first order, second order, etc. Even so, additional parameters can be chosen. A second-order Runge – Kutta method is

y^{n+1} = y^n + (Δt/2) [f^n + f(t^n + Δt, y^n + Δt f^n)]

The midpoint scheme is another second-order Runge – Kutta method:

y^{n+1} = y^n + Δt f(t^n + Δt/2, y^n + (Δt/2) f^n)
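The orders of these methods can be confirmed numerically on dy/dt = −y: halving Δt should halve Euler's error but quarter the error of the second-order (midpoint) Runge – Kutta method. An illustrative sketch:

```python
import math

def euler(f, y0, t_end, n):
    # First-order explicit Euler with n steps
    dt = t_end / n
    y = y0
    for _ in range(n):
        y += dt * f(y)
    return y

def rk2_midpoint(f, y0, t_end, n):
    # Second-order midpoint Runge-Kutta with n steps
    dt = t_end / n
    y = y0
    for _ in range(n):
        y_half = y + 0.5 * dt * f(y)
        y += dt * f(y_half)
    return y

f = lambda y: -y                 # dy/dt = -y, y(0) = 1, exact y(t) = exp(-t)
exact = math.exp(-1.0)

e1 = abs(euler(f, 1.0, 1.0, 100) - exact)
e2 = abs(euler(f, 1.0, 1.0, 200) - exact)          # error halves (order 1)
r1 = abs(rk2_midpoint(f, 1.0, 1.0, 100) - exact)
r2 = abs(rk2_midpoint(f, 1.0, 1.0, 200) - exact)   # error drops ~4x (order 2)
```

The error ratios e1/e2 ≈ 2 and r1/r2 ≈ 4 confirm the first- and second-order behavior stated above.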

Table 6. Properties of integration methods for ordinary differential equations
(columns: error term; order; function evaluations per step; stability limit, λΔt ≤)

Explicit methods:
  Euler: (h²/2) y''; 1; 1; 2.0
  Second-order Adams – Bashforth: (5/12) h³ y'''; 2; 1
  Fourth-order Adams – Bashforth: (251/720) h⁵ y⁽⁵⁾; 4; 1; 0.3
  Second-order Runge – Kutta (midpoint): —; 2; 2; 2.0
  Runge – Kutta – Gill: —; 4; 4; 2.8
  Runge – Kutta – Feldberg: y^{n+1} − z^{n+1}; 5; 6; 3.0

Predictor – corrector methods:
  Second-order Runge – Kutta: —; 2; 2; 2.0
  Adams, fourth-order: —; 4; 2; 1.3

Implicit methods, stability limit ∞:
  Backward Euler: —; 1; many, iterative; ∞
  Trapezoid rule: −(1/12) h³ y'''; 2; many, iterative; 2 *
  Fourth-order Adams – Moulton: —; 4; many, iterative; 3 *

* Oscillation limit, λΔt ≤.

A popular fourth-order method is the Runge – Kutta – Gill method with the formulas

k_1 = Δt f(t^n, y^n)
k_2 = Δt f(t^n + Δt/2, y^n + k_1/2)
k_3 = Δt f(t^n + Δt/2, y^n + a k_1 + b k_2)
k_4 = Δt f(t^n + Δt, y^n + c k_2 + d k_3)
y^{n+1} = y^n + (1/6)(k_1 + k_4) + (1/3)(b k_2 + d k_3)
a = (√2 − 1)/2, b = (2 − √2)/2, c = −√2/2, d = 1 + √2/2

Another fourth-order Runge – Kutta method is given by the Runge – Kutta – Feldberg formulas [52]; although the method is fourth-order, it achieves fifth-order accuracy. The popular integration package RKF 45 is based on this method.

k_1 = Δt f(t^n, y^n)
k_2 = Δt f(t^n + Δt/4, y^n + k_1/4)
k_3 = Δt f(t^n + (3/8) Δt, y^n + (3/32) k_1 + (9/32) k_2)
k_4 = Δt f(t^n + (12/13) Δt, y^n + (1932/2197) k_1 − (7200/2197) k_2 + (7296/2197) k_3)
k_5 = Δt f(t^n + Δt, y^n + (439/216) k_1 − 8 k_2 + (3680/513) k_3 − (845/4104) k_4)
k_6 = Δt f(t^n + Δt/2, y^n − (8/27) k_1 + 2 k_2 − (3544/2565) k_3 + (1859/4104) k_4 − (11/40) k_5)
y^{n+1} = y^n + (25/216) k_1 + (1408/2565) k_3 + (2197/4104) k_4 − (1/5) k_5
z^{n+1} = y^n + (16/135) k_1 + (6656/12825) k_3 + (28561/56430) k_4 − (9/50) k_5 + (2/55) k_6

The value of y^{n+1} − z^{n+1} is an estimate of the error in y^{n+1} and can be used in step-size control schemes.

Generally, a high-order method should be used to achieve high accuracy. The Runge – Kutta – Gill method is popular because it is high order and does not require a starting method (as does the fourth-order Adams – Bashforth method). However, it requires four function evaluations per time step, or four times as many as the Adams – Bashforth method. For problems in which the function evaluations are a significant portion of the calculation time this might be important. Given the speed of computers and the widespread availability of desktop computers, the efficiency of a method is most important only for very large problems that are going to be solved many times. For other problems the most important criterion for choosing a method is probably the time the user spends setting up the problem.
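The use of an embedded error estimate for step-size control can be sketched with a much simpler pair than the Feldberg coefficients: a second-order Heun step with an embedded first-order Euler step, where the gap between the two results plays the role of y^{n+1} − z^{n+1} (illustrative only, not the RKF 45 scheme):

```python
import math

def adaptive_heun(f, y, t, t_end, dt, tol):
    # Embedded Heun(2)/Euler(1) pair with simple step-size control
    while t < t_end - 1e-12:
        dt = min(dt, t_end - t)
        k1 = dt * f(y)
        k2 = dt * f(y + k1)
        y_low = y + k1                  # first-order (Euler) result
        y_high = y + 0.5 * (k1 + k2)    # second-order (Heun) result
        err = abs(y_high - y_low)       # local error estimate
        if err <= tol:
            t += dt
            y = y_high                  # accept the higher-order value
        # adjust the step from the error estimate; clip the change so the
        # step is not altered too drastically, as recommended in the text
        dt *= min(2.0, max(0.1, 0.9 * tol / max(err, 1e-14)))
    return y

y_end = adaptive_heun(lambda y: -y, 1.0, 0.0, 1.0, 0.1, 1e-5)
```

Steps whose estimated error exceeds the tolerance are rejected and retried with a smaller Δt; accepted steps allow Δt to grow again, so the code concentrates effort where the solution changes fastest.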

The stability of an integration method is best estimated by determining the rational polynomial corresponding to the method. Apply the method to the equation

dy/dt = −λy, y(0) = 1

and determine the formula for r_mn:

y^{k+1} = r_mn (λΔt) y^k

The rational polynomial is defined as

r_mn (z) = p_n (z) / q_m (z) ≈ e^{−z}

and is an approximation to exp(−z), called a Padé approximation. The stability limit is the largest positive z for which

|r_mn (z)| ≤ 1

The method is A acceptable if the inequality holds for Re z > 0. It is A(0) acceptable if the inequality holds for z real, z > 0 [53]. The method will not induce oscillations about the true solution provided

r_mn (z) > 0

A method is L acceptable if it is A acceptable and

lim_{z→∞} r_mn (z) = 0

For example, Euler's method gives

y^{n+1} = y^n − λΔt y^n  or  y^{n+1} = (1 − λΔt) y^n  or  r_mn = 1 − λΔt

The stability limit is then

λΔt ≤ 2

The Euler method will not oscillate provided

λΔt ≤ 1

The stability limits listed in Table 6 are obtained in this fashion. The limit for the Euler method is 2.0; for the Runge – Kutta – Gill method it is 2.785; for the Runge – Kutta – Feldberg method it is 3.020. The rational polynomials for the various explicit methods are illustrated in Figure 19 (rational approximations for explicit methods: a) Euler; b) Runge – Kutta 2; c) Runge – Kutta – Gill; d) exact curve; e) Runge – Kutta – Feldberg). As can be seen, the methods approximate the exact solution well as λΔt approaches zero, and the higher order methods give a better approximation at high values of λΔt.

In solving sets of equations

dy/dt = A y + f, y(0) = y_0

all the eigenvalues of the matrix A must be examined. Finlayson [3] and Amundson [54, p. 197 – 199] both show how to transform these equations into an orthogonal form so that each equation becomes one equation in one unknown, for which single equation analysis applies. For linear problems the eigenvalues do not change, so the stability and oscillation limits must be satisfied for every eigenvalue of the matrix A. When solving nonlinear problems the equations are linearized about the solution at the local time, and the analysis applies for small changes in time, after which a new analysis about the new solution must be made. Thus, for nonlinear problems the eigenvalues keep changing.

Richardson extrapolation can be used to improve the accuracy of a method. Step forward one step Δt with a p-th order method. Then redo the problem, this time stepping forward from the same initial point but in two steps of length Δt/2, thus ending at the same point. Call the solution of the one-step calculation y_1 and the solution of the two-step calculation y_2. Then an improved solution at the new time is given by

y = (2^p y_2 − y_1) / (2^p − 1)

This gives a good estimate provided Δt is small enough that the method is truly convergent with order p. This process can also be repeated in the same way Romberg's method was used for quadrature (see Section 2.4).
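Richardson extrapolation is easy to check for Euler's method (p = 1) on dy/dt = −y; an illustrative sketch:

```python
import math

# One Euler step of size dt vs. two steps of size dt/2 for dy/dt = -y, y(0) = 1
f = lambda y: -y
dt, y0 = 0.1, 1.0

y1 = y0 + dt * f(y0)                        # one step of length dt
yh = y0 + 0.5 * dt * f(y0)
y2 = yh + 0.5 * dt * f(yh)                  # two steps of length dt/2

p = 1                                       # Euler is first order
y_extrap = (2**p * y2 - y1) / (2**p - 1)    # Richardson-improved value

exact = math.exp(-dt)
```

The extrapolated value is markedly closer to exp(−Δt) than either raw Euler result, illustrating the error cancellation that the formula provides.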

The accuracy of a numerical calculation depends on the step size used, and this is chosen automatically by efficient codes. For example, in the Euler method the local truncation error LTE is

LTE = (Δt²/2) y''^n

Yet the second derivative can be evaluated by using the difference formulas as

Δt² y''^n ≈ Δt ∇y'^n = Δt (y'^n − y'^{n−1}) = Δt (f^n − f^{n−1})

Thus, by monitoring the difference between the right-hand side from one time step to another, an estimate of the truncation error is obtained. This error can be reduced by reducing Δt. If the user specifies a criterion for the largest local error estimate, then Δt is reduced to meet that criterion. Also, Δt is increased to as large a value as possible, because this shortens computation time. If the local truncation error has been achieved (and estimated) by using a step size Δt_1,

LTE = c Δt_1^p

and the desired error is ε, to be achieved using a step size Δt_2,

ε = c Δt_2^p

then the next step size Δt_2 is taken from

LTE/ε = (Δt_1/Δt_2)^p

Generally, things should not be changed too often or too drastically. Thus one may choose not to increase Δt by more than a factor (such as 2) or to increase Δt more than once every so many steps (such as 5) [55]. In the most sophisticated codes the alternative exists to change the order of the method as well. In this case, the truncation errors of the orders one higher and one lower than the current one are estimated, and a choice is made depending on the expected step size and work.

6.3. Implicit Methods

By using different interpolation formulas involving y^{n+1}, implicit integration methods can be derived. Implicit methods result in a nonlinear equation to be solved for y^{n+1}, so that iterative methods must be used. The backward Euler method is a first-order method:

y^{n+1} = y^n + Δt f(y^{n+1})

The trapezoid rule (see Section 2.4) is a second-order method:

y^{n+1} = y^n + (Δt/2) [f(y^n) + f(y^{n+1})]

When the trapezoid rule is used with the finite difference method for solving partial differential equations it is called the Crank – Nicolson method. Adams methods exist as well, and the fourth-order Adams – Moulton method is

y^{n+1} = y^n + (Δt/24) [9 f(y^{n+1}) + 19 f(y^n) − 5 f(y^{n−1}) + f(y^{n−2})]

The properties of these methods are given in Table 6. The implicit methods are stable for any step size but do require the solution of a set of nonlinear equations, which must be solved iteratively. An application to dynamic distillation problems is given in [56]. All these methods can be written in the form

y^{n+1} = Σ_{i=1}^{k} α_i y^{n+1−i} + Δt Σ_{i=0}^{k} β_i f(y^{n+1−i})

or

y^{n+1} = Δt β_0 f(y^{n+1}) + w^n

where w^n represents known information. This equation (or set of equations for more than one differential equation) can be solved by using successive substitution:

y^{n+1, k+1} = Δt β_0 f(y^{n+1, k}) + w^n

Here, the superscript k refers to an iteration counter. The successive substitution method is guaranteed to converge, provided the first derivative of the function is bounded and a small enough time step is chosen. Thus, if it has not converged within a few iterations, Δt can be reduced and the iterations begun again. The Newton – Raphson method (see Section 1.2) can also be used. In many computer codes, iteration is allowed to proceed only a fixed number of times (e.g., three) before Δt is reduced. Because a good history of the function is available from previous time steps, a good initial guess is usually possible.

The best software packages for stiff equations (see Section 6.4) use Gear's backward difference formulas. The formulas of various orders are [57]:

1: y^{n+1} = y^n + Δt f(y^{n+1})
2: y^{n+1} = (4/3) y^n − (1/3) y^{n−1} + (2/3) Δt f(y^{n+1})
3: y^{n+1} = (18/11) y^n − (9/11) y^{n−1} + (2/11) y^{n−2} + (6/11) Δt f(y^{n+1})

6.4. Stiffness

Why is it desirable to use implicit methods that lead to sets of algebraic equations that must be solved iteratively, whereas explicit methods lead to a direct calculation? The reason lies in the stability limits; to understand their impact, the concept of stiffness is necessary. When modeling a physical situation, the time constants governing different phenomena should be examined. Consider flow through a packed bed, as illustrated in Figure 21.

+ 6 ∆tf yn+1 11

4: yn+1 = 48 yn− 36 yn−1+ 16 yn−2− 3 yn−3 25 25 25 25

+ 12 ∆tf yn+1 25

5: yn+1 = 300 yn− 300 yn−1+ 200 yn−2 137 137 137 Figure 21. Flow through packed bed

− 75 yn−3+ 12 yn−4 137 137 The superficial velocity u is given by + 60 ∆tf yn+1 137 Q u = The stability properties of these methods are Aϕ determined in the same way as explicit methods. They are always expected to be stable, no matter where Q is the volumetric flow rate, A is the what the value of ∆t is, and this is confirmed in cross-sectional area, and ϕ is the void fraction. Figure 20. A time constant for flow through the device is then L ϕAL t = = flow u Q where L is the length of the packed bed. If a chemical reaction occurs, with a reaction rate given by

Moles = −kc Volume time where k is the rate constant (time−1) and c is the concentration (moles/volume), the characteris- tic time for the reaction is 1 t = rxn k If diffusion occurs inside the catalyst, the time Figure 20. Rational approximations for implicit methods constant is a) Backward Euler; b) Exact curve; c) Trapezoid; d) Euler εR2 tinternal diffusion = De Predictor – corrector methods can be em- ε ployed in which an explicit method is used to where is the porosity of the catalyst, R is the predict the value of yn+1. This value is then used catalyst radius, and De is the effective diffusion in an implicit method to evaluate f ( yn+1). 56 Mathematics in Chemical Engineering coefficient inside the catalyst. The time constant The idea of stiffness is best explained by con- for heat transfer is sidering a system of linear equations: 2 2 dy R ?sCsR = Ay tinternal heat transfer = = α ke dt λi A where s is the catalyst density, Cs is the cata- Let be the eigenvalues of the matrix . This n lyst heat capacity per unit mass, ke is the effec- system can be converted into a system of equa- tive thermal conductivity of the catalyst, and α tions, each of them having only one unknown; is the thermal diffusivity. The time constants for the eigenvalues of the new system are the same diffusion of mass and heat through a boundary as the eigenvalues of the original system [3, pp. layer surrounding the catalyst are 39 – 42], [54, pp. 197 – 199]. Then the stiffness ratio SR is defined as [53, p. 32] R texternal diffusion = kg maxi |Re (λi)| SR = t = >sCsR mini |Re (λi)| external heat transfer hp SR = 20 is not stiff, SR = 103 is stiff, and SR = where kg and hp are the mass-transfer and heat- 6 10 is very stiff. If the problem is nonlinear, the transfer coefficients, respectively. 
coefficient inside the catalyst. The time constant for heat transfer is

t_internal heat transfer = R²/α = ϱ_s C_s R²/k_e

where ϱ_s is the catalyst density, C_s is the catalyst heat capacity per unit mass, k_e is the effective thermal conductivity of the catalyst, and α is the thermal diffusivity. The time constants for diffusion of mass and heat through a boundary layer surrounding the catalyst are

t_external diffusion = R/k_g
t_external heat transfer = ϱ_s C_s R/h_p

where k_g and h_p are the mass-transfer and heat-transfer coefficients, respectively.

The importance of examining these time constants comes from the realization that their orders of magnitude differ greatly. For example, in the model of an automobile catalytic converter [58] the time constant for internal diffusion was 0.3 s, for internal heat transfer 21 s, and for flow through the device 0.003 s. Flow through the device is so fast that it might as well be instantaneous. Thus, the time derivatives could be dropped from the mass balance equations for the flow, leading to a set of differential-algebraic equations (see below). If the original equations had to be solved, the eigenvalues would be roughly proportional to the inverse of the time constants. The time interval over which to integrate would be a small number (e.g., five) multiplied by the longest time constant. Yet the explicit stability limitation applies to all the eigenvalues, and the largest eigenvalue would determine the largest permissible time step. Here, λ = 1/0.003 s⁻¹. Very small time steps would have to be used, e.g., ∆t ≤ 2 × 0.003 s, but a long integration would be required to reach steady state. Such problems are termed stiff, and implicit methods are very useful for them. In that case the stable time constant is not of any interest, because any time step is stable. What is of interest is the largest step for which a solution can be found. If a time step larger than the smallest time constant is used, then any phenomena represented by that smallest time constant will be overlooked; at least transients in it will be smeared over. However, the method will still be stable. Thus, if the very rapid transients of part of the model are not of interest, they can be ignored and an implicit method used [59].

The idea of stiffness is best explained by considering a system of linear equations:

dy/dt = A y

Let λ_i be the eigenvalues of the matrix A. This system can be converted into a system of n equations, each of them having only one unknown; the eigenvalues of the new system are the same as the eigenvalues of the original system [3, pp. 39 – 42], [54, pp. 197 – 199]. Then the stiffness ratio SR is defined as [53, p. 32]

SR = max_i |Re(λ_i)| / min_i |Re(λ_i)|

SR = 20 is not stiff, SR = 10³ is stiff, and SR = 10⁶ is very stiff. If the problem is nonlinear, the solution is expanded about the current state:

dy_i/dt = f_i[y(t_n)] + Σ_{j=1}^{n} (∂f_i/∂y_j) [y_j − y_j(t_n)]

The question of stiffness then depends on the eigenvalues of the Jacobian at the current time. Consequently, for nonlinear problems the problem can be stiff during one time period and not stiff during another. Packages have been developed for problems such as these. Although the chemical engineer may not actually calculate the eigenvalues, knowing that they determine the stability and accuracy of the numerical scheme, as well as the step size employed, is useful.

6.5. Differential – Algebraic Systems

Sometimes models involve ordinary differential equations subject to some algebraic constraints. For example, the equations governing one equilibrium stage (as in a distillation column) are

M dx^n/dt = V^{n+1} y^{n+1} − L^n x^n − V^n y^n + L^{n−1} x^{n−1}
x^{n−1} − x^n = E^n (x^{n−1} − x^{*,n})
Σ_{i=1}^{N} x_i = 1

where x and y are the mole fractions in the liquid and vapor, respectively; L and V are liquid and vapor flow rates, respectively; M is the holdup; and the superscript n is the stage number. The efficiency is E, and the concentration in equilibrium with the vapor is x*. The first equation is an ordinary differential equation for the mass of one component on the stage, whereas the third equation represents a constraint that the mass fractions add to one.

As a second example, the following kinetics problem can be considered:

dc1/dt = f(c1, c2)
dc2/dt = k1 c1 − k2 c2²

The first equation could be the equation for a stirred tank reactor, for example. Suppose both k1 and k2 are large. The problem is then stiff, but the second equation could be taken at equilibrium. The equilibrium condition is then

c2²/c1 = k1/k2 ≡ K

Under these conditions the problem becomes

dc1/dt = f(c1, c2)
0 = k1 c1 − k2 c2²

Thus, a differential-algebraic system of equations is obtained. In this case, the second equation can be solved and substituted into the first to obtain differential equations, but in the general case that is not possible.

Differential-algebraic equations can be written in the general notation

F(t, y, dy/dt) = 0

or the variables and equations may be separated according to whether they come primarily from differential [y(t)] or algebraic equations [x(t)]:

dy/dt = f(t, y, x), g(t, y, x) = 0

Another form is not strictly a differential-algebraic set of equations, but the same principles apply; this form arises frequently when the Galerkin finite element method is applied:

A dy/dt = f(y)

The computer program DASSL [60, 61] can solve such problems. They can also be solved by writing the differential equation as

dy/dt = A⁻¹ f(y)

When A is independent of y, the inverse (from LU decompositions) need be computed only once. In actuality, higher order backward-difference Gear methods are used in the computer program DASSL [60, 61].

Differential-algebraic systems are more complicated than differential systems because the solution may not always be defined. Pantelides et al. [62] introduced the term "index" to identify possible problems. The index is defined as the minimum number of times the equations must be differentiated with respect to time to convert the system to a set of ordinary differential equations. These higher derivatives may not exist, and the process places limits on which variables can be given initial values. Sometimes the initial values must be constrained by the algebraic equations [62]. For a differential-algebraic system modeling a distillation tower, the index depends on the specification of pressure for the column [62]. Several chemical engineering examples of differential-algebraic systems and a solution for one involving two-phase flow are given in [63].

6.6. Computer Software

Efficient software packages are widely available for solving ordinary differential equations as initial value problems. In each of the packages the user specifies the differential equation to be solved and a desired error criterion. The package then integrates in time and adjusts the step size to achieve the error criterion, within the limitations imposed by stability.

A popular explicit Runge – Kutta package is RKF 45. An estimate of the truncation error at each step is available. Then the step size can be reduced until this estimate is below the user-specified tolerance. The method is thus automatic, and the user is assured of the results. Note, however, that the tolerance is set on the local truncation error, namely, from one step to another, whereas the user is generally interested in the global truncation error, i.e., the error after several steps. The global error is generally made smaller by making the tolerance smaller, but the absolute accuracy is not the same as the tolerance. If the problem is stiff, then very small step sizes are used and the computation becomes very lengthy. The RKF 45 code discovers this and returns control to the user with a message indicating the problem is too hard to solve with RKF 45.

A popular implicit package is LSODE, a version of Gear's method [57] written by Alan Hindmarsh at Lawrence Livermore Laboratory. In this package, the user specifies the differential equation to be solved and the tolerance desired. Now the method is implicit and, therefore, stable for any step size. The accuracy may not be acceptable, however, and sets of nonlinear equations must be solved. Thus, in practice, the step size is limited but not nearly so much as in the Runge – Kutta methods. In these packages both the step size and the order of the method are adjusted by the package itself. Suppose a k-th order method is being used. The truncation error is determined by the (k + 1)-th order derivative. This is estimated by using difference formulas and the values of the right-hand sides at previous times. An estimate is also made for the k-th and (k + 2)-th derivative. Then, the errors in a (k − 1)-th order method, a k-th order method, and a (k + 1)-th order method can be estimated. Furthermore, the step size required to satisfy the tolerance with each of these methods can be determined. Then the method and step size for the next step that achieves the biggest step can be chosen, with appropriate adjustments due to the different work required for each order. The package generally starts with a very small step size and a first-order method (the backward Euler method). Then it integrates along, adjusting the order up (and later down) depending on the error estimates. The user is thus assured that the local truncation error meets the tolerance.

A further difficulty arises because the set of nonlinear equations must be solved. Usually a good guess of the solution is available, because the solution is evolving in time and past history can be extrapolated. Thus, the Newton – Raphson method will usually converge. The package protects itself, though, by only doing a few (i.e., three) iterations. If convergence is not reached within these iterations, the step size is reduced and the calculation is redone for that time step. The convergence theorem for the Newton – Raphson method (Chap. 1) indicates that the method will converge if the step size is small enough. Thus, the method is guaranteed to work. Further economies are possible. The Jacobian needed in the Newton – Raphson method can be fixed over several time steps. Then if the iteration does not converge, the Jacobian can be reevaluated at the current time step. If the iteration still does not converge, then the step size is reduced and a new Jacobian is evaluated. The successive substitution method can also be used, which is even faster, except that it may not converge. However, it too will converge if the time step is small enough.

The Runge – Kutta methods give extremely good accuracy, especially when the step size is kept small for stability reasons. If the problem is stiff, though, backward difference implicit methods must be used. Many chemical reactor problems are stiff, necessitating the use of implicit methods. In the MATLAB suite of ODE solvers, ode45 uses a revision of the RKF45 program, while the ode15s program uses an improved backward difference method. Ref. [64] gives details of the programs in MATLAB. Fortunately, many packages are available. On the NIST web page, http://gams.nist.gov/, choose "problem decision tree", and then "differential and integral equations" to find packages which can be downloaded. On the Netlib web site, http://www.netlib.org/, choose "ode" to find packages which can be downloaded. Using Microsoft Excel to solve ordinary differential equations is cumbersome, except for the simplest problems.

6.7. Stability, Bifurcations, Limit Cycles

In this section, bifurcation theory is discussed in a general way. Some aspects of this subject involve the solution of nonlinear equations; other aspects involve the integration of ordinary differential equations; applications include chaos and fractals as well as unusual operation of some chemical engineering equipment. An excellent introduction to the subject and details needed to apply the methods are given in [65]. For more details of the algorithms described below and a concise survey with some chemical engineering examples, see [66] and [67]. Bifurcation results are closely connected with stability of the steady states, which is essentially a transient phenomenon.

Consider the problem

∂u/∂t = F(u, λ)
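The stability contrast that motivates the implicit methods of Sections 6.3 – 6.6 is easy to demonstrate numerically. The following short script is an illustrative sketch (not part of the original text): it integrates the model problem dy/dt = −λy with λ = 1000 s⁻¹ using the explicit (forward) Euler method and the backward Euler method, with a step size deliberately chosen above the explicit stability limit ∆t ≤ 2/λ.

```python
lam = 1000.0               # fast eigenvalue (1/s); explicit stability limit is dt <= 2/lam
dt = 2.5 / lam             # deliberately larger than the limit
nsteps = 50

# Explicit (forward) Euler: y_{n+1} = (1 - lam*dt) y_n; amplification |1 - 2.5| = 1.5 > 1
y_explicit = 1.0
for _ in range(nsteps):
    y_explicit = (1.0 - lam * dt) * y_explicit

# Backward Euler: y_{n+1} = y_n / (1 + lam*dt); stable for any step size
y_implicit = 1.0
for _ in range(nsteps):
    y_implicit = y_implicit / (1.0 + lam * dt)

print(abs(y_explicit), y_implicit)   # explicit result has grown enormously; implicit has decayed
```

Although the exact solution decays to zero, the explicit iterate grows without bound at this step size, while the backward Euler iterate decays for any ∆t; this is the behavior summarized by the rational approximations in Figure 20.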

Figure 22. Limit points and bifurcation – limit points A) Limit point (or turning point); B) Bifurcation-limit point (or singular turning point or bifurcation point)

The variable u can be a vector, which makes F a vector, too. Here, F represents a set of equations that can be solved for the steady state:

F(u, λ) = 0

If the Newton – Raphson method is applied,

F_u^s δu^s = −F(u^s, λ)
u^{s+1} = u^s + δu^s

is obtained, where

F_u^s = (∂F/∂u)(u^s)

is the Jacobian. Look at some property of the solution, perhaps the value at a certain point or the maximum value or an integral of the solution. This property is plotted versus the parameter λ; typical plots are shown in Figure 22. At the point shown in Figure 22 A, the determinant of the Jacobian is zero:

det F_u = 0

For the limit point,

∂F/∂λ ≠ 0

whereas for the bifurcation-limit point

∂F/∂λ = 0

The stability of the steady solutions is also of interest. Suppose a steady solution u^ss; the function u is written as the sum of the known steady state and a perturbation u':

u = u^ss + u'

This expression is substituted into the original equation and linearized about the steady-state value:

∂u^ss/∂t + ∂u'/∂t = F(u^ss + u', λ) ≈ F(u^ss, λ) + (∂F/∂u)|_ss u' + ···

The result is

∂u'/∂t = F_u^ss u'

A solution of the form

u'(x, t) = e^{σt} X(x)

gives

σ e^{σt} X = F_u^ss e^{σt} X

The exponential term can be factored out and

(F_u^ss − σδ) X = 0

A solution exists for X if and only if

det |F_u^ss − σδ| = 0

The σ are the eigenvalues of the Jacobian. Now clearly if Re(σ) > 0 then u' grows with time, and the steady solution u^ss is said to be unstable to small disturbances. If Im(σ) = 0 it is called stationary instability, and the disturbance would grow monotonically, as indicated in Figure 23 A. If Im(σ) ≠ 0 then the disturbance grows in an oscillatory fashion, as shown in Figure 23 B, and is called oscillatory instability.

Figure 23. Stationary and oscillatory instability
A) Stationary instability; B) Oscillatory instability

The case in which Re(σ) = 0 is the dividing point between stability and instability. If Re(σ) = 0 and Im(σ) = 0 (the point governing the onset of stationary instability), then σ = 0. However, this means that σ = 0 is an eigenvalue of the Jacobian, and the determinant of the Jacobian is zero. Thus, the points at which the determinant of the Jacobian is zero (for limit points and bifurcation-limit points) are the points governing the onset of stationary instability. When Re(σ) = 0 but Im(σ) ≠ 0, which is the onset of oscillatory instability, an even number of eigenvalues pass from the left-hand complex plane to the right-hand complex plane. The eigenvalues are complex conjugates of each other (a result of the original equations being real, with no complex numbers), and this is called a Hopf bifurcation. Numerical methods to study Hopf bifurcation are very computationally intensive and are not discussed here [65].

To return to the problem of solving for the steady-state solution: near the limit point or bifurcation-limit point two solutions exist that are very close to each other. In solving sets of equations with thousands of unknowns, the difficulties in convergence are obvious. For some dependent variables the approximation may be converging to one solution, whereas for another set of dependent variables it may be converging to the other solution; or the two solutions may all be mixed up. Thus, solution is difficult near a bifurcation point, and special methods are required. These methods are discussed in [66].

The first approach is to use natural continuation (also known as Euler – Newton continuation). Suppose a solution exists for some parameter λ. Call the value of the parameter λ0 and the corresponding solution u0. Then

F(u0, λ0) = 0

Also, compute u_λ as the solution to

F_u^ss u_λ = −F_λ

at this point [λ0, u0]. Then predict the starting guess for another λ using

u⁰ = u0 + u_λ (λ − λ0)

and apply Newton – Raphson with this initial guess and the new value of λ. This will be a much better guess of the new solution than just u0 by itself.

Even this method has difficulties, however. Near a limit point the determinant of the Jacobian may be zero and the Newton method may fail. Perhaps no solutions exist at all for the chosen parameter λ near a limit point. Also, the ability to switch from one solution path to another at a bifurcation-limit point is necessary. Thus, other methods are needed as well: arc-length continuation and pseudo-arc-length continuation [66]. These are described in Chapter 1.

6.8. Sensitivity Analysis

Often, when solving differential equations, the solution as well as the sensitivity of the solution to the value of a parameter must be known. Such information is useful in doing parameter estimation (to find the best set of parameters for a model) and in deciding whether a parameter needs to be measured accurately. The differential equation for y(t, α), where α is a parameter, is

dy/dt = f(y, α), y(0) = y0

If this equation is differentiated with respect to α, then because y is a function of t and α,

∂/∂α (dy/dt) = (∂f/∂y)(∂y/∂α) + ∂f/∂α

Exchanging the order of differentiation in the first term leads to the ordinary differential equation

d/dt (∂y/∂α) = (∂f/∂y)(∂y/∂α) + ∂f/∂α

The initial conditions on ∂y/∂α are obtained by differentiating the initial conditions:

∂/∂α [y(0, α) = y0], or (∂y/∂α)(0) = 0

Next, let

y1 = y, y2 = ∂y/∂α

and solve the set of ordinary differential equations

dy1/dt = f(y1, α), y1(0) = y0
dy2/dt = (∂f/∂y)(y1, α) y2 + ∂f/∂α, y2(0) = 0

Thus, the solution y(t, α) and the derivative with respect to α are obtained. To project the impact of α, the solution for α = α1 can be used:

y(t, α) = y(t, α1) + (∂y/∂α)(t, α1)(α − α1) + ··· = y1(t, α1) + y2(t, α1)(α − α1) + ···

This is a convenient way to determine the sensitivity of the solution to parameters in the problem.

6.9. Molecular Dynamics
(→ Molecular Dynamics Simulations)

Special integration methods have been developed for molecular dynamics calculations due to the structure of the equations. A very large number of equations are to be integrated, with the following form based on molecular interactions between molecules:

m_i d²r_i/dt² = F_i({r}), F_i({r}) = −∇V

where m_i is the mass of the i-th particle, r_i is the position of the i-th particle, F_i is the force acting on the i-th particle, and V is the potential energy that depends upon the location of all the particles (but not their velocities). Since the major part of the calculation is in the evaluation of the forces, or potentials, a method must be used that minimizes the number of times the forces are calculated to move from one time to another time. Rewrite this equation in the form of an acceleration:

d²r_i/dt² = (1/m_i) F_i({r}) ≡ a_i

In the Verlet method, this equation is written using central finite differences (Eq. 12). Note that the accelerations do not depend upon the velocities.

r_i(t + ∆t) = 2 r_i(t) − r_i(t − ∆t) + a_i(t) ∆t²

The calculations are straightforward, and no explicit velocity is needed. The storage requirement is modest, and the precision is modest (it is a second-order method). Note that one must start the calculation with values of {r} at times t and t − ∆t.

In the velocity Verlet method, an equation is written for the velocity, too:

dv_i/dt = a_i

The trapezoid rule (see page 18) is applied to obtain

v_i(t + ∆t) = v_i(t) + (1/2) [a_i(t) + a_i(t + ∆t)] ∆t

The position of the particles is expanded in a Taylor series:

r_i(t + ∆t) = r_i(t) + v_i ∆t + (1/2) a_i(t) ∆t²

Beginning with values of {r} and {v} at time zero, one calculates the new positions and then the new velocities. This method is second order in ∆t, too. For additional details, see [68 – 72].

7. Ordinary Differential Equations as Boundary Value Problems

Diffusion problems in one dimension lead to boundary value problems. The boundary conditions are applied at two different spatial locations: at one side the concentration may be fixed and at the other side the flux may be fixed. Because the conditions are specified at two different locations the problems are not initial value in character.
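The velocity Verlet scheme of Section 6.9 can be sketched in a few lines. The following illustration is not from the original text: it applies the method to a single harmonic oscillator with F = −x and m = 1, a hypothetical test case for which the exact solution is x = cos t and the total energy should be nearly conserved.

```python
dt = 0.01
nsteps = 1000                  # integrate to t = 10
x, v = 1.0, 0.0                # initial position and velocity
a = -x                         # a = F/m = -x for the harmonic oscillator (m = 1, k = 1)

for _ in range(nsteps):
    x = x + v * dt + 0.5 * a * dt * dt     # Taylor-series position update
    a_new = -x                             # one force evaluation per step
    v = v + 0.5 * (a + a_new) * dt         # trapezoid-rule velocity update
    a = a_new

energy = 0.5 * v * v + 0.5 * x * x         # should remain close to the initial value 0.5
print(x, energy)
```

Consistent with the second-order character of the method, both the position error relative to cos t and the drift in the energy remain of order ∆t² over the whole integration.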
To begin at one position and inte- major part of the calculation is in the evalua- grate directly is impossible because at least one tion of the forces, or potentials, a method must of the conditions is specified somewhere else and be used that minimizes the number of times the not enough conditions are available to begin the 62 Mathematics in Chemical Engineering calculation. Thus, methods have been developed where v is the velocity and η the viscosity. Then especially for boundary value problems. Exam- the variables can be separated again and the re- ples include heat and mass transfer in a slab, sult integrated to give reaction – diffusion problems in a porous cata- ∆p r2 −ηv = − +c lnr+c lyst, reactor with axial dispersion, packed beds, L 4 1 2 and countercurrent heat transfer. Now the two unknowns must be specified from the boundary conditions. This problem is a two- point boundary value problem because one of 7.1. Solution by Quadrature the conditions is usually specified at r = 0 and the other at r = R, the tube radius. However, the When only one equation exists, even if it is non- technique of separating variables and integrating linear, it may possibly be solved by quadrature. works quite well. For dy = f (y) dt

y (0) = y0 the problem can be separated dy =dt f (y) and integrated Figure 24. Flow in pipe y t dy = dt = t f (y) When the fluid is non-Newtonian, it may not y0 0 be possible to do the second step analytically. For example, for the Bird – Carreau fluid [74, p. If the quadrature can be performed analytically, 171], stress and velocity are related by the exact solution has been found. η0 As an example, consider the flow of a non- τ =    2 (1−n)/2 Newtonian fluid in a pipe, as illustrated in Figure 1+λ dv dr 24. The governing differential equation is [73] 1 d ∆p where η0 is the viscosity at v = 0 and λ the time (rτ)=− r dr L constant. Putting this value into the equation for stress where r is the radial position from the center of as a function of r gives the pipe, τ is the shear stress, ∆ p is the pres- η0 ∆p r c1 sure drop along the pipe, and L is the length over   = − +  2 (1−n)/2 L 2 r which the pressure drop occurs. The variables 1+λ dv are separated once dr ∆p This equation cannot be solved analytically for d(rτ)=− rdr L dv/dr, except for special values of n. For prob- lems such as this, numerical methods must be and then integrated to give used. ∆p r2 rτ = − +c L 2 1 Proceeding further requires choosing a consti- 7.2. Initial Value Methods tutive relation relating the shear stress and the velocity gradient as well as a condition specify- An initial value method is one that utilizes the ing the constant. For a Newtonian fluid techniques for initial value problems but al- lows for an iterative calculation to satisfy all dv τ = −η the boundary conditions. Suppose the nonlinear dr boundary value problem Mathematics in Chemical Engineering 63   d2y dy = f x,y, Packages to solve boundary value problems 2 dx dx are available on the internet. 
On the NIST web with the boundary conditions page, http://gams.nist.gov/ choose “problem de- cision tree”, and then “differential and integral a y (0) −a dy (0) = α, a ≥0 0 1 dx i equations”, then “ordinary differential equa- dy b0 y (1) −b1 (1) = β, bi≥0 tions”, “multipoint boundary value problems”. dx On the Netlib web site, http://www.netlib.org/, Convert this second-order equation into two search on “boundary value problem”. Any first-order equations along with the boundary spreadsheet that has an iteration capability can conditions written to include a parameter s. be used with the finite difference method. Some du = v packages for partial differential equations also dx have a capability for solving one-dimensional dv = f (x,u,v) boundary value problems [e.g., Comsol Multi- dx physics (formerly FEMLAB)]. u (0) = a1s−c1α

v (0) = a0s−c0α 7.3. Finite Difference Method The parameters c0 and c1 are specified by the analyst such that To apply the finite difference method, we first spread grid points through the domain. Figure a1c0−a0c1 =1 25 shows a uniform mesh of n points (nonuni- form meshes are also possible). The unknown, This ensures that the first boundary condition here c (x), at a grid point xi is assigned the sym- is satisfied for any value of parameter s. If the bol ci = c (xi ). The finite difference method can proper value for s is known, u (0) and u (0) can be derived easily by using a Taylor expansion of be evaluated and the equation integrated as an the solution about this point. initial value problem. The parameter s should   dc  d2c  ∆x2 ci+1 = ci+  ∆x+ 2  +... be chosen iteratively so that the last boundary dx i dx i 2 condition is satisfied.   (8) dc  d2c  ∆x2 ci−1 = ci−  ∆x+ 2  −... The model for a chemical reactor with axial dx i dx i 2 diffusion is These formulas can be rearranged and divided 2 1 dc − dc = DaR (c) by ∆x to give Pe dz2 dz   2 dc  ci+1−ci d c  ∆x 1 dc dc  = −  +... − (0) +c (0) = cin, (1)=0  2  (9) Pe dz dz dx i ∆x dx i 2   dc  c −c d2c  ∆x where Pe is the Peclet´ number and Da the  = i i−1 −  +...  2  (10) Damkohler¨ number. dx i ∆x dx i 2 The boundary conditions are due to Danckw- which are representations of the first deriva- erts [75] and to Wehner and Wilhelm [76]. This tive. Alternatively the two equations can be sub- problem can be treated by using initial value tracted from each other, rearranged and divided methods also, but the method is highly sensi- by ∆x to give tive to the choice of the parameter s, as out-   2 2 dc  ci+1−ci−1 d c  ∆x lined above. Starting at z = 0 and making small  = −  (11) dx  2∆x dx3  3! changes in s will cause large changes in the solu- i i tion at the exit, and the boundary condition at the If the terms multiplied by ∆x or ∆x2 are ne- exit may be impossible to satisfy. 
By starting at glected, three representations of the first deriva- z = 1, however, and integrating backwards, the tive are possible. In comparison with the Taylor process works and an iterative scheme converges series, the truncation error in the first two expres- in many cases [77]. However, if the problem is sions is proportional to ∆x, and the methods are extremely nonlinear the iterations may not con- said to be first order. The truncation error in the verge. In such cases, the methods for boundary last expression is proportional to ∆x2, and the value problems described below must be used. method is said to be second order. Usually, the 64 Mathematics in Chemical Engineering last equation is chosen to ensure the best accu- Four times Equation 8 minus this equation, with racy. rearrangement, gives   dc  −3ci+4ci+1−ci+2 2  = +O ∆x dx i 2∆x Thus, for the first derivative at point i =1   dc  −3ci+4c2−c3  = (15) Figure 25. Finite difference mesh; ∆x uniform dx i 2∆x This one-sided difference expression uses only the points already introduced into the domain. The finite difference representation of the The third alternative is to add a false point, second derivative can be obtained by adding the outside the domain, as c0= c (x = − ∆x). Then two expressions in Equation 8. Rearrangement the centered first derivative, Equation 11, can be and division by ∆x2 give   used: 2 4 2  d c  ci+1−2ci+ci−1 d c  ∆x  = −  +... dc  c2−c0 2  2 4  (12)  = dx i ∆x dx i 4!  dx 1 2∆x 2 The truncation error is proportional to ∆x . Because this equation introduces a new variable, To see how to solve a differential equation, another equation is required. This is obtained by consider the equation for convection, diffusion, also writing the differential equation (Eq. 13), and reaction in a tubular reactor: for i =1. 1 d2c dc The same approach can be taken at the other − = Da R (c) Pe dx2 dx end. 
As a boundary condition, any of three choices can be used: To evaluate the differential equation at the i-th   c −c dc  = n n−1 grid point, the finite difference representations dx n ∆x  of the first and second derivatives can be used to  c −4c +3c dc  = n−2 n−1 n give dx n 2∆x   c −c 1 ci+1−2ci+ci−1 ci+1−ci−1 dc = n+1 n−1 − = Da R (13) dx  2∆x Pe ∆x2 2∆x n 2 This equation is written for i =2ton − 1 (i.e., The last two are of order ∆x and the last one the internal points). The equations would then would require writing the differential equation be coupled but would involve the values of c1 (Eq. 13) for i = n, too. and cn , as well. These are determined from the Generally, the first-order expression for the boundary conditions. boundary condition is not used because the er- If the boundary condition involves a deriva- ror in the solution would decrease only as ∆x, tive, the finite difference representation of it and the higher truncation error of the differential 2 must be carefully selected; here, three possibili- equation (∆x ) would be lost. For this problem ties can be written. Consider a derivative needed the boundary conditions are at the point i = 1. First, Equation 9 could be used 1 dc − (0) +c (0) = cin to write Pe dx  dc  (1) = 0 dc  c2−c1 dx  = (14) dx 1 ∆x Thus, the three formulations would give first or- ∆ Then a second-order expression is obtained that der in x − 1 c2−c1 +c = c is one-sided. The Taylor series for the point ci+2 Pe ∆x 1 in is written:   cn−cn−1 =0 dc  d2c  4∆x2 ∆x ci+2 = ci+  2∆x+ 2  dx i dx i 2!  plus Equation 13 at points i = 2 through n − 1; 3  3 + d c  8∆x +... second order in ∆x, by using a three-point one- dx3 3! i sided derivative Mathematics in Chemical Engineering 65

−(1/Pe) (−3c_1 + 4c_2 − c_3)/(2∆x) + c_1 = c_in
(c_{n−2} − 4c_{n−1} + 3c_n)/(2∆x) = 0

plus Equation 13 at points i = 2 through n − 1; and at second order in ∆x, by using a false boundary point,

−(1/Pe) (c_2 − c_0)/(2∆x) + c_1 = c_in
(c_{n+1} − c_{n−1})/(2∆x) = 0

plus Equation 13 at points i = 1 through n. The sets of equations can be solved by using the Newton – Raphson method, as outlined in Section 1.2.

Frequently, the transport coefficients (e.g., diffusion coefficient D or thermal conductivity) depend on the dependent variable (concentration or temperature, respectively). Then the differential equation might look like

d/dx (D(c) dc/dx) = 0

This could be written as

−dJ/dx = 0   (16)

in terms of the mass flux J = −D(c) dc/dx. If the transport coefficient is evaluated with the solution from the previous iterate, the finite difference equations remain linear in the new values and can be solved repeatedly until convergence. The advantage of this approach is that it is easier to program than a full Newton – Raphson method. If the transport coefficients do not vary radically, the method converges. If the method does not converge, use of the full Newton – Raphson method may be necessary.

Three ways are commonly used to evaluate the transport coefficient at the midpoint. The first one employs the transport coefficient evaluated at the average value of the solutions on either side:

D(c_{i+1/2}) ≈ D(½ (c_{i+1} + c_i))

The second approach uses the average of the transport coefficients on either side:

D(c_{i+1/2}) ≈ ½ [D(c_{i+1}) + D(c_i)]   (17)

The truncation error of these approaches is also ∆x² [78, Chap. 14], [3, p. 215]. The third approach employs an "upstream" transport coefficient:

D(c_{i+1/2}) ≈ D(c_{i+1}), when D(c_{i+1}) > D(c_i)
D(c_{i+1/2}) ≈ D(c_i), when D(c_{i+1}) < D(c_i)
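A minimal sketch of this iterative approach, assuming for illustration D(c) = 1 + c with Dirichlet boundary conditions c(0) = 0, c(1) = 1 (Python/NumPy; all names are illustrative, not from the original). The first midpoint choice above, the coefficient at the averaged solution, is used, and the coefficients are updated by successive substitution:

```python
import numpy as np

def nonlinear_diffusion(n=41, iters=50):
    """Solve d/dx( D(c) dc/dx ) = 0 with c(0)=0, c(1)=1 and D(c)=1+c
    by flux-conservative differences.  The transport coefficient is
    evaluated at the average solution value at each midpoint i+1/2,
    using the previous iterate (successive substitution)."""
    x = np.linspace(0.0, 1.0, n)
    c = x.copy()                              # initial guess
    for _ in range(iters):
        Dm = 1.0 + 0.5 * (c[1:] + c[:-1])     # D at midpoints i+1/2
        A = np.zeros((n, n)); b = np.zeros(n)
        A[0, 0] = 1.0; b[0] = 0.0             # c(0) = 0
        A[-1, -1] = 1.0; b[-1] = 1.0          # c(1) = 1
        for i in range(1, n - 1):
            # (J_{i+1/2} - J_{i-1/2}) = 0 with J = -D dc/dx
            A[i, i - 1] = Dm[i - 1]
            A[i, i] = -(Dm[i - 1] + Dm[i])
            A[i, i + 1] = Dm[i]
        c = np.linalg.solve(A, b)
    return x, c
```

For this D(c) the exact solution is available, c(x) = sqrt(1 + 3x) − 1, since the flux (1 + c) dc/dx must be constant; swapping in the averaged-coefficient or upstream forms changes only the `Dm` line.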

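Putting the pieces of this section together, the discretized reactor equations can be assembled into a single linear system when the rate is linear, R(c) = c. The sketch below (Python/NumPy; the function name and parameter values are illustrative, not from the original) uses Equation 13 at the interior points, the three-point one-sided inlet condition, and the matching one-sided outlet condition:

```python
import numpy as np

def tubular_reactor_fd(n=101, Pe=10.0, Da=2.0, cin=1.0):
    """Finite differences for (1/Pe) c'' - c' = Da*c on [0, 1] with
    Danckwerts conditions -(1/Pe) c'(0) + c(0) = cin and c'(1) = 0,
    using second-order one-sided expressions at both boundaries."""
    x = np.linspace(0.0, 1.0, n)
    dx = x[1] - x[0]
    A = np.zeros((n, n)); b = np.zeros(n)
    # interior points: Eq. (13) with the linear rate R(c) = c
    for i in range(1, n - 1):
        A[i, i - 1] = 1.0 / (Pe * dx**2) + 1.0 / (2 * dx)
        A[i, i] = -2.0 / (Pe * dx**2) - Da
        A[i, i + 1] = 1.0 / (Pe * dx**2) - 1.0 / (2 * dx)
    # inlet: -(1/Pe)(-3c1 + 4c2 - c3)/(2 dx) + c1 = cin
    A[0, 0] = 3.0 / (2 * dx * Pe) + 1.0
    A[0, 1] = -4.0 / (2 * dx * Pe)
    A[0, 2] = 1.0 / (2 * dx * Pe)
    b[0] = cin
    # outlet: (c_{n-2} - 4c_{n-1} + 3c_n)/(2 dx) = 0
    A[-1, -3] = 1.0; A[-1, -2] = -4.0; A[-1, -1] = 3.0
    return x, np.linalg.solve(A, b)
```

For a nonlinear R(c) the same assembly would sit inside the Newton – Raphson iteration outlined in Section 1.2.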
Suppose the differential equation is

N[y] = 0

Then the expansion is put into the differential equation to form the residual:

Residual = N[Σ_{i=1}^{N+2} a_i y_i(x)]

In the collocation method, the residual is set to zero at a set of points called collocation points:

N[Σ_{i=1}^{N+2} a_i y_i(x_j)] = 0,  j = 2, ..., N+1

This provides N equations; two more equations come from the boundary conditions, giving N + 2 equations for N + 2 unknowns. This procedure is especially useful when the expansion is in a series of orthogonal polynomials, and when the collocation points are the roots to an orthogonal polynomial, as first used by Lanczos [80, 81]. A major improvement was the proposal by Villadsen and Stewart [82] that the entire solution process be done in terms of the solution at the collocation points rather than the coefficients in the expansion. Thus, Equation 18 would be evaluated at the collocation points

y(x_j) = Σ_{i=1}^{N+2} a_i y_i(x_j),  j = 1, ..., N+2

and solved for the coefficients in terms of the solution at the collocation points:

a_i = Σ_{j=1}^{N+2} [y_i(x_j)]^{−1} y(x_j),  i = 1, ..., N+2

Furthermore, if Equation 18 is differentiated once and evaluated at all collocation points, the first derivative can be written in terms of the values at the collocation points:

dy/dx(x_j) = Σ_{i=1}^{N+2} a_i dy_i/dx(x_j),  j = 1, ..., N+2

This can be expressed as

dy/dx(x_j) = Σ_{i,k=1}^{N+2} [y_i(x_k)]^{−1} y(x_k) dy_i/dx(x_j),  j = 1, ..., N+2

or shortened to

dy/dx(x_j) = Σ_{k=1}^{N+2} A_{jk} y(x_k),  A_{jk} = Σ_{i=1}^{N+2} [y_i(x_k)]^{−1} dy_i/dx(x_j)

Similar steps can be applied to the second derivative to obtain

d²y/dx²(x_j) = Σ_{k=1}^{N+2} B_{jk} y(x_k),  B_{jk} = Σ_{i=1}^{N+2} [y_i(x_k)]^{−1} d²y_i/dx²(x_j)

This method is next applied to the differential equation for reaction in a tubular reactor, after the equation has been made nondimensional so that the dimensionless length is 1.0:

(1/Pe) d²c/dx² − dc/dx = Da R(c),
−dc/dx(0) = Pe[c(0) − c_in],  dc/dx(1) = 0   (19)

The differential equation at the collocation points is

(1/Pe) Σ_{k=1}^{N+2} B_{jk} c(x_k) − Σ_{k=1}^{N+2} A_{jk} c(x_k) = Da R(c_j)   (20)

and the two boundary conditions are

−Σ_{k=1}^{N+2} A_{1k} c(x_k) = Pe(c_1 − c_in),  Σ_{k=1}^{N+2} A_{N+2,k} c(x_k) = 0   (21)

Note that 1 is the first collocation point (x = 0) and N + 2 is the last one (x = 1). To apply the method, the matrices A_ij and B_ij must be found and the set of algebraic equations solved, perhaps with the Newton – Raphson method. If orthogonal polynomials are used and the collocation points are the roots to one of the orthogonal polynomials, the orthogonal collocation method results.

In the orthogonal collocation method the solution is expanded in a series involving orthogonal polynomials, where the polynomials P_{i−1}(x) are defined in Section 2.2:

y = a + bx + x(1−x) Σ_{i=1}^{N} a_i P_{i−1}(x)   (22)

which is also

y = Σ_{i=1}^{N+2} b_i P_{i−1}(x)

or

y = Σ_{i=1}^{N+2} d_i x^{i−1}

The collocation points are shown in Figure 26. There are N interior points plus one at each end, and the domain is always transformed to lie on 0 to 1. To define the matrices A_ij and B_ij this expression is evaluated at the collocation points; it is also differentiated and the result is evaluated at the collocation points:

y(x_j) = Σ_{i=1}^{N+2} d_i x_j^{i−1}
dy/dx(x_j) = Σ_{i=1}^{N+2} d_i (i−1) x_j^{i−2}
d²y/dx²(x_j) = Σ_{i=1}^{N+2} d_i (i−1)(i−2) x_j^{i−3}

Figure 26. Orthogonal collocation points

These formulas are put in matrix notation, where Q, C, and D are (N+2) × (N+2) matrices:

y = Qd,  dy/dx = Cd,  d²y/dx² = Dd
Q_ji = x_j^{i−1},  C_ji = (i−1) x_j^{i−2},  D_ji = (i−1)(i−2) x_j^{i−3}

In solving the first equation for d, the first and second derivatives can be written as

d = Q⁻¹y,  dy/dx = CQ⁻¹y = Ay,  d²y/dx² = DQ⁻¹y = By   (23)

Thus the derivative at any collocation point can be determined in terms of the solution at the collocation points. The same property is enjoyed by the finite difference method (and the finite element method described below), and this property accounts for some of the popularity of the orthogonal collocation method. In applying the method to Equation 19, the same result is obtained: Equations 20 and 21, with the matrices defined in Equation 23. To find the solution at a point that is not a collocation point, Equation 22 is used; once the solution is known at all collocation points, d can be found; and once d is known, the solution for any x can be found.

To use the orthogonal collocation method, the matrices are required. They can be calculated as shown above for small N (N < 8) and by using more rigorous techniques for higher N (see Chap. 2). However, having the matrices listed explicitly for N = 1 and 2 is useful; this is shown in Table 7.

Table 7. Matrices for orthogonal collocation

For some reaction diffusion problems, the solution can be an even function of x. For example, for the problem

d²c/dx² = kc,  dc/dx(0) = 0,  c(1) = 1   (24)

the solution can be proved to involve only even powers of x. In such cases, an orthogonal collocation method that takes this feature into account is convenient. This can easily be done by using expansions that only involve even powers of x. Thus, the expansion

y(x²) = y(1) + (1 − x²) Σ_{i=1}^{N} a_i P_{i−1}(x²)

is equivalent to

y(x²) = Σ_{i=1}^{N+1} b_i P_{i−1}(x²) = Σ_{i=1}^{N+1} d_i x^{2i−2}

The polynomials are defined to be orthogonal with the weighting function W(x²):

∫₀¹ W(x²) P_k(x²) P_m(x²) x^{a−1} dx = 0,  k ≤ m − 1   (25)

where the power on x^{a−1} defines the geometry as planar or Cartesian (a = 1), cylindrical (a = 2), and spherical (a = 3). An analogous development is used to obtain the (N+1) × (N+1) matrices:

y(x_j) = Σ_{i=1}^{N+1} d_i x_j^{2i−2}
dy/dx(x_j) = Σ_{i=1}^{N+1} d_i (2i−2) x_j^{2i−3}
∇²y(x_j) = Σ_{i=1}^{N+1} d_i ∇²(x^{2i−2})|_{x_j}

y = Qd,  dy/dx = Cd,  ∇²y = Dd
Q_ji = x_j^{2i−2},  C_ji = (2i−2) x_j^{2i−3},  D_ji = ∇²(x^{2i−2})|_{x_j}
d = Q⁻¹y,  dy/dx = CQ⁻¹y = Ay,  ∇²y = DQ⁻¹y = By

In addition, the quadrature formula is

WQ = f,  W = fQ⁻¹

where

∫₀¹ x^{2i−2} x^{a−1} dx = Σ_{j=1}^{N+1} W_j x_j^{2i−2} = 1/(2i−2+a) ≡ f_i

As an example, for the problem

(1/x^{a−1}) d/dx (x^{a−1} dc/dx) = φ² R(c)
dc/dx(0) = 0,  c(1) = 1

orthogonal collocation is applied at the interior points

Σ_{i=1}^{N+1} B_ji c_i = φ² R(c_j),  j = 1, ..., N

and the boundary condition solved for is

c_{N+1} = 1

The boundary condition at x = 0 is satisfied automatically by the trial function. After the solution has been obtained, the effectiveness factor η is obtained by calculating

η ≡ ∫₀¹ R[c(x)] x^{a−1} dx / ∫₀¹ R[c(1)] x^{a−1} dx = Σ_{j=1}^{N+1} W_j R(c_j) / Σ_{j=1}^{N+1} W_j R(1)

Note that the effectiveness factor is the average reaction rate divided by the reaction rate evaluated at the external conditions. Error bounds have been given for linear problems [83, p. 356]. For planar geometry the error is

Error in η = φ^{2(2N+1)} / [(2N+1)! (2N+2)!]

This method is very accurate for small N (and small φ²); note that for finite difference methods the error goes as 1/N², which does not decrease as rapidly with N. If the solution is desired at the center (a frequent situation because the center concentration can be the most extreme one), it is given by

c(0) = d_1 = Σ_{i=1}^{N+1} [Q⁻¹]_{1i} y_i

The collocation points are listed in Table 8. For small N the results are usually more accurate when the weighting function in Equation 25 is 1 − x². The matrices for N = 1 and N = 2 are given in Table 9 for the three geometries. Computer programs to generate matrices and a program to solve reaction diffusion problems, OCRXN, are available [3, p. 325, p. 331].

Orthogonal collocation can be applied to distillation problems. Stewart et al. [84, 85] developed a method using Hahn polynomials that retains the discrete nature of a plate-to-plate distillation column. Other work treats problems with multiple liquid phases [86]. Some of the applications to chemical engineering can be found in [87 – 90].

Table 8. Collocation points for orthogonal collocation with symmetric polynomials and W = 1

N   Planar         Cylindrical    Spherical
1   0.5773502692   0.7071067812   0.7745966692
2   0.3399810436   0.4597008434   0.5384693101
    0.8611363116   0.8880738340   0.9061793459
3   0.2386191861   0.3357106870   0.4058451514
    0.6612093865   0.7071067812   0.7415311856
    0.9324695142   0.9419651451   0.9491079123
4   0.1834346425   0.2634992300   0.3242534234
    0.5255324099   0.5744645143   0.6133714327
    0.7966664774   0.8185294874   0.8360311073
    0.9602898565   0.9646596062   0.9681602395
5   0.1488743390   0.2165873427   0.2695431560
    0.4333953941   0.4803804169   0.5190961292
    0.6794095683   0.7071067812   0.7301520056
    0.8650633667   0.8770602346   0.8870625998
    0.9739065285   0.9762632447   0.9782286581
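For the planar geometry (a = 1, W = 1) the points in Table 8 are the positive roots of the Legendre polynomial P_2N, so the whole procedure can be sketched compactly (Python/NumPy; a hypothetical helper, not the OCRXN program, and shown only for the linear rate R(c) = c, for which η = tanh(φ)/φ exactly):

```python
import numpy as np

def symmetric_collocation_eta(N=3, phi=1.0, a=1):
    """Orthogonal collocation with even polynomials (planar, W=1):
    solve c'' = phi^2 c, c'(0)=0, c(1)=1 and return the effectiveness
    factor eta = sum(W_j c_j) / sum(W_j)."""
    r, _ = np.polynomial.legendre.leggauss(2 * N)
    x = np.append(np.sort(r[r > 0]), 1.0)      # N interior points + x=1
    i = np.arange(1, N + 2)                    # trial functions x^(2i-2)
    Q = x[:, None] ** (2 * i - 2)
    # D_ji = d^2/dx^2 of x^(2i-2); the i=1 (constant) column is zero
    D = (2 * i - 2) * (2 * i - 3) * x[:, None] ** np.maximum(2 * i - 4, 0)
    B = D @ np.linalg.inv(Q)                   # second-derivative matrix
    f = 1.0 / (2 * i - 2 + a)
    W = f @ np.linalg.inv(Q)                   # quadrature weights
    # collocation at interior points with c_{N+1} = 1
    c_int = np.linalg.solve(B[:N, :N] - phi**2 * np.eye(N), -B[:N, N])
    c = np.append(c_int, 1.0)
    return (W @ c) / W.sum()
```

For N = 3 and φ = 1 the computed η agrees with tanh(1) to well below the error bound quoted above.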

7.5. Orthogonal Collocation on Finite Elements

In the method of orthogonal collocation on finite elements, the domain is first divided into elements, and then within each element orthogonal collocation is applied. Figure 27 shows the domain being divided into NE elements, with NCOL interior collocation points within each element, and NP = NCOL + 2 total points per element, giving NT = NE*(NCOL + 1) + 1 total number of points. Within each element a local coordinate is defined

u = (x − x_(k)) / ∆x_k,  ∆x_k = x_(k+1) − x_(k)

The reaction – diffusion equation is written as

(1/x^{a−1}) d/dx (x^{a−1} dc/dx) = d²c/dx² + ((a−1)/x) dc/dx = φ² R(c)

and transformed to give

(1/∆x_k²) d²c/du² + ((a−1)/(x_(k) + u∆x_k)) (1/∆x_k) dc/du = φ² R(c)

The boundary conditions are typically

dc/dx(0) = 0,  −dc/dx(1) = Bi_m [c(1) − c_B]

where Bi_m is the Biot number for mass transfer. These become

(1/∆x_1) dc/du (u = 0) = 0

in the first element;

−(1/∆x_NE) dc/du (u = 1) = Bi_m [c(u = 1) − c_B]

in the last element. The orthogonal collocation method is applied at each interior collocation point:

(1/∆x_k²) Σ_{J=1}^{NP} B_IJ c_J + ((a−1)/(x_(k) + u_I ∆x_k)) (1/∆x_k) Σ_{J=1}^{NP} A_IJ c_J = φ² R(c_I),  I = 2, ..., NP − 1

The local points I = 2, ..., NP − 1 represent the interior collocation points. Continuity of the function and the first derivative between elements is achieved by taking

(1/∆x_{k−1}) Σ_{J=1}^{NP} A_{NP,J} c_J |_{element k−1} = (1/∆x_k) Σ_{J=1}^{NP} A_{1,J} c_J |_{element k}

at the points between elements. Naturally, the computer code has only one symbol for the solution at a point shared between elements, but the derivative condition must be imposed. Finally, the boundary conditions at x = 0 and x = 1 are applied:

(1/∆x_1) Σ_{J=1}^{NP} A_{1,J} c_J = 0

in the first element;

−(1/∆x_NE) Σ_{J=1}^{NP} A_{NP,J} c_J = Bi_m [c_NP − c_B]

in the last element. These equations can be assembled into an overall matrix problem

AA c = f

Table 9. Matrices for orthogonal collocation with symmetric polynomials and W = 1 − x²

The form of these equations is special and is discussed by Finlayson [3, p. 116], who also gives the computer code to solve linear equations arising in such problems. Reaction – diffusion problems are solved by the program OCFERXN [3, p. 337]. See also the program COLSYS described below.

The error bounds of DeBoor [91] give the following results for second-order problems solved with cubic trial functions on finite elements with

Figure 27. Grid for orthogonal collocation on finite elements

continuous first derivatives. The error at all positions is bounded by

‖ d^i/dx^i (y − y_exact) ‖_∞ ≤ constant |∆x|²

The error at the collocation points is more accurate, giving what is known as superconvergence:

‖ d^i/dx^i (y − y_exact) ‖_collocation points ≤ constant |∆x|⁴

7.6. Galerkin Finite Element Method

In the finite element method the domain is divided into elements and an expansion is made for the solution on each finite element. In the Galerkin finite element method an additional idea is introduced: the Galerkin method is used to solve the equation. The Galerkin method is explained before the finite element basis set is introduced.

To solve the problem

(1/x^{a−1}) d/dx (x^{a−1} dc/dx) = φ² R(c)
dc/dx(0) = 0,  −dc/dx(1) = Bi_m [c(1) − c_B]

the unknown solution is expanded in a series of known functions {b_i(x)}, with unknown coefficients {a_i}:

c(x) = Σ_{i=1}^{NT} a_i b_i(x)

The series (the trial solution) is inserted into the differential equation to obtain the residual:

Residual = Σ_{i=1}^{NT} a_i (1/x^{a−1}) d/dx (x^{a−1} db_i/dx) − φ² R(Σ_{i=1}^{NT} a_i b_i(x))

The residual is then made orthogonal to the set of basis functions:

∫₀¹ b_j(x) [ Σ_{i=1}^{NT} a_i (1/x^{a−1}) d/dx (x^{a−1} db_i/dx) − φ² R(Σ_{i=1}^{NT} a_i b_i(x)) ] x^{a−1} dx = 0,  j = 1, ..., NT   (26)

This process makes the method a Galerkin method. The basis for the orthogonality condition is that a function that is made orthogonal to each member of a complete set is then zero. The residual is being made orthogonal, and if the basis functions are complete, and an infinite number of them are used, then the residual is zero. Once the residual is zero the problem is solved.

It is necessary also to allow for the boundary conditions. This is done by integrating the first term of Equation 26 by parts and then inserting the boundary conditions:

∫₀¹ b_j(x) (1/x^{a−1}) d/dx (x^{a−1} db_i/dx) x^{a−1} dx
= [ b_j(x) x^{a−1} db_i/dx ]₀¹ − ∫₀¹ (db_j/dx)(db_i/dx) x^{a−1} dx
= −∫₀¹ (db_j/dx)(db_i/dx) x^{a−1} dx − Bi_m b_j(1) [b_i(1) − c_B]   (27)

Combining this with Equation 26 gives

−Σ_{i=1}^{NT} a_i ∫₀¹ (db_j/dx)(db_i/dx) x^{a−1} dx − Bi_m b_j(1) [ Σ_{i=1}^{NT} a_i b_i(1) − c_B ]
= φ² ∫₀¹ b_j(x) R(Σ_{i=1}^{NT} a_i b_i(x)) x^{a−1} dx,  j = 1, ..., NT   (28)

This equation defines the Galerkin method, and a solution that satisfies this equation (for all j = 1, ..., ∞) is called a weak solution. For an approximate solution the equation is written once for each member of the trial function, j = 1, ..., NT. If the boundary condition is

c(1) = c_B

then the boundary condition is used (instead of Eq. 28) for j = NT:

Σ_{i=1}^{NT} a_i b_i(1) = c_B

The Galerkin finite element method results when the Galerkin method is combined with a finite element trial function. Both linear and quadratic finite element approximations are described in Chapter 2. The trial functions b_i(x) are then generally written as N_i(x):

c(x) = Σ_{i=1}^{NT} c_i N_i(x)

Each N_i(x) takes the value 1 at the point x_i and zero at all other grid points (Chap. 2). Thus c_i are the nodal values, c(x_i) = c_i. The first derivative must be transformed to the local coordinate system, u = 0 to 1 when x goes from x_i to x_i + ∆x:

dN_J/dx = (1/∆x_e) dN_J/du,  dx = ∆x_e du

in the e-th element. Then the Galerkin method is

−Σ_e (1/∆x_e) Σ_{I=1}^{NP} [ ∫₀¹ (dN_J/du)(dN_I/du) (x_e + u∆x_e)^{a−1} du ] c_I^e − Bi_m Σ_e N_J(1) [ Σ_{I=1}^{NP} c_I^e N_I(1) − c_B ]
= φ² Σ_e ∆x_e ∫₀¹ N_J(u) R(Σ_{I=1}^{NP} c_I^e N_I(u)) (x_e + u∆x_e)^{a−1} du   (29)

The element integrals are defined as

B_JI^e = −(1/∆x_e) ∫₀¹ (dN_J/du)(dN_I/du) (x_e + u∆x_e)^{a−1} du
F_J^e = φ² ∆x_e ∫₀¹ N_J(u) R(Σ_{I=1}^{NP} c_I^e N_I(u)) (x_e + u∆x_e)^{a−1} du

whereas the boundary element integrals are

BB_JI^e = −Bi_m N_J(1) N_I(1),  FF_J^e = −Bi_m N_J(1) c_B

Then the entire method can be written in the compact notation

Σ_e B_JI^e c_I^e + Σ_e BB_JI^e c_I^e = Σ_e F_J^e + Σ_e FF_J^e

The matrices for various terms are given in Table 10. This equation can also be written in the form

AA c = f

where the matrix AA is sparse. If linear elements are used the matrix is tridiagonal; if quadratic elements are used it is pentadiagonal. Naturally the linear algebra is most efficient if the sparse structure is taken into account. Once the solution is found, the solution at any point can be recovered from

c^e(u) = c_{I=1}^e (1 − u) + c_{I=2}^e u

for linear elements, and

c^e(u) = c_{I=1}^e 2(u − 1)(u − ½) + c_{I=2}^e 4u(1 − u) + c_{I=3}^e 2u(u − ½)

for quadratic elements.
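For linear elements and the planar problem with a linear rate, the assembly described above reduces to a tridiagonal system. A sketch (Python/NumPy; illustrative, and using the Dirichlet variant c(1) = c_B = 1 instead of the flux condition to keep the example short):

```python
import numpy as np

def galerkin_linear(n_el=50, phi=1.0):
    """Galerkin method with linear finite elements (planar, a = 1) for
    c'' = phi^2 c, c'(0) = 0, c(1) = 1.  Element stiffness and mass
    matrices are summed into the sparse (tridiagonal) system AA c = f;
    the condition at x = 0 is natural and needs no special treatment."""
    n = n_el + 1
    h = 1.0 / n_el
    AA = np.zeros((n, n)); f = np.zeros(n)
    Ke = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h   # element stiffness
    Me = h * np.array([[2.0, 1.0], [1.0, 2.0]]) / 6.0  # element mass
    for e in range(n_el):
        idx = [e, e + 1]
        AA[np.ix_(idx, idx)] += Ke + phi**2 * Me
    AA[-1, :] = 0.0; AA[-1, -1] = 1.0; f[-1] = 1.0  # impose c(1) = 1
    return np.linspace(0.0, 1.0, n), np.linalg.solve(AA, f)
```

The exact solution of this linear problem is c = cosh(φx)/cosh(φ), so the O(∆x²) convergence claimed for linear elements can be checked directly by refining n_el.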

Table 10. Element matrices for Galerkin method

Because the integrals in Equation 28 may be complicated, they are usually formed by using Gaussian quadrature. If NG Gauss points are used, a typical term would be

∫₀¹ N_J(u) R(Σ_{I=1}^{NP} c_I^e N_I(u)) (x_e + u∆x_e)^{a−1} du ≈ Σ_{k=1}^{NG} W_k N_J(u_k) R(Σ_{I=1}^{NP} c_I^e N_I(u_k)) (x_e + u_k∆x_e)^{a−1}

7.7. Cubic B-Splines

Cubic B-splines have cubic approximations within each element, but first and second derivatives continuous between elements. The functions are the same ones discussed in Chapter 2, and they can be used to solve differential equations, too. See Sincovec [92].

7.8. Adaptive Mesh Strategies

In many two-point boundary value problems, the difficulty in the problem is the formation of a boundary layer region, or a region in which the solution changes very dramatically. In such cases small mesh spacing should be used there, either with the finite difference method or the finite element method. If the region is known a priori, small mesh spacings can be assumed at the boundary layer. If the region is not known, though, other techniques must be used. These techniques are known as adaptive mesh techniques. The general strategy is to estimate the error, which depends on the grid size and derivatives of the solution, and refine the mesh where the error is large.

The adaptive mesh strategy was employed by Ascher et al. [93] and by Russell and Christiansen [94]. For a second-order differential equation and cubic trial functions on finite elements, the error in the i-th element is given by

Error_i = c ∆x_i⁴ ‖u^(4)‖_i

Because cubic elements do not have a nonzero fourth derivative, the third derivative in adjacent elements is used [3, p. 166]:

a_i = (1/∆x_i³) d³c_i/du³,  a_{i+1} = (1/∆x_{i+1}³) d³c_{i+1}/du³

u^(4)_i ≈ ½ [ (a_i − a_{i−1}) / (½ (x_{i+1} − x_{i−1})) + (a_{i+1} − a_i) / (½ (x_{i+2} − x_i)) ]

Element sizes are then chosen so that the following error bounds are satisfied:

C ∆x_i⁴ u^(4)_i ≤ ε for all i

These features are built into the code COLSYS (http://www.netlib.org/ode/).

The error expected from a method one order higher and one order lower can also be defined. Then a decision about whether to increase or decrease the order of the method can be made by taking into account the relative work of the different orders. This provides a method of adjusting both the mesh spacing (∆x, sometimes called h) and the degree of polynomial (p). Such methods are called h–p methods.

7.9. Comparison

What method should be used for any given problem? Obviously the error decreases with some power of ∆x, and the power is higher for the higher-order methods, which suggests that the error is less. For example, with linear elements the error is

y(∆x) = y_exact + c_2 ∆x²

for small enough (and uniform) ∆x. A computer code should be run for varying ∆x to confirm this. For quadratic elements, the error is

y(∆x) = y_exact + c_3 ∆x³

If orthogonal collocation on finite elements is used with cubic polynomials, then the error is

y(∆x) = y_exact + c_4 ∆x⁴

However, the global methods, not using finite elements, converge even faster [95], for example,

y(N) = y_exact + c (1/N)^NCOL

Yet the workload of the methods is also different. These considerations are discussed in [3]. Here, only sweeping generalizations are given.

If the problem has a relatively smooth solution, then the orthogonal collocation method is preferred. It gives a very accurate solution, and N can be quite small so the work is small. If the problem has a steep front in it, the finite difference method or finite element method is indicated, and adaptive mesh techniques should probably be employed. Consider the reaction – diffusion problem: as the Thiele modulus φ increases from a small value with no diffusion limitations to a large value with significant diffusion limitations, the solution changes as shown in Figure 28. The orthogonal collocation method is initially the method of choice. For intermediate values of φ, N = 3 – 6 must be used, but orthogonal collocation still works well (for η down to approximately 0.01). For large φ, use of the finite difference method, the finite element method, or an asymptotic expansion for large φ is better. The decision depends entirely on the type of solution that is obtained. For steep fronts the finite difference method and finite element method with adaptive mesh are indicated.

Figure 28. Concentration solution for different values of Thiele modulus

7.10. Singular Problems and Infinite Domains

If the solution being sought has a singularity, a good numerical solution may be hard to find. Sometimes even the location of the singularity may not be known [96, pp. 230 – 238]. One method of solving such problems is to refine the mesh near the singularity, by relying on the better approximation due to a smaller ∆x. Another approach is to incorporate the singular trial function into the approximation. Thus, if the solution approaches f(x) as x goes to zero, and f(x) becomes infinite, an approximation may be taken as

y(x) = f(x) + Σ_{i=1}^{N} a_i y_i(x)

This function is substituted into the differential equation, which is solved for a_i. Essentially, a new differential equation is being solved for a new variable:

u(x) ≡ y(x) − f(x)

The differential equation is more complicated but has a better solution near the singularity (see [97, pp. 189 – 192], [98, p. 611]).

Sometimes the domain is infinite. Boundary layer flow past a flat plate is governed by the Blasius equation for stream function [99, p. 117]:

d³f/dη³ + f d²f/dη² = 0
f = df/dη = 0 at η = 0
df/dη = 1 at η → ∞

Because one boundary is at infinity, using a mesh with a constant size is difficult. One approach is to transform the domain. For example, let

z = e^{−η}

Then η = 0 becomes z = 1 and η = ∞ becomes z = 0. The derivatives are

dz/dη = −e^{−η} = −z,  d²z/dη² = e^{−η} = z

df/dη = (df/dz)(dz/dη) = −z df/dz

d²f/dη² = (d²f/dz²)(dz/dη)² + (df/dz)(d²z/dη²) = z² d²f/dz² + z df/dz

The Blasius equation becomes

−z³ d³f/dz³ − 3z² d²f/dz² − z df/dz + f (z² d²f/dz² + z df/dz) = 0  for 0 ≤ z ≤ 1

The differential equation now has variable coefficients, but these are no more difficult to handle than the original nonlinearities.
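The Blasius problem can also be solved by shooting with the initial value methods of Section 7.2: guess the missing condition f''(0), integrate, and adjust until df/dη → 1. A sketch (Python/NumPy; illustrative, using RK4 and bisection, with the infinite boundary replaced by a large finite η). For this form of the equation the converged value of f''(0) is approximately 0.4696:

```python
import numpy as np

def blasius_shoot(eta_max=10.0, n=1000):
    """Shooting solution of f''' + f f'' = 0, f(0) = f'(0) = 0,
    f'(inf) = 1: bisect on s = f''(0) until f'(eta_max) = 1."""
    def fprime_at_end(s):
        h = eta_max / n
        y = np.array([0.0, 0.0, s])            # [f, f', f'']
        def rhs(y):
            return np.array([y[1], y[2], -y[0] * y[2]])
        for _ in range(n):                      # classical RK4
            k1 = rhs(y); k2 = rhs(y + 0.5 * h * k1)
            k3 = rhs(y + 0.5 * h * k2); k4 = rhs(y + h * k3)
            y = y + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        return y[1]
    lo, hi = 0.1, 2.0                           # bracket for f''(0)
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if fprime_at_end(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Since f'(∞) increases monotonically with the guess s, bisection is safe here; as the text notes for backward shooting, such iterations need not converge for harder nonlinear problems.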

Another approach is to use a variable mesh, perhaps with the same transformation. For example, use z = e^{−η} and a constant mesh size in z. Then with 101 points distributed uniformly from z = 0 to z = 1, the nodal points are

z = 0, 0.01, 0.02, ..., 0.99, 1.0
η = ∞, 4.605, 3.912, ..., 0.010, 0
∆η = ∞, 0.693, ..., 0.01

Still another approach is to solve on a finite mesh in which the last point is far enough away that its location does not influence the solution. A location that is far enough away must be found by trial and error.

This can happen in the case of nonlinear problems; an example is a compressible flow problem with both subsonic and supersonic regions. Characteristic curves are curves along which a discontinuity can propagate. For a given set of equations, it is necessary to determine if characteristics exist or not, because that determines whether the equations are hyperbolic, elliptic, or parabolic.

Linear Problems. For linear problems, the theory summarized by Joseph et al. [105] can be used. Each derivative

∂/∂t, ∂/∂x_1, ..., ∂/∂x_n

is replaced with the Fourier variables
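The graded mesh produced by this mapping can be generated in a few lines (Python/NumPy sketch; the point z = 0 maps to η = ∞ and must be treated specially):

```python
import numpy as np

# A uniform mesh in z = exp(-eta) gives a mesh in eta that is fine
# near the wall (eta = 0) and coarse far from it.
z = np.linspace(0.0, 1.0, 101)
eta = np.full_like(z, np.inf)
eta[1:] = -np.log(z[1:])        # z = 0 corresponds to eta = infinity
```

The first few values reproduce the nodal points quoted above (η = 4.605 at z = 0.01, η = 3.912 at z = 0.02), and successive spacings ∆η shrink from ln 2 ≈ 0.693 down to about 0.01 near η = 0.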

iξ_0, iξ_1, ..., iξ_n

8. Partial Differential Equations

α α P = aα∂ + bα∂ Partial differential equations are differential |α|=m |α|

σ_1²/a² + σ_2²/b² = 1,  Ellipse

8.1. Classification of Equations 2 σ0 = aσ1 , Parabola A set of differential equations may be hyper- σ2−aσ2 =0, Hyperbola bolic, elliptic, or parabolic, or it may be of mixed 0 1 type. The type may change for different param- If Equation 30 has no nontrivial real zeroes then eters or in different regions of the flow. This can the equation is called elliptic. If all the roots are 76 Mathematics in Chemical Engineering real and distinct (excluding zero) then the oper- The equation is thus second order and the type ator is hyperbolic. is determined by This formalism is applied to three basic types −βσ2+σ2 =0 of equations. First consider the equation arising 0 1 from steady diffusion in two dimensions: The normalization condition ∂2c ∂2c σ2+σ2 =1 + =0 0 1 ∂x2 ∂y2 is required. Combining these gives This gives 1− (1+β) σ2 =0 0 −ξ2−ξ2 = − ξ1+ξ2 =0 1 2 2 2 The roots are real and the equation is hyperbolic. Thus, When β =0

2 2 ξ2 =0 σ1 +σ2 =1(normalization) 1 and the equation is parabolic. σ2+σ2 =0(equation) 1 2 First-order quasi-linear problems are written These cannot both be satisfied so the problem is in the form elliptic. When the equation is n A ∂u = f, x =(t,x . . .,x ) 1 ∂x 1 n ∂2u ∂2u l=0 1 (31) − =0 ∂t2 ∂x2 u =(u1,u2,. . .,uk) then The matrix entries A1 is a k×k matrix whose entries depend on u but not on derivatives of u. −ξ2+ξ2 =0 0 1 Equation 31 is hyperbolic if ξ0 Now real can be solved and the equation is A=Aµ hyperbolic is nonsingular and for any choice of real λ1, l = σ2+σ2 =1( ) 0 1 normalization 0,...,n, l = µ the roots αk of   −σ2+σ2 =0(equation) 0 1      n  When the equation is   det αA− λ1A1 =0     ∂c ∂2c ∂2c  l =0  = D + ∂t ∂x2 ∂y2 l=µ then are real. If the roots are complex the equation σ2+σ2+σ2 =1(normalization) is elliptic; if some roots are real and some are 0 1 2 complex the equation is of mixed type. 2 2 σ1 +σ2 =0(equation) Apply these ideas to the advection equation ∂u ∂u thus we get +F (u) =0 ∂t ∂x 2 σ0 =1(for normalization) Thus, and the characteristic surfaces are hyperplanes det (αA0−λ1A1)=0or det (αA1−λ0A0)=0 with t = constant. This is a parabolic case. In this case, Consider next the telegrapher’s equation: n =1, A =1, A = F (u) ∂T ∂2T ∂2T 0 1 +β = ∂t ∂t2 ∂x2 Using the first of the above equations gives

Replacing the derivatives with the Fourier vari- det (α−λ1F (u))=0, or α = λ1F (u) ables gives Thus, the roots are real and the equation is hy- 2 2 iξ0−βξ0 +ξ1 =0 perbolic. Mathematics in Chemical Engineering 77

The final example is the heat conduction where D is a diffusion coefficient. Special cases problem written as are the convective diffusive equation ∂T ∂q ∂T ∂c ∂c ∂2c ?Cp = − ,q= −k +u = D (33) ∂t ∂x ∂x ∂t ∂x ∂x2 In this formulation the constitutive equation for and Burgers viscosity equation heat flux is separated out; the resulting set of ∂u ∂u ∂2u +u = ν (34) equations is first order and written as ∂t ∂x ∂x2

∂T ∂q ν ?Cp + =0 where u is the velocity and is the kinematic vis- ∂t ∂x cosity. This is a prototype equation for the Navier k ∂T = −q → ∂x – Stokes equations ( Fluid Mechanics). For adsorption phenomena [106, p. 202], In matrix notation this is ∂c ∂c df ∂c & '& ' & '& ' & ' ∂T ∂T ϕ +ϕu +(1−ϕ) =0 (35) ?Cp 0 01 0 ∂t ∂x dc ∂t ∂t + ∂x = 00∂q k 0 ∂q −q ϕ ∂t ∂t where is the void fraction and f (c) gives the equilibrium relation between the concentrations This compares with in the fluid and in the solid phase. In these exam- ∂u ∂u ples, if the diffusion coefficient D or the kine- A0 +A1 = f ∂x0 ∂x1 matic viscosity ν is zero, the equations are hyper- bolic. If D and ν are small, the phenomenon may In this case A0 is singular whereas A1 is nonsin- be essentially hyperbolic even though the equa- gular. Thus, tions are parabolic. Thus the numerical methods det (αA1−λ0A0)=0 for hyperbolic equations may be useful even for parabolic equations. is considered for any real λ0. This gives Equations for several methods are given here,      −?C λ α  as taken from [107]. If the convective term is  p 0  =0  kα 0  treated with a centered difference expression the solution exhibits oscillations from node to node, or and these vanish only if a very fine grid is used. The simplest way to avoid the oscillations with α2k =0 a hyperbolic equation is to use upstream deriva- Thus the α is real, but zero, and the equation is tives. If the flow is from left to right, this would parabolic. give the following for Equations (40): dc F (c ) −F (c ) c −2c +c i + i i−1 = D i+1 i i−1 dt ∆x ∆x2 8.2. Hyperbolic Equations for Equation 34: du u −u u −2u +u i +u i i−1 = ν i+1 i i−1 The most common situation yielding hyperbolic dt i ∆x ∆x2 equations involves unsteady phenomena with and for Equation 35: convection. 
A prototype equation is dci ci−ci−1 df dci ∂c ∂F (c) ϕ +ϕui +(1−ϕ) | =0 + =0 dt ∆x dc i dt ∂t ∂x If the flow were from right to left, then the for- Depending on the interpretation of c and F (c), mula would be this can represent accumulation of mass and con- dc F (c ) −F (c ) c −2c +c i + i+1 i = D i+1 i i−1 vection. With F (c)=uc, where u is the veloc- dt ∆x ∆x2 ity, the equation represents a mass balance on concentration. If diffusive phenomenon are im- If the flow could be in either direction, a local portant, the equation is changed to determination must be made at each node i and the appropriate formula used. The effect of us- ∂c ∂F (c) ∂2c + = D (32) ing upstream derivatives is to add artificial or ∂t ∂x ∂x2 numerical diffusion to the model. This can be 78 Mathematics in Chemical Engineering ascertained by taking the finite difference form Leaving out the u2 ∆t2 terms gives the Galerkin of the convective diffusion equation method. Replacing the left-hand side with dci ci−ci−1 ci+1−2ci+ci−1 +u = D cn+1−cn dt ∆x ∆x2 i i and rearranging gives the Taylor finite difference method, and c −c 2 2 dci +u i+1 i−1 dropping the u ∆t terms in that gives the dt 2∆x   centered finite difference method. This method = D+ u∆x ci+1−2ci+ci−1 2 ∆x2 might require a small time step if reaction phenomena are important. Then the implicit Thus the diffusion coefficient has been changed Galerkin method (without the Taylor terms) is from appropriate u∆x D to D+ A stability diagram for the explicit methods 2 applied to the convective diffusion equation is Expressed in terms of the cell Peclet num- shown in Figure 29. Notice that all the methods Pe =u∆x/D ber, ∆ , this is D is changed to require D[1+Pe∆/2] u∆t The cell Peclet number should always be cal- Co = ≤1 culated as a guide to the calculations. Using a ∆x large cell Peclet number and upstream deriva- where Co is the Courant number. 
How much Co tives leads to excessive (and artificial) smooth- should be less than one depends on the method ing of the solution profiles. and on r = D ∆t/∆x2, as given in Figure 29. Another method often used for hyperbolic The MacCormack method with flux correction equations is the MacCormack method. This requires a smaller time step than the MacCor- method has two steps; it is written here for Equa- mack method alone (curve a), and the implicit tion 33. Galerkin method (curve e) is stable for all values   c∗n+1 = cn− u∆t cn −cn of Co and r shown in Figure 29 (as well as even i i ∆x i+1 i   larger values). + ∆tD cn −2cn+cn ∆x2 i+1 i i−1     cn+1 = 1 cn+c∗n+1 − u∆t c∗n+1−c∗n+1 i 2 i i 2∆x i i−1   + ∆tD c∗n+1−2c∗n+1+c∗n+1 2∆x2 i+1 i i−1 The concentration profile is steeper for the Mac- Cormack method than for the upstream deriva- tives, but oscillations can still be present. The flux-corrected transport method can be added to the MacCormack method. A solution is obtained both with the upstream algorithm and the Mac- Cormack method; then they are combined to add just enough diffusion to eliminate the oscilla- tions without smoothing the solution too much. The algorithm is complicated and lengthy but well worth the effort [107 – 109]. If finite element methods are used, an explicit Figure 29. Stability diagram for convective diffusion equa- Taylor – Galerkin method is appropriate. For the tion (stable below curve) convective diffusion equation the method is a) MacCormack; b) Centered finite difference; c) Taylor       finite difference; d) Upstream; e) Galerkin; f) Taylor – 1 cn+1−cn + 2 cn+1−cn + 1 cn+1−cn Galerkin 6 i+1 i+1 3 i i 6 i−1 i−1     2 2 = − u∆t cn −cn + ∆tD + u ∆t cn 2∆x i+1 i−1 ∆x2 2∆x2 i+1 Each of these methods tries to avoid oscil-  lations that would disappear if the mesh were n n −2ci +ci−1 Mathematics in Chemical Engineering 79

For the steady convective diffusion equation these oscillations do not occur provided

Pe_Δ/2 = uΔx/2D < 1   (36)

For large u, Δx must be small to meet this condition. An alternative is to use a small Δx in regions where the solution changes drastically. Because these regions change in time, the elements or grid points must move. The criteria to move the grid points can be quite complicated, and typical methods are reviewed in [107]. The criteria include moving the mesh in a known way (when the movement is known a priori), moving the mesh to keep some property (e.g., first- or second-derivative measures) uniform over the domain, using a Galerkin or weighted residual criterion to move the mesh, and Euler – Lagrange methods which move part of the solution exactly by convection and then add on some diffusion after that.

The final illustration is for adsorption in a packed bed, or chromatography. Equation 35 can be solved when the adsorption phenomenon is governed by a Langmuir isotherm,

f(c) = αc/(1 + Kc)

Similar numerical considerations apply and similar methods are available [110 – 112].

8.3. Parabolic Equations in One Dimension

In this section several methods are applied to parabolic equations in one dimension: separation of variables, combination of variables, the finite difference method, the finite element method, and the orthogonal collocation method. Separation of variables is successful for linear problems, whereas the other methods work for linear or nonlinear problems. The finite difference, the finite element, and the orthogonal collocation methods are numerical, whereas the separation or combination of variables can lead to analytical solutions.

Analytical Solutions. Consider the diffusion equation

∂c/∂t = D ∂²c/∂x²

with boundary and initial conditions

c(x, 0) = 0
c(0, t) = 1,  c(L, t) = 0

A solution of the form

c(x, t) = T(t) X(x)

is attempted and substituted into the equation, with the terms separated to give

(1/DT) dT/dt = (1/X) d²X/dx²

One side of this equation is a function of x alone, whereas the other side is a function of t alone. Thus, both sides must be a constant; otherwise, if x is changed one side changes, but the other cannot because it depends on t. Call the constant −λ and write the separate equations

dT/dt = −λDT,  d²X/dx² = −λX

The first equation is solved easily,

T(t) = T(0) e^{−λDt}

and the second equation is written in the form

d²X/dx² + λX = 0

Next consider the boundary conditions. If they are written as

c(0, t) = 1 = T(t) X(0)
c(L, t) = 0 = T(t) X(L)

the boundary conditions are difficult to satisfy because they are not homogeneous, i.e., with a zero right-hand side. Thus, the problem must be transformed to make the boundary conditions homogeneous. The solution is written as the sum of two functions, one of which satisfies the nonhomogeneous boundary conditions, whereas the other satisfies the homogeneous boundary conditions:

c(x, t) = f(x) + u(x, t)
u(0, t) = 0
u(L, t) = 0

Thus, f(0) = 1 and f(L) = 0 are necessary. Now the combined function satisfies the boundary conditions. In this case the function f(x) can be taken as

f(x) = 1 − x/L

The equation for u is found by substituting for c in the original equation and noting that f(x) drops out for this case; it need not disappear in the general case:

∂u/∂t = D ∂²u/∂x²

The boundary conditions for u are

u(0, t) = 0
u(L, t) = 0

The initial condition for u is found from the initial condition on c:

u(x, 0) = c(x, 0) − f(x) = x/L − 1

Separation of variables is now applied to this equation by writing

u(x, t) = T(t) X(x)

The same equations for T(t) and X(x) are obtained, but with X(0) = X(L) = 0:

d²X/dx² + λX = 0
X(0) = X(L) = 0

Next X(x) is solved for. The equation is an eigenvalue problem. The general solution is obtained by using e^{mx} and finding that m² + λ = 0; thus m = ± i√λ. The exponential terms e^{± i√λ x} are written in terms of sines and cosines, so that the general solution is

X = B cos √λ x + E sin √λ x

The boundary conditions are

X(0) = B = 0
X(L) = B cos √λ L + E sin √λ L = 0

With B = 0, a nonzero E is required to have any solution at all. Thus, λ must satisfy

sin √λ L = 0

This is true for certain values of λ, called eigenvalues or characteristic values. Here, they are

λ_n = n²π²/L²

Each eigenvalue has a corresponding eigenfunction

X_n(x) = E sin (nπx/L)

The composite solution is then

X_n(x) T_n(t) = E A sin (nπx/L) e^{−λ_n D t}

This function satisfies the boundary conditions and differential equation but not the initial condition. To make the function satisfy the initial condition, several of these solutions are added up, each with a different eigenfunction, and E A is replaced by A_n:

u(x, t) = Σ_{n=1}^∞ A_n sin (nπx/L) e^{−n²π²Dt/L²}

The constants A_n are chosen by making u(x, t) satisfy the initial condition:

u(x, 0) = Σ_{n=1}^∞ A_n sin (nπx/L) = x/L − 1

The residual R(x) is defined as the error in the initial condition:

R(x) = x/L − 1 − Σ_{n=1}^∞ A_n sin (nπx/L)

Next, the Galerkin method is applied, and the residual is made orthogonal to a complete set of functions, which are the eigenfunctions:

∫₀ᴸ (x/L − 1) sin (mπx/L) dx = Σ_{n=1}^∞ A_n ∫₀ᴸ sin (mπx/L) sin (nπx/L) dx = A_m L/2

The Galerkin criterion for finding A_n is the same as the least-squares criterion [3, p. 183]. The solution is then

c(x, t) = 1 − x/L + Σ_{n=1}^∞ A_n sin (nπx/L) e^{−n²π²Dt/L²}

This is an “exact” solution to the linear problem.
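The series can be checked numerically. Carrying out the orthogonality integral for the initial condition x/L − 1 gives A_n = −2/(nπ); this closed form is derived here rather than stated in the text, so treat it as a worked assumption. A minimal sketch:

```python
import numpy as np

def c_series(x, t, D=1.0, L=1.0, nterms=200):
    """Separation-of-variables solution
    c = 1 - x/L + sum_n A_n sin(n*pi*x/L) exp(-n^2*pi^2*D*t/L^2),
    with A_n = -2/(n*pi) from the orthogonality integral."""
    n = np.arange(1, nterms + 1)[:, None]     # column of mode numbers
    A = -2.0 / (n * np.pi)
    modes = A * np.sin(n * np.pi * x / L) * np.exp(-n**2 * np.pi**2 * D * t / L**2)
    return 1.0 - x / L + modes.sum(axis=0)

x = np.linspace(0.0, 1.0, 11)
early = c_series(x, 1e-6)   # should be close to the initial condition c = 0
late = c_series(x, 10.0)    # should be close to the steady profile 1 - x/L
```

As the text notes, a single term suffices at large times (the sum is negligible there), while small times need many terms.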
It can be evaluated to any desired accuracy by taking more and more terms, but if a finite number of terms are used, some error always occurs. For large times a single term is adequate, whereas for small times many terms are needed. For small times the Laplace transform method is also useful, because it leads to solutions that converge with fewer terms. For small times, the method of combination of variables may be used as well. For nonlinear problems, the method of separation of variables fails and one of the other methods must be used.

The method of combination of variables is useful, particularly when the problem is posed in a semi-infinite domain. Here, only one example is provided; more detail is given in [3, 113, 114]. The method is applied here to the nonlinear problem

∂c/∂t = ∂/∂x [D(c) ∂c/∂x] = D(c) ∂²c/∂x² + (dD(c)/dc)(∂c/∂x)²

with boundary and initial conditions

c(x, 0) = 0
c(0, t) = 1,  c(∞, t) = 0

The transformation combines two variables into one:

c(x, t) = f(η) where η = x/√(4D₀t)

The use of the 4 and D₀ makes the analysis below simpler. For constant diffusivity, D(c) = D₀, the solution is

c(x, t) = 1 − erf η = erfc η

erf η = ∫₀^η e^{−ξ²} dξ / ∫₀^∞ e^{−ξ²} dξ

This is a tabulated function [23].

Numerical Methods. Numerical methods are applicable to both linear and nonlinear problems on finite and semi-infinite domains. The finite difference method is applied by using the method of lines [115]. In this method the same equations are used for the spatial variations of the function, but the function at a grid point can vary with time. Thus the linear diffusion problem is written as

dc_i/dt = D (c_{i+1} − 2c_i + c_{i−1})/Δx²   (37)

This can be written in the general form

dc/dt = AA c
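For constant diffusivity the combined-variable solution is the complementary error function, which can be evaluated directly with the standard library. A sketch, assuming D(c) = D₀ constant (the arguments below are arbitrary demo values):

```python
import math

def c_similarity(x, t, D0=1.0):
    """Combination-of-variables solution for constant D:
    c(x, t) = erfc(eta), eta = x / sqrt(4 * D0 * t)."""
    eta = x / math.sqrt(4.0 * D0 * t)
    return math.erfc(eta)

surface = c_similarity(0.0, 0.5)    # boundary value c(0, t) = 1
far = c_similarity(10.0, 0.5)       # far-field value, essentially 0
```

The profile depends on x and t only through η, which is what makes a similarity solution possible on the semi-infinite domain.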
The equation for c(x, t) is transformed into an equation for f(η):

∂c/∂t = (df/dη)(∂η/∂t),  ∂c/∂x = (df/dη)(∂η/∂x)
∂²c/∂x² = (d²f/dη²)(∂η/∂x)² + (df/dη)(∂²η/∂x²)
∂η/∂t = −(x/2)/√(4D₀t³),  ∂η/∂x = 1/√(4D₀t),  ∂²η/∂x² = 0

The result is

d/dη [K(c) df/dη] + 2η df/dη = 0,  K(c) = D(c)/D₀

The boundary conditions must also combine. In this case the variable η is infinite when either x is infinite or t is zero. Note that the boundary conditions on c(x, t) are both zero at those points. Thus, the boundary conditions can be combined to give

f(0) = 1,  f(η → ∞) = 0

The set of ordinary differential equations obtained from the method of lines can be solved by using any of the standard methods. The stability of explicit schemes is deduced from the theory presented in Chapter 6. The equations are written as

dc_i/dt = D (c_{i+1} − 2c_i + c_{i−1})/Δx² = Σ_{j=1}^n B_ij c_j

where the matrix B is tridiagonal. The stability of the integration of these equations is governed by the largest eigenvalue of B. If Euler's method is used for integration, the time step is limited by

Δt ≤ 2/|λ|_max

The largest eigenvalue of B is bounded by the Gerschgorin theorem [14, p. 135]:

|λ|_max ≤ max_i Σ_j |B_ij| = 4D/Δx²

so the explicit Euler limit becomes DΔt/Δx² ≤ 1/2.
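The Gerschgorin bound can be confirmed numerically for the tridiagonal matrix B; the grid size and diffusivity below are arbitrary demo values:

```python
import numpy as np

n, D, dx = 50, 1.0, 0.01
# Tridiagonal method-of-lines matrix B for dc/dt = B c (interior points,
# zero Dirichlet ends folded into the stencil)
B = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) * D / dx**2

eigs = np.linalg.eigvalsh(B)
bound = 4.0 * D / dx**2                # Gerschgorin: |lambda|max <= 4 D / dx^2
dt_euler = 2.0 / np.abs(eigs).max()    # explicit Euler: dt <= 2 / |lambda|max
```

All eigenvalues are real and negative (B is symmetric), and the largest magnitude approaches, but never exceeds, the Gerschgorin value 4D/Δx².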

Implicit methods can also be used. Write a finite difference form for the time derivative and average the right-hand sides, evaluated at the old and new times:

(c_i^{n+1} − c_i^n)/Δt = D(1 − θ)(c_{i+1}^n − 2c_i^n + c_{i−1}^n)/Δx² + Dθ(c_{i+1}^{n+1} − 2c_i^{n+1} + c_{i−1}^{n+1})/Δx²

Now the equations are of the form

−(DΔtθ/Δx²) c_{i+1}^{n+1} + (1 + 2DΔtθ/Δx²) c_i^{n+1} − (DΔtθ/Δx²) c_{i−1}^{n+1}
  = c_i^n + (DΔt(1 − θ)/Δx²)(c_{i+1}^n − 2c_i^n + c_{i−1}^n)

and require solving a set of simultaneous equations, which have a tridiagonal structure. Using θ = 0 gives the Euler method (as above); θ = 0.5 gives the Crank – Nicolson method; θ = 1 gives the backward Euler method. The stability limit is given by

DΔt/Δx² ≤ 0.5/(1 − 2θ)

whereas the oscillation limit is given by

DΔt/Δx² ≤ 0.25/(1 − θ)

If a time step is chosen between the oscillation limit and stability limit, the solution will oscillate around the exact solution, but the oscillations remain bounded. For further discussion, see [3, p. 218].

Finite volume methods are utilized extensively in computational fluid dynamics. In this method, a mass balance is made over a cell, accounting for the change in what is in the cell and the flow in and out. Figure 30 illustrates the geometry of the i-th cell. A mass balance made on this cell (with area A perpendicular to the paper) is

A Δx (c_i^{n+1} − c_i^n) = Δt A (J_{i−1/2} − J_{i+1/2})

where J is the flux due to convection and diffusion, positive in the +x direction:

J = uc − D ∂c/∂x,  J_{i−1/2} = u c_{i−1/2} − D (c_i − c_{i−1})/Δx

The concentration at the edge of the cell is taken as

c_{i−1/2} = (c_i + c_{i−1})/2

Substitution gives

(c_i^{n+1} − c_i^n)/Δt + u (c_{i+1} − c_{i−1})/(2Δx) = D (c_{i+1} − 2c_i + c_{i−1})/Δx²

This is the same equation as obtained using the finite difference method. This is not always true, and the finite volume equations are easy to derive. In two and three dimensions, the mesh need not be rectangular, as long as it is possible to compute the velocity normal to an edge of the cell. The finite volume method is useful for applications involving filling, such as injection molding, when only part of the cell is filled with fluid. Such applications do involve some approximations, since the interface is not tracked precisely, but they are useful engineering approximations.

Figure 30. [Geometry of the i-th cell for the finite volume method]

The finite element method is handled in a similar fashion, as an extension of two-point boundary value problems, by letting the solution at the nodes depend on time. For the diffusion equation the finite element method gives

Σ_e Σ_I C^e_JI dc^e_I/dt = Σ_e Σ_I B^e_JI c^e_I

with the mass matrix defined by

C^e_JI = Δx_e ∫₀¹ N_J(u) N_I(u) du

This set of equations can be written in matrix form

CC dc/dt = AA c

Now the matrix CC is not diagonal, so that a set of equations must be solved for each time step, even when the right-hand side is evaluated explicitly. This is not as time-consuming as it seems, however. The explicit scheme is written as

CC_ji (c_i^{n+1} − c_i^n)/Δt = AA_ji c_i^n

and rearranged to give

CC_ji (c_i^{n+1} − c_i^n) = Δt AA_ji c_i^n
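The θ-method can be exercised on a decaying sine mode, for which the exact solution of the diffusion equation is known. A sketch, using a dense numpy solve rather than a tridiagonal algorithm for brevity (the grid and step sizes are demo choices):

```python
import numpy as np

def theta_step(c, D, dt, dx, theta):
    """One step of the theta method for dc/dt = D d2c/dx2 with zero
    Dirichlet values outside the unknowns.
    theta = 0: Euler, 0.5: Crank-Nicolson, 1: backward Euler."""
    n = len(c)
    r = D * dt / dx**2
    T = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1))          # second-difference matrix
    lhs = np.eye(n) - theta * r * T
    rhs = c + (1.0 - theta) * r * (T @ c)
    return np.linalg.solve(lhs, rhs)

n = 49
dx = 1.0 / (n + 1)
x = dx * np.arange(1, n + 1)
c = np.sin(np.pi * x)            # exact solution decays as exp(-pi^2 D t)
D, dt = 1.0, 0.01                # r = 25, far beyond the explicit limit,
c_cn = c.copy()                  # yet Crank-Nicolson remains stable
for _ in range(10):
    c_cn = theta_step(c_cn, D, dt, dx, theta=0.5)
```

After t = 0.1 the profile should be close to e^{−π²·0.1} sin(πx), illustrating that the implicit scheme tolerates time steps the explicit Euler limit would forbid.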

CC (c^{n+1} − c^n) = Δt AA c^n

This is solved with an LU decomposition (see Section 1.1) that retains the structure of the mass matrix CC. Thus,

CC = LU

At each step, calculate

c^{n+1} − c^n = Δt U^{−1} L^{−1} AA c^n

This is quick and easy to do because applying the inverses of L and U amounts to simple triangular solves. Thus the problem is reduced to factoring one matrix and then evaluating the solution for multiple right-hand sides. For implicit methods the same approach can be used, and the LU decomposition remains fixed until the time step is changed.

The method of orthogonal collocation uses a similar extension: the same polynomial of x is used but now the coefficients depend on time:

∂c/∂t |_{x_j} = dc(x_j, t)/dt = dc_j/dt

Thus, for diffusion problems

dc_j/dt = Σ_{i=1}^{N+2} B_ji c_i,  j = 2, . . ., N+1

This can be integrated by using the standard methods for ordinary differential equations as initial value problems. Stability limits for explicit methods are available [3, p. 204]. The method of orthogonal collocation on finite elements can also be used, and details are provided elsewhere [3, pp. 228 – 230].

The maximum eigenvalue for all the methods is given by

|λ|_max = L_B D/Δx²   (38)

where the values of L_B are as follows:

Finite difference: 4
Galerkin, linear elements, lumped: 4
Galerkin, linear elements: 12
Galerkin, quadratic elements: 60
Orthogonal collocation on finite elements, cubic: 36

Spectral methods employ the discrete Fourier transform (see Chap. 2) and Chebyshev polynomials on rectangular domains [116]. In the Chebyshev collocation method, N + 1 collocation points are used:

x_j = cos (πj/N),  j = 0, 1, . . ., N

As an example, consider the equation

∂u/∂t + f(u) ∂u/∂x = 0

An explicit method in time can be used,

(u^{n+1} − u^n)/Δt + f(u^n) ∂u^n/∂x = 0

and evaluated at each collocation point:

(u_j^{n+1} − u_j^n)/Δt + f(u_j^n)(∂u^n/∂x)|_j = 0

The trial function is taken as

u_j(t) = Σ_{p=0}^N a_p(t) cos (πpj/N),  u_j^n = u_j(t^n)   (39)

Assume that the values u_j^n exist at some time. Then invert Equation 39 using the fast Fourier transform to obtain {a_p} for p = 0, 1, . . ., N; then calculate S_p:

S_N = 0,  S_{N+1} = 0
S_p = S_{p+2} + (p + 1) a_{p+1},  0 ≤ p ≤ N − 1

and finally

a_p^{(1)} = 2 S_p/c_p

(with c₀ = 2 and c_p = 1 for p ≥ 1, the usual Chebyshev convention). Thus, the first derivative is given by

(∂u/∂x)|_j = Σ_{p=0}^N a_p^{(1)}(t) cos (πpj/N)

This is evaluated at the set of collocation points by using the fast Fourier transform again. Once the function and the derivative are known at each collocation point, the solution can be advanced forward to the (n + 1)-th time level.

The advantage of the spectral method is that it is very fast and can be adapted quite well to parallel computers. It is, however, restricted in the geometries that can be handled.
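The S_p recurrence can be verified on a polynomial whose Chebyshev expansion is known: u = x² = (T₀ + T₂)/2 has derivative 2x = 2T₁. The c₀ = 2, c_p = 1 convention is assumed, as above:

```python
import numpy as np

def chebyshev_derivative_coeffs(a):
    """Coefficients a_p^(1) of du/dx from Chebyshev coefficients a_p of u,
    via S_N = S_{N+1} = 0, S_p = S_{p+2} + (p+1) a_{p+1},
    a_p^(1) = 2 S_p / c_p with c_0 = 2, c_p = 1 otherwise."""
    N = len(a) - 1
    S = np.zeros(N + 2)
    for p in range(N - 1, -1, -1):   # sweep downward from p = N-1
        S[p] = S[p + 2] + (p + 1) * a[p + 1]
    c = np.ones(N + 1)
    c[0] = 2.0
    return 2.0 * S[: N + 1] / c

# u = x^2 = (T0 + T2)/2, so du/dx = 2x = 2 T1
a = np.array([0.5, 0.0, 0.5, 0.0, 0.0])
a1 = chebyshev_derivative_coeffs(a)
```

In the full collocation method this recurrence sits between the two FFTs: one to obtain {a_p} from the nodal values, one to evaluate the derivative series at the nodes.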

8.4. Elliptic Equations

Elliptic equations can be solved with both finite difference and finite element methods. One-dimensional elliptic problems are two-point boundary value problems and are covered in Chapter 7. Two-dimensional elliptic problems are often solved with direct methods, and iterative methods are usually used for three-dimensional problems. Thus, two aspects must be considered: how the equations are discretized to form sets of algebraic equations and how the algebraic equations are then solved.

The prototype elliptic problem is steady-state heat conduction or diffusion,

k (∂²T/∂x² + ∂²T/∂y²) = Q

possibly with a heat generation term per unit volume, Q. The boundary conditions can be

Dirichlet or 1st kind: T = T₁ on boundary S₁
Neumann or 2nd kind: k ∂T/∂n = q₂ on boundary S₂
Robin, mixed, or 3rd kind: −k ∂T/∂n = h(T − T₃) on boundary S₃

Illustrations are given for constant physical properties k, h, while T₁, q₂, T₃ are known functions on the boundary and Q is a known function of position. For clarity, only a two-dimensional problem is illustrated. The finite difference formulation is given by using the following nomenclature:

T_{i,j} = T(iΔx, jΔy)

The finite difference formulation is then

(T_{i+1,j} − 2T_{i,j} + T_{i−1,j})/Δx² + (T_{i,j+1} − 2T_{i,j} + T_{i,j−1})/Δy² = Q_{i,j}/k   (40)

T_{i,j} = T₁ for i, j on boundary S₁
k (∂T/∂n)|_{i,j} = q₂ for i, j on boundary S₂
−k (∂T/∂n)|_{i,j} = h (T_{i,j} − T₃) for i, j on boundary S₃

If the boundary is parallel to a coordinate axis, the boundary slope is evaluated as in Chapter 7, by using either a one-sided or centered difference or a false boundary. If the boundary is more irregular and not parallel to a coordinate line, more complicated expressions are needed and the finite element method may be the better method.

Equation 40 is rewritten in the form

2(1 + Δx²/Δy²) T_{i,j} = T_{i+1,j} + T_{i−1,j} + (Δx²/Δy²)(T_{i,j+1} + T_{i,j−1}) − Δx² Q_{i,j}/k

And this is converted to the Gauss – Seidel iterative method:

2(1 + Δx²/Δy²) T_{i,j}^{s+1} = T_{i+1,j}^s + T_{i−1,j}^{s+1} + (Δx²/Δy²)(T_{i,j+1}^s + T_{i,j−1}^{s+1}) − Δx² Q_{i,j}/k

Calculations proceed by setting a low i, computing from low to high j, then increasing i and repeating the procedure. The relaxation method uses

2(1 + Δx²/Δy²) T*_{i,j} = T_{i+1,j}^s + T_{i−1,j}^{s+1} + (Δx²/Δy²)(T_{i,j+1}^s + T_{i,j−1}^{s+1}) − Δx² Q_{i,j}/k

T_{i,j}^{s+1} = T_{i,j}^s + β (T*_{i,j} − T_{i,j}^s)

If β = 1, this is the Gauss – Seidel method. If β > 1, it is overrelaxation; if β < 1, it is underrelaxation. The value of β may be chosen empirically, 0 < β < 2, but it can be selected theoretically for simple problems like this [117, p. 100], [3, p. 282]. In particular, the optimal value of the iteration parameter is given by

−ln (β_opt − 1) ≈ R

and the error (in solving the algebraic equation) is decreased by the factor (1 − R)^N for every N iterations. For the heat conduction problem and Dirichlet boundary conditions,

R = π²/2n²

(when there are n points in both x and y directions). For Neumann boundary conditions, the value is

R = (π²/2n²) · 1/[1 + max (Δx²/Δy², Δy²/Δx²)]

Iterative methods can also be based on lines (for 2D problems) or planes (for 3D problems).

Preconditioned conjugate gradient methods have been developed (see Chap. 1). In these methods a series of matrix multiplications are done iteration by iteration, and the steps lend themselves to the efficiency available in parallel computers. In the multigrid method the problem is solved on several grids, each more refined than the previous one. In iterating between the solutions on different grids, one converges to the solution of the algebraic equations. A chemical engineering application is given in [118]. Software for a variety of these methods is available, as described below.

The Galerkin finite element method (FEM) is useful for solving elliptic problems and is particularly effective when the domain or geometry is irregular [119 – 125]. The resulting equations represent a large set of linear equations, which are solved using matrix techniques (Chap. 1). If the problem is nonlinear, e.g., with k or Q a function of temperature, the equations must be solved iteratively. The integrals are given below for a triangle with nodes I, J, and K in counterclockwise order. Within an element,

T = N_I(x, y) T_I + N_J(x, y) T_J + N_K(x, y) T_K

N_I = (a_I + b_I x + c_I y)/2Δ

a_I = x_J y_K − x_K y_J
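The relaxation scheme can be sketched for Laplace's equation (Q = 0, Δx = Δy) on a unit square whose exact solution is the linear profile T = x; the grid size and the over-relaxation factor β = 1.8 are arbitrary demo choices:

```python
import numpy as np

def sor_laplace(T, beta, sweeps):
    """Gauss-Seidel with relaxation for Laplace's equation (Q = 0,
    dx = dy); boundary values of T are held fixed."""
    for _ in range(sweeps):
        for i in range(1, T.shape[0] - 1):
            for j in range(1, T.shape[1] - 1):
                gs = 0.25 * (T[i + 1, j] + T[i - 1, j]
                             + T[i, j + 1] + T[i, j - 1])
                T[i, j] += beta * (gs - T[i, j])   # beta = 1: Gauss-Seidel
    return T

n = 20
x = np.linspace(0.0, 1.0, n)
T = np.zeros((n, n))
T[0, :], T[-1, :] = 0.0, 1.0        # Dirichlet data consistent with T = x
T[:, 0], T[:, -1] = x, x
T = sor_laplace(T, beta=1.8, sweeps=200)
```

With β near the optimum the iteration converges in far fewer sweeps than plain Gauss – Seidel (β = 1), which is the point of the relaxation parameter.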
As an example, cover the domain with triangles and define a trial function on each triangle. The trial function takes the value 1.0 at one corner and 0.0 at the other corners, and is linear in between (see Fig. 31). These trial functions on each triangle are pieced together to give a trial function on the whole domain. For the heat conduction problem the method gives the element equations below [3]. The remaining shape function coefficients are

b_I = y_J − y_K

c_I = x_K − x_J

plus permutation of I, J, K, and

2Δ = det |1 x_I y_I; 1 x_J y_J; 1 x_K y_K| = 2 × (area of triangle)

a_I + a_J + a_K = 2Δ (so that N_I + N_J + N_K = 1)

b_I + b_J + b_K = 0

c_I + c_J + c_K = 0

A^e_IJ = −(k/4Δ)(b_I b_J + c_I c_J)

F^e_I = (Q/2)(a_I + b_I x̄ + c_I ȳ) = QΔ/3

x̄ = (x_I + x_J + x_K)/3,  ȳ = (y_I + y_J + y_K)/3

a_I + b_I x̄ + c_I ȳ = (2/3)Δ

The trial functions in the finite element method are not limited to linear ones. Quadratic functions, and even higher order functions, are frequently used. The same considerations hold as for boundary value problems: the higher order trial functions converge faster but require more work. It is possible to refine both the mesh (h) and the power of the polynomial in the trial function (p) in an hp method.

Figure 31. Finite element trial function: linear polynomials on triangles
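The shape-function coefficients and their identities can be checked on a concrete triangle; the vertex coordinates are arbitrary demo values:

```python
import numpy as np

def linear_triangle(xy):
    """Coefficients a_I, b_I, c_I and area factor 2*Delta for linear shape
    functions N_I = (a_I + b_I x + c_I y) / (2*Delta) on a triangle with
    vertices xy[(I, J, K), :] in counterclockwise order."""
    (xI, yI), (xJ, yJ), (xK, yK) = xy
    a = np.array([xJ * yK - xK * yJ, xK * yI - xI * yK, xI * yJ - xJ * yI])
    b = np.array([yJ - yK, yK - yI, yI - yJ])
    c = np.array([xK - xJ, xI - xK, xJ - xI])
    two_delta = np.linalg.det(np.column_stack([np.ones(3), xy]))
    return a, b, c, two_delta

xy = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]])   # area = 1
a, b, c, two_delta = linear_triangle(xy)

def N(x, y):
    """Evaluate the three shape functions at (x, y)."""
    return (a + b * x + c * y) / two_delta
```

Each N_I equals 1 at its own vertex and 0 at the other two, and the three functions sum to 1 everywhere, which is the interpolation property the element equations rely on.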

Σ_e Σ_J A^e_IJ T_J = Σ_e F^e_I   (41)

where

A^e_IJ = −∫ k ∇N_I · ∇N_J dA − ∫_{C₃} h₃ N_I N_J dC

F^e_I = ∫ N_I Q dA + ∫_{C₂} N_I q₂ dC − ∫_{C₃} N_I h₃ T₃ dC

Also, a necessary condition is that

T_i = T₁ on C₁

In these equations I and J refer to the nodes of the triangle forming element e and the summation is made over all elements.

Some problems have constraints on some of the variables. If singularities exist in the solution, it is possible to include them in the basis functions and solve for the difference between the total solution and the singular function [126 – 129].

When applying the Galerkin finite element method, one must choose both the shape of the element and the basis functions. In two dimensions, triangular elements are usually used because it is easy to cover complicated geometries and refine parts of the mesh. However, rectangular elements sometimes have advantages, particularly when some parts of the solution do not change very much and the elements can be long. In three dimensions the same considerations apply: tetrahedral elements are frequently used, but brick elements are also possible. While linear elements (in both two and three dimensions) are usually used, higher accuracy can be obtained by using quadratic or cubic basis functions within the element. The reason is that all methods converge according to the mesh size to some power, and the power is larger when higher order elements are used. If the solution is discontinuous, or has discontinuous first derivatives, then the lowest order basis functions are used because the convergence is limited by the properties of the solution, not the finite element approximation.

One nice feature of the finite element method is the use of natural boundary conditions. In this problem the natural boundary conditions are the Neumann or Robin conditions. When using Equation 41, the problem can be solved on a domain that is shorter than needed to reach some limiting condition (such as at an outflow boundary). The externally applied flux is still applied at the shorter domain, and the solution inside the truncated domain is still valid. Examples are given in [107] and [131]. The effect of this is to allow solutions in domains that are smaller, thus saving computation time and permitting the solution in semi-infinite domains.

8.5. Parabolic Equations in Two or Three Dimensions

Computations become much more lengthy with two or more spatial dimensions, for example, the unsteady heat conduction equation

ρC_p ∂T/∂t = k (∂²T/∂x² + ∂²T/∂y²) − Q

or the unsteady diffusion equation

∂c/∂t = D (∂²c/∂x² + ∂²c/∂y²) − R(c)

In the finite difference method an explicit technique would evaluate the right-hand side at the n-th time level:

ρC_p (T_{i,j}^{n+1} − T_{i,j}^n)/Δt = (k/Δx²)(T_{i+1,j}^n − 2T_{i,j}^n + T_{i−1,j}^n) + (k/Δy²)(T_{i,j+1}^n − 2T_{i,j}^n + T_{i,j−1}^n) − Q

When Q = 0 and Δx = Δy, the time step is limited by

Δt ≤ Δx² ρC_p/4k or Δt ≤ Δx²/4D

These time steps are smaller than for one-dimensional problems. For three dimensions, the limit is

Δt ≤ Δx²/6D

To avoid such small time steps, which must be smaller when Δx decreases, an implicit method could be used. This leads to large sparse matrices, rather than convenient tridiagonal matrices.

8.6. Special Methods for Fluid Mechanics

The method of operator splitting is also useful when different terms in the equation are best evaluated by using different methods or as a technique for reducing a larger problem to a series of smaller problems. Here the method is illustrated by using the Navier – Stokes equations. In vector notation the equations are

ρ ∂u/∂t + ρ u·∇u = ρf − ∇p + μ∇²u

The equation is solved in the following steps:

ρ (u* − u^n)/Δt = −ρ u^n·∇u^n + ρf + μ∇²u^n

∇²p^{n+1} = (ρ/Δt) ∇·u*

ρ (u^{n+1} − u*)/Δt = −∇p^{n+1}

This can be done by using the finite difference [132, p. 162] or the finite element method [133 – 135].

In fluid flow problems solved with the finite element method, the basis functions for pressure and velocity are often different. This is required by the LBB condition (named after Ladyshenskaya, Brezzi, and Babuska) [134, 135]. Sometimes a discontinuous basis function is used for pressure to meet this condition. Other times a penalty term is added, or the quadrature is done using a small number of quadrature points. Thus, one has to be careful how to apply the finite element method to the Navier – Stokes equations. Fortunately, software exists that has taken this into account.

Level Set Methods. Multiphase problems are complicated because the terms in the equations depend on which phase exists at a particular point, and the phase boundary may move or be unknown. It is desirable to compute on a fixed grid, and the level set formulation allows this. Consider a curved line in a two-dimensional problem or a curved surface in a three-dimensional problem. One defines a level set function φ, which is the signed distance function giving the distance from some point to the closest point on the line or surface. It is defined to be negative on one side of the interface and positive on the other. Then the curve

φ(x, y, z) = 0

represents the location of the interface. For example, in flow problems the level set function is defined as the solution to

∂φ/∂t + u·∇φ = 0

The physics governing the velocity of the interface must be defined, and this equation is solved along with the other equations representing the problem [130 – 137].

Lattice Boltzmann Methods. Another way to solve fluid flow problems is based on a molecular viewpoint and is called the lattice Boltzmann method [138 – 141]. The treatment here follows [142]. A lattice is defined, and one solves the following equation for f_i(x, t), the probability of finding a molecule at the point x with speed c_i:

∂f_i/∂t + c_i·∇f_i = −(f_i − f_i^eq)/τ

The right-hand side represents a single time relaxation for molecular collisions, and τ is related to the kinematic viscosity. By means of a simple stepping algorithm, the computational algorithm is

f_i(x + c_iΔt, t + Δt) = f_i(x, t) − (Δt/τ)(f_i − f_i^eq)

Consider the lattice shown in Figure 32. The various velocities are

c₁ = (1, 0), c₃ = (0, 1), c₅ = (−1, 0), c₇ = (0, −1)
c₂ = (1, 1), c₄ = (−1, 1), c₆ = (−1, −1), c₈ = (1, −1)
c₀ = (0, 0)

The density, velocity, and shear stress (for some k) are given by

ρ = Σ_i f_i(x, t),  ρu = Σ_i c_i f_i(x, t)

τ = k Σ_i c_i c_i [f_i(x, t) − f_i^eq(x, t)]

For this formulation, the kinematic viscosity and speed of sound are given by

ν = (1/3)(τ − 1/2),  c_s = 1/√3

The equilibrium distribution is

f_i^eq = ρ w_i [1 + c_i·u/c_s² + (c_i·u)²/(2c_s⁴) − u·u/(2c_s²)]

where the weighting functions are

w₀ = 4/9,  w₁ = w₃ = w₅ = w₇ = 1/9,  w₂ = w₄ = w₆ = w₈ = 1/36

With these conditions, the solution for velocity is a solution of the Navier – Stokes equation. These equations lead to a large computational problem, but it can be solved by parallel processing on multiple computers.

Figure 32. [The two-dimensional lattice and velocity directions for the lattice Boltzmann method]
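The moment identities Σ_i f_i^eq = ρ and Σ_i c_i f_i^eq = ρu can be checked directly from the weights and velocities above; the density and velocity below are arbitrary demo values:

```python
import numpy as np

# D2Q9 lattice velocities (index 0 is the rest particle) and weights
c = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [-1, 1],
              [-1, 0], [-1, -1], [0, -1], [1, -1]], dtype=float)
w = np.array([4 / 9, 1 / 9, 1 / 36, 1 / 9, 1 / 36,
              1 / 9, 1 / 36, 1 / 9, 1 / 36])
cs2 = 1.0 / 3.0                                 # speed of sound squared

def f_eq(rho, u):
    """Equilibrium distribution
    f_i^eq = rho w_i [1 + c.u/cs^2 + (c.u)^2/(2 cs^4) - u.u/(2 cs^2)]."""
    cu = c @ u
    return rho * w * (1.0 + cu / cs2 + cu**2 / (2.0 * cs2**2)
                      - (u @ u) / (2.0 * cs2))

rho, u = 1.2, np.array([0.05, -0.02])
f = f_eq(rho, u)
```

Because the weights satisfy the lattice moment conditions, the collision step conserves mass and momentum regardless of the local velocity.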

8.7. Computer Software

A variety of general-purpose computer programs are available commercially. Mathematica (http://www.wolfram.com/), Maple (http://www.maplesoft.com/), and Mathcad (http://www.mathcad.com/) all have the capability of doing symbolic manipulation so that algebraic solutions can be obtained. For example, Mathematica can solve some ordinary and partial differential equations analytically; Maple can make simple graphs and do linear algebra and simple computations, and Mathcad can do simple calculations. In this section, examples are given for the use of Matlab (http://www.mathworks.com/), which is a package of numerical analysis tools, some of which are accessed by simple commands, and some of which are accessed by writing programs in C. Spreadsheets can also be used to solve simple problems. A popular program used in chemical engineering education is Polymath (http://www.polymath-software.com/), which can numerically solve sets of linear or nonlinear equations, ordinary differential equations as initial value problems, and perform data analysis and regression.

The mathematical methods used to solve partial differential equations are described in more detail in [143 – 148]. Since many computer programs are available without cost, consider the following decision points. The first decision is whether to use an approximate, engineering flow model, developed from correlations, or to solve the partial differential equations that govern the problem. Correlations are quick and easy to apply, but they may not be appropriate to your problem, or give the needed detail. When using a computer package to solve partial differential equations, the first task is always to generate a mesh covering the problem domain. This is not a trivial task, and special methods have been developed to permit importation of a geometry from a computer-aided design program. Then, the mesh must be created automatically. If the boundary is irregular, the finite element method is especially well-suited, although special embedding techniques can be used in finite difference methods (which are designed to be solved on rectangular meshes). Another capability to consider is the ability to track free surfaces that move during the computation. This phenomenon introduces the same complexity that occurs in problems with a large Peclet number, with the added difficulty that the free surface moves between mesh points, and improper representation can lead to unphysical oscillations. The method used to solve the equations is important, and both explicit and implicit methods (as described above) can be used. Implicit methods may introduce unacceptable extra diffusion, so the engineer needs to examine the solution carefully. The methods used to smooth unphysical oscillations from node to node are also important, and the engineer needs to verify that the added diffusion or smoothing does not give inaccurate solutions. Since current-day problems are mostly nonlinear, convergence is always an issue since the problems are solved iteratively. Robust programs provide several methods for convergence, each of which is best in some circumstance or other. It is wise to have a program that includes many iterative methods. If the iterative solver is not very robust, the only recourse to solving a steady-state problem may be to integrate the time-dependent problem to steady state. The solution time may be long, and the final result may be further from convergence than would be the case if a robust iterative solver were used.

A variety of computer programs is available on the internet, some of them free. First consider general-purpose programs. On the NIST web page, http://gams.nist.gov/, choose “problem decision tree”, and then “differential and integral equations”, then “partial differential equations”. The programs are organized by type of problem (elliptic, parabolic, and hyperbolic) and by the number of spatial dimensions (one or more than one). On the Netlib web site, http://www.netlib.org/, search on “partial differential equation”. The website http://software.sandia.gov has a variety of programs available. Lau [141, 145] provides many programs in C++ (also see http://www.nr.com/). The multiphysics program Comsol Multiphysics (formerly FEMLAB) also solves many standard equations arising in mathematical physics.

Computational fluid dynamics (CFD) (→ Computational Fluid Dynamics) programs are more specialized, and most of them have been designed to solve sets of equations that are appropriate to specific industries. They can then include approximations and correlations for some features that would be difficult to solve for directly. Four widely used major packages are Fluent (http://www.fluent.com/), CFX (now part of ANSYS), Comsol Multiphysics (formerly FEMLAB) (http://www.comsol.com/), and ANSYS (http://www.ansys.com/). Of these, Comsol Multiphysics is particularly useful because it has a convenient graphical user interface, permits easy mesh generation and refinement (including adaptive mesh refinement), allows the user to add in phenomena and additional equations easily, permits solution by continuation methods (thus enhancing convergence), and has extensive graphical output capabilities. Other packages are also available (see http://cfd-online.com/), and these may contain features and correlations specific to the engineer's industry. One important point to note is that for turbulent flow, all the programs contain approximations, using the k-epsilon models of turbulence, or large eddy simulations; the direct numerical simulation of turbulence is too slow to apply to very big problems, although it does give insight (independent of any approximations) that is useful for interpreting turbulent phenomena. Thus, the method used to include those turbulent correlations is important, and the method also may affect convergence or accuracy.

9. Integral Equations [149 – 155]

If the dependent variable appears under an integral sign an equation is called an integral equation; if derivatives of the dependent variable appear elsewhere in the equation it is called an integrodifferential equation. This chapter describes the various classes of equations, gives information concerning Green's functions, and presents numerical methods for solving integral equations.

9.1. Classification

Volterra integral equations have an integral with a variable limit, whereas Fredholm integral equations have a fixed limit. Volterra equations are usually associated with initial value or evolutionary problems, whereas Fredholm equations are analogous to boundary value problems. The terms in the integral can be unbounded, but still yield bounded integrals, and these equations are said to be weakly singular. A Volterra equation of the second kind is

y(t) = g(t) + λ ∫ₐᵗ K(t, s) y(s) ds   (42)

whereas a Volterra equation of the first kind is

g(t) = λ ∫ₐᵗ K(t, s) y(s) ds

Equations of the first kind are very sensitive to solution errors so that they present severe numerical problems.

An example of a problem giving rise to a Volterra equation of the second kind is the following heat conduction problem:

ρC_p ∂T/∂t = k ∂²T/∂x²,  0 ≤ x < ∞, t > 0
T(x, 0) = 0,  (∂T/∂x)(0, t) = −g(t)
lim_{x→∞} T(x, t) = 0,  lim_{x→∞} ∂T/∂x = 0

If this is solved by using Fourier transforms the solution is

T(x, t) = (1/√π) ∫₀ᵗ g(s) (1/√(t − s)) e^{−x²/4(t−s)} ds

Suppose the problem is generalized so that the boundary condition is one involving the solution T, which might occur with a radiation boundary condition or heat-transfer coefficient. Then the boundary condition is written as

∂T/∂x = −G(T, t),  x = 0, t > 0

The solution to this problem is

T(x, t) = (1/√π) ∫₀ᵗ G(T(0, s), s) (1/√(t − s)) e^{−x²/4(t−s)} ds

If T*(t) is used to represent T(0, t), then

T*(t) = (1/√π) ∫₀ᵗ G(T*(s), s) (1/√(t − s)) ds

Thus the behavior of the solution at the boundary is governed by an integral equation. Nagel and Kluge [156] use a similar approach to solve for adsorption in a porous catalyst. The existence and uniqueness of the solution can be proved [151, p. 30, 32].

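The kernel 1/√(t−s) in the equation for T*(t) is weakly singular; the singularity is integrable, and the substitution s = t − u² removes it entirely, after which ordinary quadrature applies. A minimal sketch in Python (the choice g(s) = s is an illustrative assumption, picked because the surface integral then has the closed form (4/3) t^{3/2}/√π for comparison):

```python
import math

def surface_temp(g, t, n=200):
    """Evaluate T(0,t) = (1/sqrt(pi)) * Int_0^t g(s)/sqrt(t-s) ds.
    The substitution s = t - u**2 (ds = -2u du) cancels the weak
    singularity: the integral becomes Int_0^sqrt(t) 2*g(t - u**2) du,
    which has a smooth integrand suitable for the trapezoid rule."""
    umax = math.sqrt(t)
    h = umax / n
    total = 0.0
    for i in range(n + 1):
        u = i * h
        w = h / 2 if i in (0, n) else h   # trapezoid weights
        total += w * 2.0 * g(t - u * u)
    return total / math.sqrt(math.pi)

# g(s) = s gives T(0,t) = (4/3)*t**1.5/sqrt(pi) in closed form
approx = surface_temp(lambda s: s, t=1.0)
exact = 4.0 / (3.0 * math.sqrt(math.pi))
print(abs(approx - exact))   # small O(h^2) quadrature error
```

The same substitution is the standard first step when the marching methods of Section 9.2 are applied to weakly singular Volterra kernels.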
Sometimes the kernel is of the form

K(t,s) = K(t−s)

Equations of this form are called convolution equations and can be solved by taking the Laplace transform. For the integral equation

Y(t) = G(t) + λ ∫_0^t K(t−τ) Y(τ) dτ

K(t)∗Y(t) ≡ ∫_0^t K(t−τ) Y(τ) dτ

the Laplace transform is

y(s) = g(s) + λ k(s) y(s)

k(s) y(s) = L[K(t)∗Y(t)]

Solving this for y(s) gives

y(s) = g(s) / [1 − λ k(s)]

If the inverse transform can be found, the integral equation is solved.

A Fredholm equation of the second kind is

y(x) = g(x) + λ ∫_a^b K(x,s) y(s) ds   (43)

whereas a Fredholm equation of the first kind is

∫_a^b K(x,s) y(s) ds = g(x)

The limits of integration are fixed, and these problems are analogous to boundary value problems. An eigenvalue problem is a homogeneous equation of the second kind:

y(x) = λ ∫_a^b K(x,s) y(s) ds   (44)

Solutions to this problem occur only for specific values of λ, the eigenvalues. Usually the Fredholm equation of the second or first kind is solved for values of λ different from these, which are called regular values.

Nonlinear Volterra equations arise naturally from initial value problems. For the initial value problem

dy/dt = F(t, y(t))

both sides can be integrated from 0 to t to obtain

y(t) = y(0) + ∫_0^t F(s, y(s)) ds

which is a nonlinear Volterra equation. The general nonlinear Volterra equation is

y(t) = g(t) + ∫_0^t K(t, s, y(s)) ds   (45)

Theorem [151, p. 55]. If g(t) is continuous and the kernel K(t,s,y) is continuous in all variables and satisfies a Lipschitz condition

|K(t,s,y) − K(t,s,z)| ≤ L |y − z|

then the nonlinear Volterra equation has a unique continuous solution.

A successive substitution method for its solution is

y_{n+1}(t) = g(t) + ∫_0^t K[t, s, y_n(s)] ds

Nonlinear Fredholm equations have special names. The equation

f(x) = ∫_0^1 K[x, y, f(y)] dy

is called the Urysohn equation [150, p. 208]. The special equation

f(x) = ∫_0^1 K[x, y] F[y, f(y)] dy

is called the Hammerstein equation [150, p. 209]. Iterative methods can be used to solve these equations, and these methods are closely tied to fixed point problems. A fixed point problem is

x = F(x)

and a successive substitution method is

x_{n+1} = F(x_n)

Local convergence theorems prove the process convergent if the solution is close enough to the answer, whereas global convergence theorems are valid for any initial guess [150, p. 229 – 231]. The successive substitution method for nonlinear Fredholm equations is

y_{n+1}(x) = ∫_0^1 K[x, s, y_n(s)] ds

Typical conditions for convergence include that the function satisfies a Lipschitz condition.

9.2. Numerical Methods for Volterra Equations of the Second Kind

Volterra equations of the second kind are analogous to initial value problems. An initial value problem can be written as a Volterra equation of the second kind, although not all Volterra equations can be written as initial value problems [151, p. 7]. Here the general nonlinear Volterra equation of the second kind is treated (Eq. 45). The simplest numerical method involves replacing the integral by a quadrature using the trapezoid rule:

y_n ≡ y(t_n) = g(t_n) + Δt [ (1/2) K(t_n, t_0, y_0) + Σ_{i=1}^{n−1} K(t_n, t_i, y_i) + (1/2) K(t_n, t_n, y_n) ]

This equation is a nonlinear algebraic equation for y_n. Since y_0 is known, it can be applied to solve for y_1, y_2, … in succession. For a single integral equation, at each step one must solve a single nonlinear algebraic equation for y_n. Typically, the error in the solution to the integral equation is proportional to Δt^µ, and the power µ is the same as the power in the quadrature error [151, p. 97].

The stability of the method [151, p. 111] can be examined by considering the equation

y(t) = 1 − λ ∫_0^t y(s) ds

whose solution is

y(t) = e^{−λt}

Since the integral equation can be differentiated to obtain the initial value problem

dy/dt = −λy,  y(0) = 1

the stability results are identical to those for initial value methods. In particular, using the trapezoid rule for integral equations is identical to using this rule for initial value problems. The method is A-stable.

Higher order integration methods can also be used [151, p. 114, 124]. When the kernel is infinite at certain points, i.e., when the problem has a weak singularity, see [151, p. 71, 151].

9.3. Numerical Methods for Fredholm, Urysohn, and Hammerstein Equations of the Second Kind

Whereas Volterra equations could be solved from one position to the next, like initial value differential equations, Fredholm equations must be solved over the entire domain, like boundary value differential equations. Thus, large sets of equations will be solved, and the notation is designed to emphasize that.

The methods are also based on quadrature formulas. For the integral

I(φ) = ∫_a^b φ(y) dy

a quadrature formula is written:

I(φ) = Σ_{i=0}^n w_i φ(y_i)

Then the integral Fredholm equation can be rewritten as

f(x) − λ Σ_{i=0}^n w_i K(x, y_i) f(y_i) = g(x),  a ≤ x ≤ b   (46)

If this equation is evaluated at the points x = y_j,

f(y_j) − λ Σ_{i=0}^n w_i K(y_j, y_i) f(y_i) = g(y_j)

is obtained, which is a set of linear equations to be solved for {f(y_j)}. The solution at any point is then given by Equation 46.

A common type of integral equation has a singular kernel along x = y. This can be transformed to a less severe singularity by writing

∫_a^b K(x,y) f(y) dy = ∫_a^b K(x,y) [f(y) − f(x)] dy + ∫_a^b K(x,y) f(x) dy = ∫_a^b K(x,y) [f(y) − f(x)] dy + f(x) H(x)

where

H(x) = ∫_a^b K(x,y) dy

is a known function. The integral equation is then replaced by

f(x) = g(x) + λ { Σ_{i=0}^n w_i K(x, y_i) [f(y_i) − f(x)] + f(x) H(x) }

Collocation methods can be applied as well [149, p. 396]. To solve integral Equation 43, expand f in the function

f = Σ_{i=0}^n a_i φ_i(x)

Substitute f into the equation to form the residual

Σ_{i=0}^n a_i φ_i(x) − λ Σ_{i=0}^n a_i ∫_a^b K(x,y) φ_i(y) dy = g(x)

Evaluate the residual at the collocation points:

Σ_{i=0}^n a_i φ_i(x_j) − λ Σ_{i=0}^n a_i ∫_a^b K(x_j, y) φ_i(y) dy = g(x_j)

The expansion can be in piecewise polynomials, leading to a collocation finite element method, or in global polynomials, leading to a global approximation. If orthogonal polynomials are used, the quadratures can make use of the accurate Gaussian quadrature points to calculate the integrals. Galerkin methods are also possible [149, p. 406]. Mills et al. [157] consider reaction – diffusion problems and say the choice of technique cannot be made in general because it is highly dependent on the kernel.

When the integral equation is nonlinear, iterative methods must be used to solve it. Convergence proofs are available, based on Banach's contractive mapping principle. Consider the Urysohn equation, with g(x) = 0 without loss of generality:

f(x) = ∫_a^b F[x, y, f(y)] dy

The kernel satisfies the Lipschitz condition

max_{a≤x,y≤b} |F[x, y, u] − F[x, y, v]| ≤ K |u − v|

Theorem [150, p. 214]. If the constant K is < 1 and certain other conditions hold, the successive substitution method

f_{n+1}(x) = ∫_a^b F[x, y, f_n(y)] dy,  n = 0, 1, …

converges to the solution of the integral equations.

9.4. Numerical Methods for Eigenvalue Problems

Eigenvalue problems are treated similarly to Fredholm equations, except that the final equation is a matrix eigenvalue problem instead of a set of simultaneous equations. For example,

Σ_{i=0}^n w_i K(y_j, y_i) f(y_i) = λ f(y_j),  j = 0, 1, …, n

leads to the matrix eigenvalue problem

K D f = λ f

where D is a diagonal matrix with D_ii = w_i.

9.5. Green's Functions [158 – 160]

Integral equations can arise from the formulation of a problem by using Green's function. For example, the equation governing heat conduction with a variable heat generation rate is represented in differential form as

d²T/dx² = Q(x)/k,  T(0) = T(1) = 0

In integral form the same problem is [149, pp. 57 – 60]

T(x) = (1/k) ∫_0^1 G(x,y) Q(y) dy

G(x,y) = −x(1−y),  x ≤ y
G(x,y) = −y(1−x),  y ≤ x

Green's functions for typical operators are given below.

For the Poisson equation with solution decaying to zero at infinity

∇²ψ = −4πρ

the formulation as an integral equation is

ψ(r) = ∫_V ρ(r_0) G(r, r_0) dV_0

where Green's function is [50, p. 891]

G(r, r_0) = 1/r  in three dimensions
G(r, r_0) = −2 ln r  in two dimensions

where

r = √[(x−x_0)² + (y−y_0)² + (z−z_0)²]  in three dimensions
r = √[(x−x_0)² + (y−y_0)²]  in two dimensions
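The trapezoid-rule marching scheme of Section 9.2 can be sketched on the stability test equation y(t) = 1 − λ ∫_0^t y(s) ds used there (exact solution e^{−λt}). Because this integrand is linear in y, the nonlinear algebraic equation at each step can be solved for y_n in closed form instead of by iteration:

```python
import math

def solve_volterra_trapezoid(lam, t_end, n):
    """March y(t) = 1 - lam * Int_0^t y(s) ds with the composite
    trapezoid rule.  At step k the trapezoid equation
        y_k = 1 - lam*dt*(y_0/2 + y_1 + ... + y_{k-1} + y_k/2)
    is linear in y_k and is solved explicitly."""
    dt = t_end / n
    y = [1.0]                       # y(0) = 1
    for k in range(1, n + 1):
        interior = 0.5 * y[0] + sum(y[1:k])
        y_k = (1.0 - lam * dt * interior) / (1.0 + 0.5 * lam * dt)
        y.append(y_k)
    return y

y = solve_volterra_trapezoid(lam=2.0, t_end=1.0, n=200)
print(abs(y[-1] - math.exp(-2.0)))   # error is O(dt^2), as stated in Section 9.2
```

For a nonlinear kernel the closed-form step is replaced by a scalar Newton iteration for y_n, but the marching structure is identical.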

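The quadrature (Nyström) discretization of Section 9.3 can likewise be sketched for Equation 43 with trapezoid weights. The kernel below is an assumed test case chosen so the exact answer is known (K(x,s) = x·s, g(x) = x, λ = 1 on [0,1] gives y(x) = 3x/2), and the small dense system (I − λWK) y = g is solved with a hand-rolled Gaussian elimination so the sketch stays dependency-free:

```python
def gauss_solve(A, b):
    """Naive Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fredholm_nystrom(K, g, lam, a, b, n):
    """Solve y(x) = g(x) + lam*Int_a^b K(x,s) y(s) ds at n+1 trapezoid
    nodes, i.e., Equation 46 evaluated at the quadrature points."""
    h = (b - a) / n
    nodes = [a + i * h for i in range(n + 1)]
    w = [h] * (n + 1)
    w[0] = w[-1] = h / 2
    A = [[(1.0 if i == j else 0.0) - lam * w[j] * K(x_i, nodes[j])
          for j in range(n + 1)] for i, x_i in enumerate(nodes)]
    return nodes, gauss_solve(A, [g(x) for x in nodes])

nodes, y = fredholm_nystrom(K=lambda x, s: x * s, g=lambda x: x,
                            lam=1.0, a=0.0, b=1.0, n=50)
# exact solution of this test case is y(x) = 1.5*x
print(max(abs(y_i - 1.5 * x_i) for x_i, y_i in zip(nodes, y)))
```

Replacing the trapezoid weights with Gaussian weights raises the quadrature order, exactly as noted for the collocation variants above.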
For the problem

∂c/∂t = D ∇²c,  c = φ(x,y,z,t) on S, t > 0

with initial data f, the solution can be represented with a Green's function u satisfying ∂u/∂t = D ∇²u with u = 0 on S [44, p. 353]:

c = ∫ (u)_{τ=0} f(x,y,z) dx dy dz + D ∫_0^t ∫ φ(x,y,z,τ) (∂u/∂n) dS dτ

When the problem is two dimensional,

u = [1/4πD(t−τ)] e^{−[(x−x_0)² + (y−y_0)²]/4D(t−τ)}

c = ∫ (u)_{τ=0} f(x,y) dx dy + D ∫_0^t ∫ φ(x,y,τ) (∂u/∂n) dC dτ

Sometimes the kernel has the separable form

K(x,y) = u(x) v(y),  0 ≤ y ≤ x
K(x,y) = u(y) v(x),  x ≤ y ≤ 1

Then the problem

f(x) = ∫_0^1 K(x,y) F[y, f(y)] dy

may be written as

f(x) − ∫_0^x [u(x) v(y) − u(y) v(x)] F[y, f(y)] dy = α v(x)

where

α = ∫_0^1 u(y) F[y, f(y)] dy

Thus, the problem ends up as one directly formulated as a fixed point problem:

f = Φ(f)

For the following differential equation and boundary conditions

(1/x^{a−1}) d/dx [ x^{a−1} dc/dx ] = f[x, c(x)]

dc/dx (0) = 0,  (2/Sh) dc/dx (1) + c(1) = g

where Sh is the Sherwood number, the problem can be written as a Hammerstein integral equation:

c(x) = g − ∫_0^1 G(x, y, Sh) f[y, c(y)] y^{a−1} dy

When the problem is the diffusion – reaction one, the fixed-point form is

c(x) = g − ∫_0^x [u(x) v(y) − u(y) v(x)] f[y, c(y)] y^{a−1} dy − α v(x)

α = ∫_0^1 u(y) f[y, c(y)] y^{a−1} dy

Green's functions for the differential operators are given in [163]; for a = 1,

G(x, y, Sh) = 1 + 2/Sh − x,  y ≤ x
G(x, y, Sh) = 1 + 2/Sh − y,  x ≤ y

Dixit and Taularidis [164] solved problems involving Fischer – Tropsch synthesis reactions in a catalyst pellet using a similar method.
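The successive substitution scheme of Section 9.3 applies directly to the Hammerstein form above. A sketch for the planar case a = 1, with the Green's function G(x,y,Sh) = 1 + 2/Sh − max(x,y) and the illustrative linear rate f[y,c] = φ²c; φ, Sh, and g are assumed example values, chosen because the exact solution c = g cosh(φx)/[cosh φ + (2/Sh) φ sinh φ] is then available for comparison:

```python
import math

def solve_reaction_diffusion(phi2, Sh, g, n=100, tol=1e-10):
    """Successive substitution on
        c(x) = g - Int_0^1 G(x,y,Sh) * phi2 * c(y) dy   (a = 1)
    with trapezoid quadrature on a uniform grid."""
    h = 1.0 / n
    x = [i * h for i in range(n + 1)]
    w = [h] * (n + 1)
    w[0] = w[-1] = h / 2                       # trapezoid weights
    G = [[1.0 + 2.0 / Sh - max(xi, yj) for yj in x] for xi in x]
    c = [g] * (n + 1)                          # initial guess
    for _ in range(500):
        c_new = [g - sum(w[j] * G[i][j] * phi2 * c[j]
                         for j in range(n + 1)) for i in range(n + 1)]
        if max(abs(u - v) for u, v in zip(c_new, c)) < tol:
            return x, c_new
        c = c_new
    return x, c

x, c = solve_reaction_diffusion(phi2=1.0, Sh=10.0, g=1.0)
A = 1.0 / (math.cosh(1.0) + 0.2 * math.sinh(1.0))   # exact c(0)
print(abs(c[0] - A))
```

For a nonlinear rate f[y,c] the same loop applies unchanged, provided the contraction condition of the convergence theorem above holds.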

9.6. Boundary Integral Equations and Boundary Element Method

The boundary element method utilizes Green's theorem and integral equations. Here, the method is described briefly for the following boundary value problem in two or three dimensions:

∇²φ = 0,  φ = f_1 on S_1,  ∂φ/∂n = f_2 on S_2

Green's theorem (see page 46) says that for any functions sufficiently smooth

∫_V (φ ∇²ψ − ψ ∇²φ) dV = ∫_S (φ ∂ψ/∂n − ψ ∂φ/∂n) dS

Suppose the function ψ satisfies

∇²ψ = 0

In two and three dimensions, such a function is

ψ = ln r,  r = √[(x−x_0)² + (y−y_0)²]  in two dimensions
ψ = 1/r,  r = √[(x−x_0)² + (y−y_0)² + (z−z_0)²]  in three dimensions

where {x_0, y_0} or {x_0, y_0, z_0} is a point in the domain. The solution φ also satisfies ∇²φ = 0, so that

∫_S (φ ∂ψ/∂n − ψ ∂φ/∂n) dS = 0

Consider the two-dimensional case. Since the function ψ is singular at a point, the integrals must be carefully evaluated. For the region shown in Figure 33, the domain is S = S_1 + S_2; a small circle of radius r_0 is placed around the point P at x_0, y_0. Then the full integral is

∫_S (φ ∂ln r/∂n − ln r ∂φ/∂n) dS + ∫_{θ=0}^{2π} (φ ∂ln r_0/∂n − ln r_0 ∂φ/∂n) r_0 dθ = 0

Figure 33. Domain with singularity at P

As r_0 approaches 0,

lim_{r_0→0} r_0 ln r_0 = 0

and

lim_{r_0→0} ∫_{θ=0}^{2π} φ (∂ln r_0/∂n) r_0 dθ = −2π φ(P)

Thus for an internal point,

φ(P) = (1/2π) ∫_S (φ ∂ln r/∂n − ln r ∂φ/∂n) dS   (47)

If P is on the boundary, the result is [165, p. 464]

φ(P) = (1/π) ∫_S (φ ∂ln r/∂n − ln r ∂φ/∂n) dS

Putting in the boundary conditions gives

π φ(P) = ∫_{S_1} (f_1 ∂ln r/∂n − ln r ∂φ/∂n) dS + ∫_{S_2} (φ ∂ln r/∂n − f_2 ln r) dS   (48)

This is an integral equation for φ on the boundary. Note that the order is one less than the original differential equation. However, the integral equation leads to matrices that are dense rather than banded or sparse, so some of the advantage of lower dimension is lost. Once this integral equation (Eq. 48) is solved to find φ on the boundary, it can be substituted in Equation 47 to find φ anywhere in the domain.

In the boundary finite element method, both the function and its normal derivative along the boundary are approximated:

φ = Σ_{j=1}^N φ_j N_j(ξ),  ∂φ/∂n = Σ_{j=1}^N (∂φ/∂n)_j N_j(ξ)

One choice of trial functions can be the piecewise constant functions shown in Figure 34. The integral equation then becomes

π φ_i = Σ_{j=1}^N [ φ_j ∫_{s_j} (∂ln r_i/∂n) dS − (∂φ/∂n)_j ∫_{s_j} ln r_i dS ]

Figure 34. Trial function on boundary for boundary finite element method

The function φ_j is of course known along S_1, whereas the derivative ∂φ_j/∂n is known along S_2. This set of equations is then solved for φ_i and ∂φ_i/∂n along the boundary. This constitutes the boundary integral method applied to the Laplace equation.

If the problem is Poisson's equation

∇²φ = g(x,y)

Green's theorem gives

∫_S (φ ∂ln r/∂n − ln r ∂φ/∂n) dS + ∫_A g ln r dA = 0

Thus, for an internal point,

2π φ(P) = ∫_S (φ ∂ln r/∂n − ln r ∂φ/∂n) dS + ∫_A g ln r dA   (49)

and for a boundary point,

π φ(P) = ∫_S (φ ∂ln r/∂n − ln r ∂φ/∂n) dS + ∫_A g ln r dA   (50)

If the region is nonhomogeneous this method can be used [165, p. 475], and it has been applied to heat conduction by Hsieh and Shang [166]. The finite element method can be applied in one region and the boundary finite element method in another region, with appropriate matching conditions [165, p. 478]. If the problem is nonlinear, then it is more difficult. For example, consider an equation such as Poisson's in which the function depends on the solution as well:

∇²φ = g(x,y,φ)

Then the integral appearing in Equation 50 must be evaluated over the entire domain, and the solution in the interior is given by Equation 49. For further applications, see [167] and [168].

10. Optimization

We provide a survey of systematic methods for a broad variety of optimization problems. The survey begins with a general classification of mathematical optimization problems involving continuous and discrete (or integer) variables. This is followed by a review of solution methods of the major types of optimization problems for continuous and discrete variable optimization, particularly nonlinear and mixed-integer nonlinear programming. In addition, we discuss direct search methods that do not require derivative information as well as global optimization methods. We also review extensions of these methods for the optimization of systems described by differential and algebraic equations.

10.1. Introduction

Optimization is a key enabling tool for decision making in chemical engineering [306]. It has evolved from a methodology of academic interest into a technology that continues to make significant impact in engineering research and practice. Optimization algorithms form the core tools for a) experimental design, parameter estimation, model development, and statistical analysis; b) process synthesis, analysis, design, and retrofit; c) model predictive control and real-time optimization; and d) planning, scheduling, and the integration of process operations into the supply chain [307, 308].

As shown in Figure 35, optimization problems that arise in chemical engineering can be classified in terms of continuous and discrete variables. For the former, nonlinear programming (NLP) problems form the most general case, and widely applied specializations include linear programming (LP) and quadratic programming (QP). An important distinction for NLP is whether the optimization problem is convex or nonconvex. The latter NLP problem may have multiple local optima, and an important question is whether a global solution is required for the NLP. Another important distinction is whether the problem is assumed to be differentiable or not.

Figure 35. Classes of optimization problems and algorithms

Mixed-integer problems also include discrete variables. These can be written as mixed-integer nonlinear programs (MINLP) or as mixed-integer linear programs (MILP) if all variables appear linearly in the constraint and objective functions. For the latter an important case occurs when all the variables are integer; this gives rise to an integer programming (IP) problem. IPs can be further classified into many special problems (e.g., assignment, traveling salesman, etc.), which are not shown in Figure 35. Similarly, the MINLP problem also gives rise to special problem classes, although here the main distinction is whether its relaxation is convex or nonconvex.

The ingredients of formulating optimization problems include a mathematical model of the system, an objective function that quantifies a criterion to be extremized, variables that can serve as decisions, and, optionally, inequality constraints on the system. When represented in algebraic form, the general formulation of discrete/continuous optimization problems can be written as the following mixed-integer optimization problem:

Min f(x,y)
s.t. h(x,y) = 0   (51)
g(x,y) ≤ 0
x ∈ R^n, y ∈ {0,1}^t

where f(x,y) is the objective function (e.g., cost, energy consumption, etc.), h(x,y) = 0 are the equations that describe the performance of the system (e.g., material balances, production rates), the inequality constraints g(x,y) ≤ 0 can define process specifications or constraints for feasible plans and schedules, and s.t. denotes subject to. Note that the operator Max f(x) is equivalent to Min −f(x). We define the real n-vector x to represent the continuous variables while the t-vector y represents the discrete variables, which, without loss of generality, are often restricted to take 0–1 values to define logical or discrete decisions, such as assignment of equipment and sequencing of tasks. (These variables can also be formulated to take on other integer values as well.) Problem 51 corresponds to a mixed-integer nonlinear program (MINLP) when any of the functions involved are nonlinear. If all functions are linear it corresponds to a mixed-integer linear program (MILP). If there are no 0–1 variables, then problem 51 reduces to a nonlinear program 52 or linear program 65, depending on whether or not the functions are linear.

We first start with continuous variable optimization and consider in the next section the solution of NLPs with differentiable objective and constraint functions. If only local solutions are required for the NLP, then very efficient large-scale methods can be considered. This is followed by methods that are not based on local optimality criteria; we consider direct search optimization methods that do not require derivatives, as well as deterministic global optimization methods. Following this, we consider the solution of mixed-integer problems and outline the main characteristics of algorithms for their solution. Finally, we conclude with a discussion of optimization modeling software and its implementation in engineering models.

10.2. Gradient-Based Nonlinear Programming

For continuous variable optimization we consider problem 51 without discrete variables y.

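Before turning to purely continuous problems, the mixed-integer formulation 51 can be made concrete with a toy model (an assumed example, not from the text): a single 0–1 decision y relaxes the upper bound on a continuous variable x at a fixed cost, Min (x − 2.5)² + 1.5y s.t. 0 ≤ x ≤ 1 + 3y, y ∈ {0,1}. With one binary variable the MINLP can be solved by enumerating y and minimizing the continuous part in closed form; practical codes replace this enumeration by branch and bound:

```python
def solve_continuous(y):
    """For fixed binary y, minimize (x - 2.5)**2 + 1.5*y over
    0 <= x <= 1 + 3*y: clamp the unconstrained minimizer x = 2.5
    to the feasible interval."""
    x = min(max(2.5, 0.0), 1.0 + 3.0 * y)
    return (x - 2.5) ** 2 + 1.5 * y, x

# enumerate the discrete variable; min() compares the objective first
f_best, x_best, y_best = min(solve_continuous(y) + (y,) for y in (0, 1))
print(f_best, x_best, y_best)   # → 1.5 2.5 1
```

Here paying the fixed cost (y = 1) is worthwhile because it admits the unconstrained continuous minimizer; with only the tight bound (y = 0) the continuous subproblem is forced to its vertex x = 1.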
The general NLP problem 52 is presented be- direction. Invoking a theorem of the alterna- low: tive (e.g., Farkas – Lemma) leads to the cele- brated Karush – Kuhn – Tucker (KKT) condi- f (x) Min tions [169]. Instead of a formal development of s.t. h (x)=0 (52) these conditions, we present here a more intu- g (x) ≤0 itive, kinematic illustration. Consider the con- tour plot of the objective function f (x) given in and we assume that the functions f (x), h(x), and Figure 36 as a smooth valley in space of the vari- g(x) have continuous first and second deriva- ables x1 and x2. For the contour plot of this un- tives. A key characteristic of problem 52 is constrained problem, Min f (x), consider a ball whether the problem is convex or not, i.e., rolling in this valley to the lowest point of f (x), whether it has a convex objective function and a denoted by x*. This point is at least a local min- convex feasible region. A function φ(x)ofx in imum and is defined by a point with zero gra- some domain X is convex if and only if for all dient and at least nonnegative curvature in all points x1, x2 ∈ X: (nonzero) directions p. We use the first deriva- tive (gradient) vector ∇f (x) and second deriva- ∇ φ [αx1+(1−α) x2] ≤αφ (x1)+(1−α) φ (x2) (53) tive (Hessian) matrix xx f (x) to state the nec- essary first- and second-order conditions for un- holds for all α ∈ (0, 1). If φ(x) is differentiable, constrained optimality: then an equivalent definition is: ∗ T ∗ ∇xf (x )=0 p ∇xxf (x ) p≥0 for all p=0 (55) T φ (x1)+∇φ(x1) (x−x1) ≤φ (x) (54) These necessary conditions for local optimality Strict convexity requires that the inequalities 53 can be strengthened to sufficient conditions by and 54 be strict. Convex feasible regions require making the inequality in the relations 55 strict g(x) to be a convex function and h(x) to be linear. (i.e., positive curvature in all directions). 
Equiv- alently, the sufficient (necessary) curvature con- If 52 is a convex problem, than any local solution ∗ is guaranteed to be a global solution to 52. More- ditions can be stated as: ∇xxf (x ) has all pos- over, if the objective function is strictly convex, itive (nonnegative) eigenvalues and is therefore then this solution x* is unique. On the other defined as a positive (semi-)definite matrix. hand, nonconvex problems may have multiple local solutions, i.e., feasible solutions that min- imize the objective function within some neigh- borhood about the solution. We first consider methods that find only lo- cal solutions to nonconvex problems, as more difficult (and expensive) search procedures are required to find a global solution. Local methods are currently very efficient and have been devel- oped to deal with very large NLPs. Moreover, by considering the structure of convex NLPs (including LPs and QPs), even more powerful methods can be applied. To study these methods, we first consider conditions for local optimality.

Local Optimality Conditions – A Kine- Figure 36. Unconstrained minimum matic Interpretation. Local optimality condi- tions are generally derived from gradient infor- Now consider the imposition of inequality mation from the objective and constraint func- [g(x) ≤ 0] and equality constraints [h(x)=0]in tions. The proof follows by identifying a lo- Figure 37. Continuing the kinematic interpreta- cal minimum point that has no feasible descent tion, the inequality constraints g(x) ≤ 0 act as “fences” in the valley, and equality constraints 98 Mathematics in Chemical Engineering h(x) = 0 as “rails”. Consider now a ball, con- is required. The KKT conditions are derived strained on a rail and within fences, to roll to from gradient information, and the CQ can be its lowest point. This stationary point occurs viewed as a condition on the relative influence of when the normal forces exerted by the fences constraint curvature. In fact, linearly constrained [−∇g (x∗)] and rails [−∇h (x∗)] on the ball are problems with nonempty feasible regions re- balanced by the force of “gravity” [−∇f (x∗)]. quire no constraint qualification. On the other This condition can be stated by the following hand, as seen in Figure 38, the problem Min x1, 3 KKT necessary conditions for constrained opti- s.t. x2 ≥ 0, (x1) ≥ x2 has a minimum point at the mality: origin but does not satisfy the KKT conditions because it does not satisfy a CQ.

Figure 38. Failure of KKT conditions at constrained mini- mum (note linear dependence of constraint gradients) Figure 37. Constrained minimum CQs be defined in several ways. For in- Stationarity Condition: It is convenient to stance, the linear independence constraint qual- define the Lagrangian function L(x, λ, ν)= T λ T ν ification (LICQ) requires that the active con- f (x)+g(x) + h(x) along with “weights” or straints at x* be linearly independent, i.e., the multipliers λ and ν for the constraints. These ∗ ∗ matrix [∇h (x ) |∇gA(x )] is full column rank, multipliers are also known as “dual variables” where gA is the vector of inequality constraints and “shadow prices”. The stationarity condition ∗ with elements that satisfy gA,i (x )=0. With (balance of forces acting on the ball) is then LICQ, the KKT multipliers (λ, ν) are guaranteed given by: to be unique at the optimal solution. The weaker ∇L(x,λ,ν)=∇f (x)+∇h (x) λ+∇g (x) ν=0 (56) Mangasarian – Fromovitz constraint qualifica- tion (MFCQ) requires only that ∇h(x*) have full Feasibility: Both inequality and equality con- column rank and that a direction p exist that sat- T T straints must be satisfied (ball must lie on the isfies ∇h(x*) p = 0 and ∇gA(x*) p > 0. With rail and within the fences): MFCQ, the KKT multipliers (λ, ν) are guaran- h (x)=0,g(x) ≤0 (57) teed to be bounded (but not necessarily unique) at the optimal solution. Additional discussion Complementarity: Inequality constraints are ei- can be found in [169]. ther strictly satisfied (active) or inactive, in Second Order Conditions: As with uncon- which case they are irrelevant to the solution. strained optimization, nonnegative (positive) In the latter case the corresponding KKT multi- curvature is necessary (sufficient) in all of the plier must be zero. This is written as: allowable (i.e., constrained) nonzero directions νT g (x)=0,ν≥0 p. This condition can be stated in several ways. 
(58) A typical necessary second-order condition re- Constraint Qualification: For a local optimum to quires a point x* that satisfies LICQ and first- satisfy the KKT conditions, an additional regu- order conditions 56–58 with multipliers (λ, ν) larity condition or constraint qualification (CQ) to satisfy the additional conditions given by: Mathematics in Chemical Engineering 99

T ∗ p ∇xxL(x ,λ,ν)p≥0 for all and f (x*) = −1.8421. Checking the second- order conditions 59 leads to: p=0, ∇h(x∗)T p=0, ∇xxL(x∗,λ∗,ν∗)=∇xx [f(x∗)+h(x∗)λ∗+g(x∗)ν∗]= ∗ T ∇gA(x ) p=0 (59)  2    The corresponding sufficient conditions require 2+1/(x1) 1−ν 2.50.068   =   that the inequality in 59 be strict. Note that for the 1−ν 3+1/(x )2 0.068 3.125 example in Figure 36, the allowable directions 2 p span the entire space for x while in Figure 37, [∇h(x∗)|∇gA(x∗)] there are no allowable directions p.   Example 2 −2.8284 : To illustrate the KKT conditions,   consider the following unconstrained NLP: p= p=0,p=0 −1 −1.4142 2 2 Min (x1) −4x1+3/2(x2) Note that LICQ is satisfied. Moreover, because −7x2+x1x2+9−lnx1−lnx2 (60) ∗ ∗ [∇h (x ) |∇gA(x )] is square and nonsingular, corresponding to the contour plot in Figure 36. there are no nonzero vectors p that satisfy the al- The optimal solution can be found by solving lowable directions. Hence, the sufficient second- for the first order conditions 54: pT ∇ L(x∗,λ∗,ν∗)>0 & ' order conditions ( xx , for all 2x −4+x −1/x allowable p) are vacuously satisfied for this ∇f (x)= 1 2 1 =0 3x2−7+x1−1/x2 problem. & ' ∗ 1.3475 ⇒x = (61) Convex Cases of NLP. Linear programs and 2.0470 quadratic programs are special cases of prob- and f (x*) = −2.8742. Checking the second- lem 52 that allow for more efficient solution, order conditions leads to: based on application of KKT conditions 56–59. & ' Because these are convex problems, any locally 2+1/ x∗ 2 1 ∇ f(x∗)= 1 ⇒ xx ∗ 2 optimal solution is a global solution. In partic- 1 3+1/ x2 & ' ular, if the objective and constraint functions in 2.5507 1 ∇ f(x∗)= ( ) problem 52 are linear then the following linear xx 1 3.2387 positive definite (62) program (LP):

Now consider the constrained NLP: Min cT x 2 2 s.t. Ax=b (65) Min (x1) −4x1+3/2(x1) −7x2+x1x2 Cx≤d +9−lnx1−lnx2 (63) s.t. 4−x1x2≤0 can be solved in a finite number of steps, and 2x1−x2=0 the optimal solution lies at a vertex of the poly- that corresponds to the plot in Figure 37. The hedron described by the linear constraints. This optimal solution can be found by applying the is shown in Figure 39, and in so-called primal first-order KKT conditions 56–58: degenerate cases, multiple vertices can be alter- ∇L(x,λ,ν)=∇f (x)+∇h (x) λ+∇g (x) ν= nate optimal solutions with the same values of the objective function. The standard method to       2x1−4+x2−1/x1 2 −x2 solve problem 65 is the simplex method, devel-   +   λ+   ν=0 oped in the late 1940s [170], although, starting 3x2−7+x1−1/x2 −1 −x1 from Karmarkar’s discovery in 1984, interior g (x)=4−x1x2≤0,h(x)=2x1−x2=0 point methods have become quite advanced and competitive for large scale problems [171]. The g (x) ν=(4−x x )ν, ν≥0 1 2 simplex method proceeds by moving succes- ⇓ sively from vertex to vertex with improved ob-   jective function values. Methods to solve prob- 1.4142 lem 65 are well implemented and widely used, x∗=   ,λ∗=1.036,ν∗=1.068 2.8284 100 Mathematics in Chemical Engineering especially in planning and logistical applica- If the matrix Q is positive semidefinite (posi- tions. They also form the basis for MILP meth- tive definite), when projected into the null space ods (see below). Currently, state-of-the-art LP of the active constraints, then problem 66 is solvers can handle millions of variables and con- (strictly) convex and the QP is a global (and straints and the application of further decom- unique) minimum. Otherwise, local solutions position methods leads to the solution of prob- exist for problem 66, and more extensive global lems that are two or three orders of magnitude optimization methods are needed to obtain the larger than this [172, 173]. 
Also, the interior global solution. Like LPs, convex QPs can be point method is described below from the per- solved in a finite number of steps. However, as spective of more general NLPs. seen in Figure 40, these optimal solutions can lie on a vertex, on a constraint boundary, or in the interior. A number of active set strategies have been created that solve the KKT conditions of the QP and incorporate efficient updates of active constraints. Popular QP methods include null space algorithms, range space methods, and Schur complement methods. As with LPs, QP problems can also be solved with interior point methods [171].

Figure 40. Contour plots of convex quadratic programs

Solving the General NLP Problem. Solu- tion techniques for problem 52 deal with sat- isfaction of the KKT conditions 56–59. Many NLP solvers are based on successive quadratic programming (SQP) as it allows the construction of a number of NLP algorithms based on the Newton – Raphson method for equation solv- ing. SQP solvers have been shown to require the fewest function evaluations to solve NLPs Figure 39. Contour plots of linear programs [174] and they can be tailored to a broad range of process engineering problems with different Quadratic programs (QP) represent a slight structure. modification of problem 65 and can be stated The SQP strategy applies the equivalent of a as: Newton step to the KKT conditions of the non- linear programming problem, and this leads to a cT x+ 1 xT Qx Min 2 fast rate of convergence. By adding slack vari- s.t. Ax=b (66) ables s the first-order KKT conditions can be x≤d rewritten as: Mathematics in Chemical Engineering 101

∇f(x) + ∇h(x) λ + ∇g(x) ν = 0                  (67a)
h(x) = 0                                        (67b)
g(x) + s = 0                                    (67c)
SVe = 0                                         (67d)
(s, ν) ≥ 0                                      (67e)

where e = [1, 1, . . ., 1]^T, S = diag(s) and V = diag(ν). SQP methods find solutions that satisfy Equations 67a–67e by generating Newton-like search directions at iteration k. However, equation 67d and the active bounds 67e are dependent and serve to make the KKT system ill-conditioned near the solution. SQP algorithms treat these conditions in two ways. In the active set strategy, discrete decisions are made regarding the active constraint set, i ∈ I = {i | g_i(x*) = 0}, and Equation 67d is replaced by s_i = 0, i ∈ I, and ν_i = 0, i ∉ I. Determining the active set is a combinatorial problem, and a straightforward way to determine an estimate of the active set (and also satisfy 67e) is to formulate, at a point x^k, and solve the following QP at iteration k:

Min  ∇f(x^k)^T p + 1/2 p^T ∇xxL(x^k, λ^k, ν^k) p
s.t. h(x^k) + ∇h(x^k)^T p = 0                   (68)
     g(x^k) + ∇g(x^k)^T p + s = 0,  s ≥ 0

The KKT conditions of 68 are given by:

∇f(x^k) + ∇xxL(x^k, λ^k, ν^k) p + ∇h(x^k) λ + ∇g(x^k) ν = 0    (69a)
h(x^k) + ∇h(x^k)^T p = 0                                        (69b)
g(x^k) + ∇g(x^k)^T p + s = 0                                    (69c)
SVe = 0                                                         (69d)
(s, ν) ≥ 0                                                      (69e)

where the Hessian of the Lagrange function, ∇xxL(x, λ, ν) = ∇xx[f(x) + h(x)^T λ + g(x)^T ν], is calculated directly or through a quasi-Newton approximation (created by differences of gradient vectors). It is easy to show that 69a–69c correspond to a Newton–Raphson step for 67a–67c applied at iteration k. Also, selection of the active set is now handled at the QP level by satisfying the conditions 69d, 69e. To evaluate and change candidate active sets, QP algorithms apply inexpensive matrix-updating strategies to the KKT matrix associated with the QP 68. Details of this approach can be found in [175, 176].

As alternatives that avoid the combinatorial problem of selecting the active set, interior point (or barrier) methods modify the NLP problem 52 to form problem 70:

Min  f(x) − µ Σ_i ln s_i
s.t. h(x) = 0                                   (70)
     g(x) + s = 0

where the solution to this problem has s > 0 for the penalty parameter µ > 0, and decreasing µ to zero leads to solution of problem 52. The KKT conditions for this problem can be written as Equation 71:

∇f(x*) + ∇h(x*) λ + ∇g(x*) ν = 0
h(x*) = 0
g(x*) + s = 0                                   (71)
SVe = µe

and at iteration k the Newton steps to solve 71 are given by:

⎡ ∇xxL(x^k,λ^k,ν^k)   0             ∇h(x^k)   ∇g(x^k) ⎤ ⎡ ∆x ⎤      ⎡ ∇xL(x^k,λ^k,ν^k) ⎤
⎢ 0                   S_k^{-1}V_k   0         I       ⎥ ⎢ ∆s ⎥  = − ⎢ ν^k − S_k^{-1}µe  ⎥     (72)
⎢ ∇h(x^k)^T           0             0         0       ⎥ ⎢ ∆λ ⎥      ⎢ h(x^k)            ⎥
⎣ ∇g(x^k)^T           I             0         0       ⎦ ⎣ ∆ν ⎦      ⎣ g(x^k) + s^k      ⎦

A detailed description of this algorithm, called IPOPT, can be found in [177].

Both active set and interior point methods have clear trade-offs. Interior point methods may require more iterations to solve problem 70 for various values of µ, while active set methods require the solution of the more expensive QP subproblem 68. Thus, if there are few inequality constraints or an active set is known (say from a good starting guess, or a known QP solution from a previous iteration) then solving problem 68 is not expensive and the active set method is favored. On the other hand, for problems with many inequality constraints, interior point methods are often faster as they avoid the combinatorial problem of selecting the active set. This is especially the case for large-scale problems and when a large number of bounds are active. Examples that demonstrate the performance of these approaches include the solution of model predictive control (MPC) problems [178 – 180] and the solution of large optimal control problems using barrier NLP solvers. For instance, IPOPT allows the solution of problems with more than 10^6 variables and up to 50 000 degrees of freedom [181, 182].

Other Gradient-Based NLP Solvers. In addition to SQP methods, a number of NLP solvers have been developed and adapted for large-scale problems. Generally, these methods require more function evaluations than SQP methods, but they perform very well when interfaced to optimization modeling platforms, where function evaluations are cheap. All of these can be derived from the perspective of applying Newton steps to portions of the KKT conditions.

LANCELOT [183] is based on the solution of bound constrained subproblems. Here an augmented Lagrangian is formed from problem 52 and subproblem 73 is solved:

Min  f(x) + λ^T h(x) + ν^T (g(x) + s) + 1/2 ρ ‖h(x), g(x) + s‖^2
s.t. s ≥ 0                                      (73)

The above subproblem can be solved very efficiently for fixed values of the multipliers λ and ν and penalty parameter ρ. Here a gradient-projection, trust-region method is applied. Once subproblem 73 is solved, the multipliers and penalty parameter are updated in an outer loop and the cycle repeats until the KKT conditions for problem 52 are satisfied. LANCELOT works best when exact second derivatives are available. This promotes a fast convergence rate in solving each subproblem and allows a bound constrained trust-region method to exploit directions of negative curvature in the Hessian matrix.

Reduced gradient methods are active set strategies that rely on partitioning the variables and solving Equations 67a–67e in a nested manner. Without loss of generality, problem 52 can be rewritten as problem 74:

Min  f(z)
s.t. c(z) = 0                                   (74)
     a ≤ z ≤ b

Variables are partitioned as nonbasic variables (those fixed to their bounds), basic variables (those that can be solved from the equality constraints), and superbasic variables (those remaining variables between bounds that serve to drive the optimization); this leads to z^T = [z_N^T, z_S^T, z_B^T]. This partition is derived from local information and may change over the course of the optimization iterations. The corresponding KKT conditions can be written as Equations 75a–75e:

∇_N f(z) + ∇_N c(z) γ = β_a − β_b               (75a)
∇_S f(z) + ∇_S c(z) γ = 0                       (75b)
∇_B f(z) + ∇_B c(z) γ = 0                       (75c)
c(z) = 0                                        (75d)
z_{N,j} = a_j or b_j,  β_{a,j} ≥ 0, β_{b,j} = 0  or  β_{b,j} ≥ 0, β_{a,j} = 0     (75e)

where γ and β are the KKT multipliers for the equality and bound constraints, respectively, and 75e replaces the complementarity conditions 58. Reduced gradient methods work by nesting equations 75b, 75d within 75a, 75c. At iteration k, for fixed values of z_N^k and z_S^k, we can solve for z_B using 75d and for γ using 75b. Moreover, linearization of these equations leads to constrained derivatives or reduced gradients (Eq. 76):

df/dz_S = ∇f_S − ∇c_S (∇c_B)^{-1} ∇f_B          (76)

which indicate how f(z) (and z_B) change with respect to z_S and z_N. The algorithm then proceeds by updating z_S using reduced gradients in a Newton-type iteration to solve equation 75c. Following this, bound multipliers β are calculated from 75a. Over the course of the iterations, if the variables z_B or z_S exceed their bounds or if some bound multipliers β become negative, then the variable partition needs to be changed and the equations 75a–75e are reconstructed. These reduced gradient methods are embodied in the popular GRG2, CONOPT, and SOLVER codes [173].
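The reduced gradient in Eq. 76 is easy to check numerically. The sketch below (the objective, the constraint, the partition, and all numerical values are illustrative assumptions, not taken from the text) treats z1 as the superbasic variable and z2 as the basic variable eliminated through a single equality constraint:

```python
# Reduced gradient df/dz_S = grad_S f - grad_S c (grad_B c)^{-1} grad_B f (Eq. 76)
# for an illustrative problem: min f(z) = z1^2 + 2*z2^2  s.t.  c(z) = z1 + z2 - 1 = 0.
# Here z_S = z1 (superbasic) and z_B = z2 (basic, solved from c(z) = 0 as in Eq. 75d).

def f(z1, z2):
    return z1**2 + 2.0 * z2**2

def reduced_gradient(z1):
    z2 = 1.0 - z1                # solve c(z) = 0 for the basic variable
    grad_S_f = 2.0 * z1          # df/dz1
    grad_B_f = 4.0 * z2          # df/dz2
    grad_S_c = 1.0               # dc/dz1
    grad_B_c = 1.0               # dc/dz2 (scalar here, so its "inverse" is 1/grad_B_c)
    return grad_S_f - grad_S_c * (1.0 / grad_B_c) * grad_B_f

# Check against a finite difference of f taken along the constraint manifold:
z1, h = 0.3, 1e-6
fd = (f(z1 + h, 1.0 - (z1 + h)) - f(z1 - h, 1.0 - (z1 - h))) / (2.0 * h)
print(reduced_gradient(z1), fd)

# Driving the reduced gradient to zero (Eq. 75c) by updating z1 gives the
# constrained minimizer z1 = 2/3, z2 = 1/3 for this toy problem.
```

A Newton-type iteration on the superbasic variables, as described above, simply repeats this computation with updated z_S until the reduced gradient vanishes.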

The SOLVER code has been incorporated into Microsoft Excel. CONOPT [184] is an efficient and widely used code in several optimization modeling environments.

MINOS [185] is a well-implemented package that offers a variation on reduced gradient strategies. At iteration k, equation 75d is replaced by its linearization (Eq. 77):

c(z_N^k, z_S^k, z_B^k) + ∇_B c(z^k)^T (z_B − z_B^k) + ∇_S c(z^k)^T (z_S − z_S^k) = 0     (77)

and 75a–75c, 75e are solved with Equation 77 as a subproblem using concepts from the reduced gradient method. At the solution of this subproblem, the constraints 75d are relinearized and the cycle repeats until the KKT conditions of 75a–75e are satisfied. The augmented Lagrangian function 73 is used to penalize movement away from the feasible region. For problems with few degrees of freedom, the resulting approach leads to an extremely efficient method even for very large problems. MINOS has been interfaced to a number of modeling systems and enjoys widespread use. It performs especially well on large problems with few nonlinear constraints. However, on highly nonlinear problems it is usually less reliable than other reduced gradient methods.

Algorithmic Details for NLP Methods. All of the above NLP methods incorporate concepts from the Newton–Raphson method for equation solving. Essential features of these methods are a) providing accurate derivative information to solve for the KKT conditions, b) stabilization strategies to promote convergence of the Newton-like method from poor starting points, and c) regularization of the Jacobian matrix in Newton's method (the so-called KKT matrix) if it becomes singular or ill-conditioned.

a) Providing first and second derivatives: The KKT conditions require first derivatives to define stationary points, so accurate first derivatives are essential to determine locally optimal solutions for differentiable NLPs. Moreover, Newton–Raphson methods that are applied to the KKT conditions, as well as the task of checking second-order KKT conditions, necessarily require information on second derivatives. (Note that second-order conditions are not checked by methods that do not use second derivatives.) With the recent development of automatic differentiation tools, many modeling and simulation platforms can provide exact first and second derivatives for optimization. When second derivatives are available for the objective or constraint functions, they can be used directly in LANCELOT, SQP and, less efficiently, in reduced gradient methods. Otherwise, for problems with few superbasic variables, reduced gradient methods and reduced space variants of SQP can be applied. Referring to problem 74 with n variables and m equalities, we can write the QP step from problem 68 as Equation 78:

Min  ∇f(z^k)^T p + 1/2 p^T ∇xxL(z^k, γ^k) p
s.t. c(z^k) + ∇c(z^k)^T p = 0                   (78)
     a ≤ z^k + p ≤ b

Defining the search direction as p = Z_k p_Z + Y_k p_Y, where ∇c(x^k)^T Z_k = 0 and [Y_k | Z_k] is a nonsingular n × n matrix, allows us to form the following reduced QP subproblem (with n − m variables):

Min  [Z_k^T ∇f(z^k) + w_k]^T p_Z + 1/2 p_Z^T B_k p_Z
s.t. a ≤ z^k + Z_k p_Z + Y_k p_Y ≤ b            (79)

where p_Y = −[∇c(z^k)^T Y_k]^{-1} c(z^k). Good choices of Y_k and Z_k, which allow sparsity to be exploited in ∇c(z^k), are Y_k^T = [0 | I] and Z_k^T = [I | −∇_{N,S} c(z^k) [∇_B c(z^k)]^{-1}]. Here we define the reduced Hessian B_k = Z_k^T ∇zzL(z^k, γ^k) Z_k and w_k = Z_k^T ∇zzL(z^k, γ^k) Y_k p_Y. In the absence of second-derivative information, B_k can be approximated using positive definite quasi-Newton approximations [175]. Also, for the interior point method, a similar reduced space decomposition can be applied to the Newton step given in 72.

Finally, for problems with least-squares functions, as in data reconciliation, parameter estimation, and model predictive control, one can often assume that the values of the objective function and its gradient at the solution are vanishingly small. Under these conditions, one can show that the multipliers (λ, ν) also vanish and ∇xxL(x*, λ, ν) can be substituted by ∇xx f(x*). This Gauss–Newton approximation has been shown to be very efficient for the solution of least-squares problems [175].

b) Line-search and trust-region methods are used to promote convergence from poor starting points. These are commonly used with the search directions calculated from NLP subproblems such as problem 68. In a trust-region approach, the constraint ‖p‖ ≤ ∆ is added and the iteration step is taken if there is sufficient reduction of some merit function (e.g., the objective function weighted with some measure of the constraint violations). The size of the trust region ∆ is adjusted based on the agreement of the reduction of the actual merit function compared to its predicted reduction from the subproblem [183]. Such methods have strong global convergence properties and are especially appropriate for ill-conditioned NLPs. This approach has been applied in the KNITRO code [186]. Line-search methods can be more efficient on problems with reasonably good starting points and well-conditioned subproblems, as in real-time optimization. Typically, once a search direction is calculated from problem 68, or other related subproblem, a step size α ∈ (0, 1] is chosen so that x^k + αp leads to a sufficient decrease of a merit function. As a recent alternative, a novel filter-stabilization strategy (for both line-search and trust-region approaches) has been developed based on a bicriterion minimization, with the objective function and constraint infeasibility as competing objectives [187]. This method often leads to better performance than those based on merit functions.

c) Regularization of the KKT matrix for the NLP subproblem (e.g., in Equation 72) is essential for good performance of general purpose algorithms. For instance, to obtain a unique solution to Eqn. 72, active constraint gradients must be full rank, and the Hessian matrix, when projected into the null space of the active constraint gradients, must be positive definite. These properties may not hold far from the solution, and corrections to the Hessian in SQP may be necessary [176]. Regularization methods ensure that subproblems like Eqns. 68 and 72 remain well-conditioned; they include addition of positive constants to the diagonal of the Hessian matrix to ensure its positive definiteness, judicious selection of active constraint gradients to ensure that they are linearly independent, and scaling the subproblem to reduce the propagation of numerical errors. Often these strategies are heuristics built into particular NLP codes. While quite effective, most of these heuristics do not provide convergence guarantees for general NLPs.

Table 11 summarizes the characteristics of a collection of widely used NLP codes. Much more information on widely available codes can also be found on the NEOS server (www-neos.mcs.anl.gov) and the NEOS Software Guide.

Table 11. Representative NLP solvers

Method           Algorithm type                            Stabilization   Second-order information
CONOPT [184]     reduced gradient                          line search     exact and quasi-Newton
GRG2 [173]       reduced gradient                          line search     quasi-Newton
IPOPT [177]      SQP, barrier                              line search     exact
KNITRO [186]     SQP, barrier                              trust region    exact and quasi-Newton
LANCELOT [183]   augmented Lagrangian, bound constrained   trust region    exact and quasi-Newton
LOQO [188]       SQP, barrier                              line search     exact
MINOS [185]      reduced gradient, augmented Lagrangian    line search     quasi-Newton
NPSOL [189]      SQP, active set                           line search     quasi-Newton
SNOPT [190]      reduced space SQP, active set             line search     quasi-Newton
SOCS [191]       SQP, active set                           line search     exact
SOLVER [173]     reduced gradient                          line search     quasi-Newton
SRQP [192]       reduced space SQP, active set             line search     quasi-Newton

10.3. Optimization Methods without Derivatives

A broad class of optimization strategies does not require derivative information. These methods have the advantage of easy implementation and little prior knowledge of the optimization problem. In particular, such methods are well suited for "quick and dirty" optimization studies that explore the scope of optimization for new problems, prior to investing effort for more sophisticated modeling and solution strategies. Most of these methods are derived from heuristics that naturally spawn numerous variations. As a result, a very broad literature describes these methods. Here we discuss only a few important trends in this area.
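Before surveying these families, a minimal example conveys their flavor: a coordinate (compass) search of the pattern-search type that uses only objective values. The test function and the step-halving rule are illustrative assumptions:

```python
# Compass (pattern) search: poll the 2n coordinate directions; accept improving
# points, otherwise halve the step. Only objective function values are used.

def compass_search(func, x0, step=1.0, tol=1e-6, max_iter=10000):
    x, fx = list(x0), func(x0)
    n = len(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(n):
            for d in (+step, -step):
                trial = x[:]
                trial[i] += d
                ft = func(trial)
                if ft < fx:          # accept any improving poll point
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= 0.5              # no improvement anywhere: refine the pattern
    return x, fx

# Illustrative smooth test function with its minimum at (3, -1):
quad = lambda x: (x[0] - 3.0)**2 + 2.0 * (x[1] + 1.0)**2
xbest, fbest = compass_search(quad, [0.0, 0.0])
print(xbest, fbest)
```

The step-halving on failed polls is the termination heuristic; gradient-based stopping tests play no role, which is typical of the methods discussed below.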

Classical Direct Search Methods. Developed in the 1960s and 1970s, these methods include one-at-a-time search and methods based on experimental designs [193]. At that time, these direct search methods were the most popular optimization methods in chemical engineering. Methods that fall into this class include the pattern search [194], the conjugate direction method [195], simplex and complex searches [196], and the adaptive random search methods [197 – 199]. All of these methods require only objective function values for unconstrained minimization. Associated with these methods are numerous studies on a wide range of process problems. Moreover, many of these methods include heuristics that prevent premature termination (e.g., directional flexibility in the complex search as well as random restarts and direction generation). To illustrate these methods, Figure 41 shows the performance of a pattern search method as well as a random search method on an unconstrained problem.

Figure 41. Examples of optimization methods without derivatives
A) Pattern search method; B) Random search method
Circles: 1st phase; Triangles: 2nd phase; Stars: 3rd phase

Simulated Annealing. This strategy is related to random search methods and derives from a class of heuristics with analogies to the motion of molecules in the cooling and solidification of metals [200]. Here a "temperature" parameter θ can be raised or lowered to influence the probability of accepting points that do not improve the objective function. The method starts with a base point x and objective value f(x). The next point x' is chosen at random from a distribution. If f(x') < f(x), the move is accepted with x' as the new point. Otherwise, x' is accepted with probability p(θ, x, x'). Options include the Metropolis distribution, p(θ, x, x') = exp[−{f(x') − f(x)}/θ], and the Glauber distribution, p(θ, x, x') = exp[−{f(x') − f(x)}/θ] / (1 + exp[−{f(x') − f(x)}/θ]). The parameter θ is then reduced and the method continues until no further progress is made.

Genetic Algorithms. This approach, first proposed in [201], is based on the analogy of improving a population of solutions through modifying their gene pool. It also has similar performance characteristics as random search methods and simulated annealing. Two forms of genetic modification, crossover or mutation, are used and the elements of the optimization vector x are represented as binary strings. Crossover deals with random swapping of vector elements (among parents with highest objective function values or other rankings of population) or any linear combinations of two parents. Mutation deals with the addition of a random variable to elements of the vector. Genetic algorithms (GAs) have seen widespread use in process engineering and a number of codes are available. A related GA algorithm is described in [173].

Derivative-Free Optimization (DFO). In the past decade, the availability of parallel computers and faster computing hardware and the need to incorporate complex simulation models within optimization studies have led a number of optimization researchers to reconsider classical direct search approaches. In particular, Dennis and Torczon [202] developed a multidimensional search algorithm that extends the simplex approach [196]. They note that the Nelder–Mead algorithm fails as the number of variables increases, even for very simple problems. To overcome this, their multidimensional pattern-search approach combines reflection, expansion, and contraction steps that act as line search algorithms for a number of linearly independent search directions. This approach is easily adapted to parallel computation and the method can be tailored to the number of processors available. Moreover, this approach converges to locally optimal solutions for unconstrained problems and exhibits an unexpected performance synergy when multiple processors are used. The work of Dennis and Torczon [202] has spawned considerable research on analysis and code development for DFO methods. Moreover, Conn et al. [203] construct a multivariable DFO algorithm that uses a surrogate model for the objective function within a trust-region method. Here points are sampled to obtain a well-defined quadratic interpolation model, and descent conditions from trust-region methods enforce convergence properties. A number of trust-region methods that rely on this approach are reviewed in [203]. Moreover, a number of DFO codes have been developed that lead to black-box optimization implementations for large, complex simulation models. These include the DAKOTA package at Sandia National Lab [204], http://endo.sandia.gov/DAKOTA/software.html and FOCUS developed at Boeing Corporation [205].

Direct search methods are easy to apply to a wide variety of problem types and optimization models. Moreover, because their termination criteria are not based on gradient information and stationary points, they are more likely to favor the search for global rather than locally optimal solutions. These methods can also be adapted easily to include integer variables. However, rigorous convergence properties to globally optimal solutions have not yet been discovered. Also, these methods are best suited for unconstrained problems or for problems with simple bounds. Otherwise, they may have difficulties with constraints, as the only options open for handling constraints are equality constraint elimination and addition of penalty functions for inequality constraints. Both approaches can be unreliable and may lead to failure of the optimization algorithm. Finally, the performance of direct search methods scales poorly (and often exponentially) with the number of decision variables. While performance can be improved with the use of parallel computing, these methods are rarely applied to problems with more than a few dozen decision variables.

10.4. Global Optimization

Deterministic optimization methods are available for nonconvex nonlinear programming problems of the form problem 52 that guarantee convergence to the global optimum. More specifically, one can show under mild conditions that they converge to an ε distance to the global optimum in a finite number of steps.
These et al. [203] construct a multivariable DFO al- methods are generally more expensive than local gorithm that uses a surrogate model for the NLP methods, and they require the exploitation objective function within a trust-region method. of the structure of the nonlinear program. Here points are sampled to obtain a well-defined Global optimization of nonconvex programs quadratic interpolation model, and descent con- has received increased attention due to their ditions from trust-region methods enforce con- practical importance. Most of the deterministic vergence properties. A number of trust-region global optimization algorithms are based on spa- methods that rely on this approach are reviewed tial branch-and-bound algorithm [206], which in [203]. Moreover, a number of DFO codes divides the feasible region of continuous vari- have been developed that lead to black-box op- ables and compares lower bound and upper timization implementations for large, complex bound for fathoming each subregion. The one simulation models. These include the DAKOTA that contains the optimal solution is found by package at Sandia National Lab [204], eliminating subregions that are proved not to http://endo.sandia.gov/DAKOTA/software.html contain the optimal solution. Mathematics in Chemical Engineering 107

For nonconvex NLP problems, Quesada and Grossmann [207] proposed a spatial branch-and-bound algorithm for concave separable, linear fractional, and bilinear programs using linear and nonlinear underestimating functions [208]. For nonconvex MINLP, Ryoo and Sahinidis [209] and later Tawarmalani and Sahinidis [210] developed BARON, which branches on the continuous and discrete variables with a bounds reduction method. Adjiman et al. [211, 212] proposed the SMIN-αBB and GMIN-αBB algorithms for twice-differentiable nonconvex MINLPs. Using a valid convex underestimation of general functions as well as for special functions, Adjiman et al. [213] developed the αBB method, which branches on both the continuous and discrete variables according to specific options. The branch-and-contract method [214] has bilinear, linear fractional, and concave separable functions in the continuous variables and binary variables, uses bound contraction, and applies the outer-approximation (OA) algorithm at each node of the tree. Smith and Pantelides [215] proposed a reformulation method combined with a spatial branch-and-bound algorithm for nonconvex MINLP and NLP.

Because in global optimization one cannot exploit optimality conditions like the KKT conditions for a local optimum, these methods work by first partitioning the problem domain (i.e., containing the feasible region) into subregions (see Fig. 42). Upper bounds on the objective function are computed over all subregions of the problem. In addition, lower bounds can be derived from convex underestimators of the objective function and constraints for each subregion. The algorithm then proceeds to eliminate all subregions that have lower bounds that are greater than the least upper bound. After this, the remaining regions are further partitioned to create new subregions and the cycle continues until the upper and lower bounds converge. Below we illustrate the specific steps of the algorithm for nonlinear programs that involve bilinear, linear fractional, and concave separable terms [207, 214].

Figure 42. Convex underestimator for nonconvex function

Nonconvex NLP with Bilinear, Linear Fractional, and Concave Separable Terms. Consider the following specific nonconvex NLP problem:

Min_x  f(x) = Σ_{(i,j)∈BL0} a_ij x_i x_j + Σ_{(i,j)∈LF0} b_ij x_i/x_j + Σ_{i∈C0} g_i(x_i) + h(x)

subject to

f_k(x) = Σ_{(i,j)∈BLk} a_ijk x_i x_j + Σ_{(i,j)∈LFk} b_ijk x_i/x_j + Σ_{i∈Ck} g_{i,k}(x_i) + h_k(x) ≤ 0,   k ∈ K     (80)

x ∈ S ∩ Ω0 ⊂ R^n

where a_ij, a_ijk, b_ij, b_ijk are scalars with i ∈ I = {1, 2, ..., n}, j ∈ J = {1, 2, ..., n}, and k ∈ K = {1, 2, ..., m}. BL0, BLk, LF0, LFk are (i, j)-index sets, with i ≠ j, that define the bilinear and linear fractional terms present in the problem. The functions h(x), h_k(x) are convex and twice continuously differentiable. C0 and Ck are index sets for the univariate twice continuously differentiable concave functions g_i(x_i), g_{i,k}(x_i). The set S ⊂ R^n is convex, and Ω0 ⊂ R^n is an n-dimensional hyperrectangle defined in terms of the initial variable bounds x^{L,in} and x^{U,in}:

Ω0 = {x ∈ R^n : 0 ≤ x^{L,in} ≤ x ≤ x^{U,in},  x_j > 0 if (i,j) ∈ LF0 ∪ LFk,  i ∈ I, j ∈ J, k ∈ K}

The feasible region of problem 80 is denoted by D. Note that a nonlinear equality constraint of the form f_k(x) = 0 can be accommodated in problem 80 through the representation by the inequalities f_k(x) ≤ 0 and −f_k(x) ≤ 0, provided h_k(x) is separable.

To obtain a lower bound LB(Ω) for the global minimum of problem 80 over D ∩ Ω, where Ω = {x ∈ R^n : x^L ≤ x ≤ x^U} ⊆ Ω0, the following problem is proposed:

Min_{(x,y,z)}  f̂(x,y,z) = Σ_{(i,j)∈BL0} a_ij y_ij + Σ_{(i,j)∈LF0} b_ij z_ij + Σ_{i∈C0} ĝ_i(x_i) + h(x)

subject to

f̂_k(x,y,z) = Σ_{(i,j)∈BLk} a_ijk y_ij + Σ_{(i,j)∈LFk} b_ijk z_ij + Σ_{i∈Ck} ĝ_{i,k}(x_i) + h_k(x) ≤ 0,   k ∈ K     (81)

(x,y,z) ∈ T(Ω) ⊂ R^n × R^{n1} × R^{n2}
x ∈ S ∩ Ω ⊂ R^n,  y ∈ R_+^{n1},  z ∈ R_+^{n2}

where the functions and sets are defined as follows:

a) ĝ_i(x_i) and ĝ_{i,k}(x_i) are the convex envelopes for the univariate functions over the domain x_i ∈ [x_i^L, x_i^U] [216]:

ĝ_i(x_i) = g_i(x_i^L) + {[g_i(x_i^U) − g_i(x_i^L)]/(x_i^U − x_i^L)} (x_i − x_i^L) ≤ g_i(x_i)     (82)

ĝ_{i,k}(x_i) = g_{i,k}(x_i^L) + {[g_{i,k}(x_i^U) − g_{i,k}(x_i^L)]/(x_i^U − x_i^L)} (x_i − x_i^L) ≤ g_{i,k}(x_i)     (83)

where ĝ_i(x_i) = g_i(x_i) at x_i = x_i^L and x_i = x_i^U; likewise, ĝ_{i,k}(x_i) = g_{i,k}(x_i) at x_i = x_i^L and x_i = x_i^U.

b) y = {y_ij} is a vector of additional variables for relaxing the bilinear terms in 80, and is used in the following inequalities, which determine the convex and concave envelopes of bilinear terms:

y_ij ≥ x_j^L x_i + x_i^L x_j − x_i^L x_j^L     (i,j) ∈ BL+
y_ij ≥ x_j^U x_i + x_i^U x_j − x_i^U x_j^U     (i,j) ∈ BL+     (84)

y_ij ≤ x_j^L x_i + x_i^U x_j − x_i^U x_j^L     (i,j) ∈ BL−
y_ij ≤ x_j^U x_i + x_i^L x_j − x_i^L x_j^U     (i,j) ∈ BL−     (85)

where

BL+ = {(i,j) : (i,j) ∈ BL0 ∪ BLk,  a_ij > 0 or a_ijk > 0,  k ∈ K}
BL− = {(i,j) : (i,j) ∈ BL0 ∪ BLk,  a_ij < 0 or a_ijk < 0,  k ∈ K}

The inequalities 84 were first derived by McCormick [208], and along with the inequalities 85 theoretically characterized by Al-Khayyal and Falk [217, 218].

c) z = {z_ij} is a vector of additional variables for relaxing the linear fractional terms in problem 80; these variables are used in the following inequalities:

z_ij ≥ x_i/x_j^L + x_i^U (1/x_j − 1/x_j^L)     (i,j) ∈ LF+
z_ij ≥ x_i/x_j^U + x_i^L (1/x_j − 1/x_j^U)     (i,j) ∈ LF+     (86)

z_ij ≥ (1/x_j) [x_i + (x_i^L x_i^U)^{1/2}]^2 / [(x_i^L)^{1/2} + (x_i^U)^{1/2}]^2     (i,j) ∈ LF+     (87)

z_ij ≤ [x_j^U x_i − x_i^L x_j + x_i^L x_j^L]/(x_j^L x_j^U)     (i,j) ∈ LF−
z_ij ≤ [x_j^L x_i − x_i^U x_j + x_i^U x_j^U]/(x_j^L x_j^U)     (i,j) ∈ LF−     (88)

where

LF+ = {(i,j) : (i,j) ∈ LF0 ∪ LFk,  b_ij > 0 or b_ijk > 0,  k ∈ K}
LF− = {(i,j) : (i,j) ∈ LF0 ∪ LFk,  b_ij < 0 or b_ijk < 0,  k ∈ K}

The inequalities 86 and 87, 88 are convex underestimators due to Quesada and Grossmann [207, 219] and Zamora and Grossmann [214], respectively.

d) T(Ω) = {(x,y,z) ∈ R^n × R^{n1} × R^{n2} : 82–87 are satisfied with x^L, x^U as in Ω}. The feasible region and the solution of the underestimating problem 81 are denoted by M(Ω) and (x̂, ŷ, ẑ)_Ω, respectively. We define the approximation gap ε(Ω) at a branch-and-bound node as:

ε(Ω) = ∞                      if OUB = ∞
       −LB(Ω)                 if OUB = 0          (89)
       |(OUB − LB(Ω))/OUB|    otherwise

where the overall upper bound (OUB) is the value of f(x) at the best available feasible point x ∈ D; if no feasible point is available, then OUB = ∞.
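The bilinear envelopes 84 and 85 can be verified numerically. The sketch below assembles the two underestimators and two overestimators of a bilinear term from the variable bounds and checks them on a sampling grid; the particular bounds are illustrative assumptions:

```python
# McCormick envelopes of the bilinear term w = x*y over x in [xL, xU], y in [yL, yU]:
# underestimators (Eq. 84):  w >= yL*x + xL*y - xL*yL,   w >= yU*x + xU*y - xU*yU
# overestimators  (Eq. 85):  w <= yL*x + xU*y - xU*yL,   w <= yU*x + xL*y - xL*yU

def mccormick_bounds(x, y, xL, xU, yL, yU):
    lower = max(yL * x + xL * y - xL * yL,
                yU * x + xU * y - xU * yU)
    upper = min(yL * x + xU * y - xU * yL,
                yU * x + xL * y - xL * yU)
    return lower, upper

# Check on a grid that lower <= x*y <= upper over the whole box:
xL, xU, yL, yU = 0.5, 2.0, -1.0, 3.0
worst_gap = 0.0
for i in range(21):
    for j in range(21):
        x = xL + (xU - xL) * i / 20.0
        y = yL + (yU - yL) * j / 20.0
        lo, up = mccormick_bounds(x, y, xL, xU, yL, yU)
        assert lo <= x * y + 1e-9 and x * y <= up + 1e-9
        worst_gap = max(worst_gap, up - lo)
print(worst_gap)   # largest envelope gap, attained at the center of the box
```

The envelopes are tight on the edges of the box and loosest at its center, which is why partitioning the box (as in the branch-and-bound algorithm below in the text) steadily tightens the lower bounds.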

Note that the underestimating problem 81 is a linear program if LF+ = ∅. During the execution of the spatial branch-and-bound algorithm, problem 81 is solved initially over M(Ω0) (root node of the branch-and-bound tree). If a better approximation is required, M(Ω0) is refined by partitioning Ω0 into two smaller hyperrectangles Ω01 and Ω02, and two children nodes are created with relaxed feasible regions given by M(Ω01) and M(Ω02). Problem 81 might be regarded as a basic underestimating program for the general problem 80. In some cases, however, it is possible to develop additional convex estimators that might strengthen the underestimating problem. See, for instance, the projections proposed by Quesada and Grossmann [207], the reformulation–linearization technique by Sherali and Alameddine [220], and the reformulation–convexification approach by Sherali and Tuncbilek [221].

The Set of Branching Variables. A set of branching variables, characterized by the index set BV(Ω) defined below, is determined by considering the optimal solution (x̂, ŷ, ẑ)_Ω of the underestimating problem:

BV(Ω) = {i, j : |ŷ_ij − x̂_i x̂_j| = ξ_l or |ẑ_ij − x̂_i/x̂_j| = ξ_l or g_i(x̂_i) − ĝ_i(x̂_i) = ξ_l or g_{i,k}(x̂_i) − ĝ_{i,k}(x̂_i) = ξ_l, for i ∈ I, j ∈ J, k ∈ K, l ∈ L}     (90)

where, for a prespecified number l_n, L = {1, 2, . . ., l_n} and ξ_1 is the magnitude of the largest approximation error for a nonconvex term in problem 80 evaluated at (x̂, ŷ, ẑ)_Ω:

ξ_1 = Max_{i∈I, j∈J, k∈K} [ |ŷ_ij − x̂_i x̂_j|, |ẑ_ij − x̂_i/x̂_j|, g_i(x̂_i) − ĝ_i(x̂_i), g_{i,k}(x̂_i) − ĝ_{i,k}(x̂_i) ]

Similarly, we define ξ_l < ξ_{l−1} with l ∈ L\{1} as the l-th largest magnitude for an approximation error; for instance, ξ_2 < ξ_1 is the second largest magnitude for an approximation error. Note that in some cases it might be convenient to introduce weights in the determination of BV(Ω) in order to scale differences in the approximation errors or to induce preferential branching schemes. This might be particularly useful in applications where specific information can be exploited by imposing an order of precedence on the set of complicating variables.

The basic concept in spatial branch-and-bound for global optimization is as follows. The bounding step deals with the calculation of upper and lower bounds. For the former, any feasible point or, preferably, a locally optimal point in the subregion can be used. For the lower bound, convex relaxations of the objective and constraint functions are derived, such as in problem 81. The refining step deals with the construction of partitions in the domain and further partitioning them during the search process. Finally, the selection step decides on the order of exploring the open subregions. Thus, the feasible region and the objective function are replaced by convex envelopes to form relaxed problems. Solving these convex relaxed problems leads to global solutions that are lower bounds to the NLP in the particular subregion. Finally, we see that gradient-based NLP solvers play an important role in global optimization algorithms, as they often yield the lower and upper bounds for the subregions. The spatial branch-and-bound global optimization algorithm can therefore be given by the following steps:

0) Initialize: Calculate an upper bound by obtaining a local solution to problem 80. Calculate a lower bound by solving problem 81 over the entire (relaxed) feasible region Ω0.

For iteration k with a set of partitions Ω_{k,j} and bounds in each subregion OLB_j and OUB_j:

1) Bound: Define the best upper bound, OUB = Min_j OUB_j, and delete (fathom) all subregions j with lower bounds OLB_j ≥ OUB. If OLB_j ≥ OUB − ε for all remaining subregions, stop.
2) Refine: Divide the remaining active subregions into partitions Ω_{k,j1} and Ω_{k,j2}. (Several branching rules are available for this step.)
3) Select: Solve the convex NLP 81 in the new partitions to obtain OLB_{j1} and OLB_{j2}. Delete a partition if it contains no feasible solution.
4) Update: Obtain upper bounds OUB_{j1} and OUB_{j2} for the new partitions if present. Set k = k + 1, update the partition sets and go to step 1.

Example: To illustrate the spatial branch-and-bound algorithm, consider the global solution of:

Min  f(x) = 5/2 x^4 − 20 x^3 + 55 x^2 − 57 x
s.t. 0.5 ≤ x ≤ 2.5                              (91)
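The steps above can be sketched for this one-dimensional example. In the sketch below, the concave −20 x³ term is replaced by its secant on each subinterval (its convex envelope over that subinterval), the convex lower-bounding function is minimized by a golden-section search, and bisection branching is used with the loose tolerance ε = 0.2; the golden-section helper is an implementation assumption:

```python
# Spatial branch-and-bound for Min f(x) = 5/2 x^4 - 20 x^3 + 55 x^2 - 57 x on [0.5, 2.5].
# Lower bound on [l, u]: replace the concave -20 x^3 term by its secant, then
# minimize the resulting convex function. Upper bound: best sampled feasible point.

def f(x):
    return 2.5 * x**4 - 20.0 * x**3 + 55.0 * x**2 - 57.0 * x

def golden_min(func, a, b, tol=1e-6):
    """Golden-section search; exact for unimodal functions, used as a local solver."""
    g = 0.6180339887498949
    x1, x2 = b - g * (b - a), a + g * (b - a)
    f1, f2 = func(x1), func(x2)
    while b - a > tol:
        if f1 < f2:
            b, x2, f2 = x2, x1, f1
            x1 = b - g * (b - a)
            f1 = func(x1)
        else:
            a, x1, f1 = x1, x2, f2
            x2 = a + g * (b - a)
            f2 = func(x2)
    return 0.5 * (a + b)

def lower_bound(l, u):
    def f_lower(x):          # convex underestimator: secant w(x) replaces x^3
        w = (l**3 * (u - x) + u**3 * (x - l)) / (u - l)
        return 2.5 * x**4 - 20.0 * w + 55.0 * x**2 - 57.0 * x
    return f_lower(golden_min(f_lower, l, u))

def upper_bound(l, u):
    xs = [l, u, golden_min(f, l, u)]     # any feasible point gives a valid upper bound
    return min((f(x), x) for x in xs)

def spatial_bb(l, u, eps=0.2):
    best_f, best_x = upper_bound(l, u)
    active = [(lower_bound(l, u), l, u)]
    while active:
        lb, l, u = min(active)           # explore the node with the least lower bound
        active.remove((lb, l, u))
        if lb >= best_f - eps:           # fathom: cannot improve by more than eps
            continue
        m = 0.5 * (l + u)                # refine by bisection
        for a, b in ((l, m), (m, u)):
            fu, xu = upper_bound(a, b)
            if fu < best_f:
                best_f, best_x = fu, xu
            active.append((lower_bound(a, b), a, b))
    return best_x, best_f

x_star, f_star = spatial_bb(0.5, 2.5)
print(x_star, f_star)   # near x* = 0.8749, f* = -19.7
```

As subintervals shrink, the secant approaches the cubic term, the lower bounds tighten, and all nodes are eventually fathomed against the incumbent, mirroring steps 1–4 above.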

cussion of all of these options can be found in [222 – 224]. Also, a number of efficient global optimization codes have recently been developed, including αBB, BARON, LGO, and OQNLP. An interesting numerical comparison of these and other codes can be found in [225].

As seen in Figure 43, this problem has local solutions at x* = 2.5 and at x* = 0.8749. The latter is also the global solution, with f(x*) = −19.7. To find the global solution we note that all but the −20x^3 term in problem 91 are convex, so we replace this term by a new variable w and a linear underestimator within a particular subregion, i.e.:

Min f_L(x) = 5/2 x^4 − 20 w + 55 x^2 − 57 x
s.t. x_l ≤ x ≤ x_u                                        (92)
w = x_l^3 (x_u − x)/(x_u − x_l) + x_u^3 (x − x_l)/(x_u − x_l)

Figure 43. Global optimization example with partitions

In Figure 43 we also show subregions that are created by simple bisection partitioning rules, and we use a "loose" bounding tolerance of ε = 0.2. In each partition the lower bound f_L is determined by problem 92, and the upper bound f_U is determined by the local solution of the original problem in the subregion. Figure 44 shows the progress of the spatial branch-and-bound algorithm as the partitions are refined and the bounds are updated. In Figure 43, note the definitions of the partitions for the nodes and the sequence numbers in each node that show the order in which the partitions are processed. The grayed partitions correspond to the deleted subregions; at termination of the algorithm we see that f_L^j ≥ f_U − ε (i.e., −19.85 ≥ −19.7 − 0.2), with the gray subregions in Figure 43 still active. Further partitioning in these subregions would allow the lower and upper bounds to converge to a tighter tolerance. A number of improvements can be made to the bounding, refinement, and selection strategies in the algorithm that accelerate the convergence of this method.

10.5. Mixed Integer Programming

Mixed integer programming deals with both discrete and continuous decision variables. For simplicity of presentation we consider the most common case, in which the discrete decisions are binary variables, i.e., yi = 0 or 1, and we consider the mixed integer problem 51. Unlike local optimization methods, there are no optimality conditions, like the KKT conditions, that can be applied directly.

Mixed Integer Linear Programming. If the objective and constraint functions are all linear in problem 51, and we assume 0–1 binary variables for the discrete variables, then this gives rise to a mixed integer linear programming (MILP) problem given by Equation 93:

Min Z = a^T x + c^T y
s.t. Ax + By ≤ b                                          (93)
x ≥ 0, y ∈ {0,1}^t

As is well known, the MILP problem is NP-hard. Nevertheless, an interesting theoretical result is that it is possible to transform it into an LP with the convexification procedures proposed by Lovász and Schrijver [226], Sherali and Adams [227], and Balas et al. [228]. These procedures consist of sequentially lifting the original relaxed x–y space into a higher dimension and projecting it back to the original space so as to yield, after a finite number of steps, the integer convex hull. Since the transformations have exponential complexity, they are only of theoretical interest, although they can be used as a basis for deriving cutting planes (e.g., the lift-and-project method of [228]).

As for the solution of the MILP problem, it should be noted that it becomes an LP problem when the binary variables are relaxed as continuous variables, 0 ≤ y ≤ 1. The most common solution algorithms for problem

Figure 44. Spatial branch-and-bound sequence for global optimization example
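The spatial branch-and-bound iteration traced in Figure 44 can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: it assumes the domain 0 ≤ x ≤ 2.5 for problem 91, uses the secant underestimator of problem 92 and the bisection rule with ε = 0.2 from the text, and substitutes golden-section search for the NLP/local solvers.

```python
import heapq

def f(x):                       # original nonconvex objective (problem 91)
    return 2.5*x**4 - 20*x**3 + 55*x**2 - 57*x

def f_lower(x, xl, xu):         # convex underestimator (problem 92):
    # the convex terms are kept; -20*x**3 is replaced by -20*w, where w is
    # the secant (chord) of x**3 on [xl, xu], which overestimates x**3 here
    w = xl**3*(xu - x)/(xu - xl) + xu**3*(x - xl)/(xu - xl)
    return 2.5*x**4 - 20*w + 55*x**2 - 57*x

def golden_min(g, a, b, tol=1e-7):
    # golden-section search; adequate since the relaxation is convex
    r = 0.5*(5**0.5 - 1)
    c, d = b - r*(b - a), a + r*(b - a)
    while b - a > tol:
        if g(c) < g(d):
            b, d = d, c; c = b - r*(b - a)
        else:
            a, c = c, d; d = a + r*(b - a)
    x = 0.5*(a + b)
    return x, g(x)

def spatial_bb(xl=0.0, xu=2.5, eps=0.2):
    x0, lb0 = golden_min(lambda x: f_lower(x, xl, xu), xl, xu)
    ub, x_best = f(x0), x0       # any feasible point gives an upper bound
    heap = [(lb0, xl, xu)]       # nodes ordered by their lower bound
    while heap:
        lb, a, b = heapq.heappop(heap)
        if lb >= ub - eps:       # all remaining nodes can be fathomed
            break
        for lo, hi in ((a, 0.5*(a + b)), (0.5*(a + b), b)):  # bisection
            xr, lbr = golden_min(lambda x: f_lower(x, lo, hi), lo, hi)
            if f(xr) < ub:
                ub, x_best = f(xr), xr
            if lbr < ub - eps:
                heapq.heappush(heap, (lbr, lo, hi))
    return x_best, ub

x_star, f_star = spatial_bb()
print(x_star, f_star)            # converges near x* = 0.8749, f* = -19.7
```

With the loose tolerance ε = 0.2 the search terminates, as in the text, once the smallest remaining lower bound is within ε of the incumbent upper bound.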

(MILP) are LP-based branch-and-bound methods, which are enumeration methods that solve LP subproblems at each node of the search tree. This technique was initially conceived by Land and Doig [229] and Balas [230], and later formalized by Dakin [231]. Cutting-plane techniques, which were initially proposed by Gomory [232] and consist of successively generating valid inequalities that are added to the relaxed LP, have received renewed interest through the works of Crowder et al. [233], Van Roy and Wolsey [234], and especially the lift-and-project method of Balas et al. [228]. A recent review of branch-and-cut methods can be found in [235]. Finally, Benders decomposition [236] is another technique for solving MILPs, in which the problem is successively decomposed into LP subproblems for fixed 0–1 variables and a master problem for updating the binary variables.

LP-Based Branch-and-Bound Method. We briefly outline in this section the basic ideas behind the branch-and-bound method for solving MILP problems. Note that if we relax the t binary variables by the inequalities 0 ≤ y ≤ 1, then problem 93 becomes a linear program with a (global) solution that is a lower bound to the MILP 93. There are specific MILP classes in which the LP relaxation of 93 has the same solution as the MILP. Among these problems is the well-known assignment problem. Other MILPs that can be solved with efficient special-purpose methods are the knapsack problem, the set-covering and set-partitioning problems, and the traveling salesman problem. See [237] for a detailed treatment of these problems.

The branch-and-bound algorithm for solving MILP problems [231] is similar to the spatial branch-and-bound method of the previous section that explores the search space. As seen in Figure 45, binary variables are successively fixed to define the search tree, and a number of bounding properties are exploited in order to fathom nodes and avoid exhaustive enumeration of all the nodes in the tree.

The basic idea in the search is as follows. The top, or root node, in the tree is the solution to the linear programming relaxation of 93. If all the y variables take on 0–1 values, the MILP problem is solved and no further search is required. If at least one of the binary variables yields a fractional value, the solution of the LP relaxation yields a lower bound to problem 93. The search then consists of branching on that node by fixing a particular binary variable to 0 or 1, and the corresponding restricted LP relaxations are solved; these in turn yield a lower bound for any of their descendant nodes. In particular, the following properties are exploited in the branch-and-bound search:

• Any node (initial, intermediate, or leaf node) that leads to an integer-feasible LP solution corresponds to a valid upper bound to the solution of the MILP problem 93.
• Any intermediate node with an infeasible LP solution has infeasible leaf nodes and can be fathomed (i.e., all remaining children of this node can be eliminated).
• If the LP solution at an intermediate node is not less than an existing integer solution, then the node can be fathomed.

These properties lead to pruning of the nodes in the search tree. Branching then continues in the tree until the upper and lower bounds converge.

The basic concepts outlined above lead to a branch-and-bound algorithm with the following features. LP solutions at intermediate nodes are relatively easy to calculate, since they can be effectively updated with the dual simplex method. The selection of the binary variable for branching, known as the branching rule, is based on a number of different possible criteria: for instance, choosing the fractional variable closest to 0.5, or the one with the largest of the smallest pseudocosts for each fractional variable. Branching strategies to navigate the tree also take a number of forms. The common depth-first strategies expand the most recent node to a leaf node or an infeasible node and then backtrack to other branches in the tree. These strategies are simple to program and require little storage of past nodes. On the other hand, breadth-first strategies expand all the nodes at each level of the tree, select the node with the lowest objective function, and then proceed until the leaf nodes are reached. Here more storage is required, but generally fewer nodes are evaluated than in depth-first search. In practice, a combination of both strategies is commonly used: branch on the 0–1 dichotomy at each node (i.e., like breadth-first), but expand as in depth-first. Additional description of these strategies can be found in [237].

Example: To illustrate the branch-and-bound approach, we consider the MILP:

Min Z = x + y1 + 2 y2 + 3 y3
s.t. −x + 3 y1 + y2 + 2 y3 ≤ 0                            (94)
−4 y1 − 8 y2 − 3 y3 ≤ −10
x ≥ 0, y1, y2, y3 ∈ {0,1}

The solution to problem 94 is given by x = 4, y1 = 1, y2 = 1, y3 = 0, and Z = 7. Here we use a depth-first strategy and branch on the variables closest to zero or one.

Figure 45. Branch and bound sequence for MILP example

Figure 45 shows the progress of the branch-and-bound algorithm as the binary variables are selected and the bounds are updated. The sequence numbers for each node in Figure 45 show the order in which they are processed. The grayed partitions correspond to the deleted nodes; at termination of the algorithm we see that Z = 7, and an integer solution is obtained at an intermediate node where coincidentally y3 = 0.

Mixed integer linear programming (MILP) methods and codes have been available and applied to many practical problems for more than twenty years (e.g., [237]). The LP-based branch-and-bound method [231] has been implemented in powerful codes such as OSL, CPLEX, and XPRESS. Recent trends in MILP include the development of branch-and-price [238] and branch-and-cut methods such as the lift-and-project method [228], in which cutting planes are generated as part of the branch-and-bound enumeration. See also [235] for a recent review on MILP.
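The depth-first search on problem 94 can be reproduced with a small branch-and-bound sketch. To stay self-contained, the sketch below does not call an LP solver: the bound at each node is a simple combinatorial relaxation (free binaries set to their cheapest values), an assumption that replaces the LP relaxation used in practice, while the fathoming logic is the same.

```python
# Problem 94:  min Z = x + y1 + 2*y2 + 3*y3
#   s.t. x >= 3*y1 + y2 + 2*y3   (so the optimal x is exactly that sum)
#        4*y1 + 8*y2 + 3*y3 >= 10,  x >= 0,  y binary
C = (1.0, 2.0, 3.0)   # objective coefficients of y1, y2, y3
D = (3.0, 1.0, 2.0)   # x >= D . y   (first constraint of 94)
A = (4.0, 8.0, 3.0)   # A . y >= 10  (second constraint of 94)

best = [float("inf"), None, None]       # incumbent: Z*, y*, x*

def bnb(fixed):
    k = len(fixed)
    if k == 3:                          # leaf node: all binaries fixed
        if sum(a*y for a, y in zip(A, fixed)) >= 10:      # feasibility
            x = sum(d*y for d, y in zip(D, fixed))        # optimal x for this y
            z = x + sum(c*y for c, y in zip(C, fixed))
            if z < best[0]:
                best[:] = [z, tuple(fixed), x]
        return
    # lower bound: free binaries at 0 (each y only increases the objective,
    # since C, D >= 0 and x grows with y)
    lb = sum((c + d)*y for c, d, y in zip(C, D, fixed))
    # infeasibility test: even with all free binaries at 1, can A.y reach 10?
    cover = sum(a*y for a, y in zip(A, fixed)) + sum(A[k:])
    if lb >= best[0] or cover < 10:
        return                          # fathom this node
    for y in (1, 0):                    # depth-first: try y = 1 first
        bnb(fixed + [y])

bnb([])
print(best)   # [7.0, (1, 1, 0), 4.0]
```

As in Figure 45, the subtree with y1 = 1, y2 = 0 is fathomed by infeasibility, and the incumbent Z = 7 prunes the remaining nodes.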
A description of several MILP solvers can also be found in the NEOS Software Guide.

Mixed Integer Nonlinear Programming. The most basic form of an MINLP problem, when represented in algebraic form, is as follows:

min z = f(x, y)
s.t. gj(x, y) ≤ 0, j ∈ J                                  (95)
x ∈ X, y ∈ Y

where f(·), g(·) are convex, differentiable functions, J is the index set of inequalities, and x and y are the continuous and discrete variables, respectively. The set X is commonly assumed to be a convex compact set, e.g., X = {x | x ∈ R^n, Dx ≤ d, x_L ≤ x ≤ x_U}; the discrete set Y corresponds to a polyhedral set of integer points, Y = {y | y ∈ Z^m, Ay ≤ a}, which in most applications is restricted to 0–1 values, y ∈ {0,1}^m. In most applications of interest the objective and constraint functions f(·), g(·) are linear in y (e.g., fixed cost charges and mixed-logic constraints): f(x, y) = c^T y + r(x), g(x, y) = By + h(x).

Methods that have addressed the solution of problem 95 include the branch-and-bound method (BB) [239 – 243], generalized Benders decomposition (GBD) [244], outer approximation (OA) [245 – 247], LP/NLP-based branch and bound [248], and the extended cutting plane method (ECP) [249].

There are three basic NLP subproblems that can be considered for problem 95:

a) NLP relaxation:

Min Z_LB^k = f(x, y)
s.t. gj(x, y) ≤ 0, j ∈ J
x ∈ X, y ∈ Y_R                                            (NLP1) (96)
yi ≤ αi^k, i ∈ I_FL^k
yi ≥ βi^k, i ∈ I_FU^k

where Y_R is the continuous relaxation of the set Y, and I_FL^k, I_FU^k are index subsets of the integer variables yi, i ∈ I, which are restricted to the lower and upper bounds αi^k, βi^k at the k-th step of a branch-and-bound enumeration procedure. Note that αi^k = ⌊yi^l⌋ and βi^k = ⌈yi^m⌉ (l < k, m < k), where yi^l, yi^m are noninteger values at a previous step, and ⌊·⌋, ⌈·⌉ are the floor and ceiling functions, respectively.

b) NLP subproblem for fixed y^k:

Min Z_U^k = f(x, y^k)
s.t. gj(x, y^k) ≤ 0, j ∈ J                                (97)
x ∈ X

which yields an upper bound Z_U^k to problem 95, provided problem 97 has a feasible solution. When this is not the case, we consider the next subproblem:

c) Feasibility subproblem for fixed y^k:

Min u
s.t. gj(x, y^k) ≤ u, j ∈ J                                (98)
x ∈ X, u ∈ R^1

which can be interpreted as the minimization of the infinity-norm measure of infeasibility of the corresponding NLP subproblem. Note that for an infeasible subproblem the solution of problem 98 yields a strictly positive value of the scalar variable u.
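The role of the feasibility subproblem 98 can be seen in a one-dimensional sketch. The constraints below are hypothetical: for some fixed y^k they reduce to x ≥ 1 and x ≤ 0.5, which are inconsistent, so minimizing the largest violation u yields a strictly positive value, the infinity-norm measure of infeasibility. Golden-section search stands in for an NLP solver.

```python
# Hypothetical constraints g_j(x, y^k) <= 0 after fixing y^k:
#   g1 = 1.0 - x  (i.e., x >= 1)   and   g2 = x - 0.5  (i.e., x <= 0.5)
# They are inconsistent, so subproblem 97 is infeasible for this y^k.
def worst_violation(x):
    # u(x) = max_j g_j(x, y^k): the quantity minimized in problem 98
    return max(1.0 - x, x - 0.5)

def golden_min(g, a, b, tol=1e-9):
    # golden-section search; valid here since g is convex (hence unimodal)
    r = 0.5*(5**0.5 - 1)
    c, d = b - r*(b - a), a + r*(b - a)
    while b - a > tol:
        if g(c) < g(d):
            b, d = d, c; c = b - r*(b - a)
        else:
            a, c = c, d; d = a + r*(b - a)
    x = 0.5*(a + b)
    return x, g(x)

x_star, u_star = golden_min(worst_violation, 0.0, 2.0)
print(x_star, u_star)   # u* = 0.25 > 0 flags this y^k as infeasible
```

The minimizer x* = 0.75 splits the violation evenly between the two constraints; u* = 0.25 > 0 signals that y^k must be discarded (or cut off, as discussed below).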

The convexity of the nonlinear functions is exploited by replacing them with supporting hyperplanes, which are generally, but not necessarily, derived at the solution of the NLP subproblems. In particular, the new values y^K (or (x^K, y^K)) are obtained from a cutting-plane MILP problem that is based on the K points (x^k, y^k), k = 1,...,K, generated at the K previous steps:

Min Z_L^K = α
s.t. α ≥ f(x^k, y^k) + ∇f(x^k, y^k)^T (x − x^k, y − y^k)
gj(x^k, y^k) + ∇gj(x^k, y^k)^T (x − x^k, y − y^k) ≤ 0, j ∈ J^k,  k = 1,...,K      (99)
x ∈ X, y ∈ Y

where J^k ⊆ J. When only a subset of linearizations is included, these commonly correspond to violated constraints in problem 95. Alternatively, it is possible to include all linearizations in problem 99. The solution of 99 yields a valid lower bound Z_L^k to problem 95. This bound is nondecreasing with the number of linearization points K. Note that since the functions f(x,y) and g(x,y) are convex, the linearizations in problem 99 correspond to outer approximations of the nonlinear feasible region in problem 95. A geometrical interpretation is shown in Figure 46, where it can be seen that the convex objective function is underestimated, and the convex feasible region overestimated, by these linearizations.

Figure 47. Major steps in the different algorithms

Algorithms. The different methods can be classified according to their use of the subproblems 96 – 98 and the specific specialization of the MILP problem 99, as seen in Figure 47. In the GBD and OA methods (case b), as well as in the LP/NLP-based branch-and-bound method (case d), problem 98 is solved if infeasible subproblems are found. Each of the methods is explained next in terms of the basic subproblems.

Branch and Bound. While the earlier work in branch and bound (BB) was aimed at linear problems [231], this method can also be applied to nonlinear problems [239 – 243]. The BB method starts by solving first the continuous NLP relaxation. If all discrete variables take integer values, the search is stopped. Otherwise, a tree search is performed in the space of the integer variables yi, i ∈ I. These are successively fixed at the corresponding nodes of the tree, giving rise to relaxed NLP subproblems of the form 96, which yield lower bounds for the subproblems in the descendant nodes. Fathoming of nodes occurs when the lower bound exceeds the current upper bound, when the subproblem is infeasible, or when all integer variables yi take on discrete values. The last-named condition yields an upper bound to the original problem.

The BB method is generally only attractive if the NLP subproblems are relatively inexpensive to solve, or when only a few of them need to be solved. This could be either because of the low dimensionality of the discrete variables, or because the integrality gap of the continuous NLP relaxation of 95 is small.

Outer Approximation [245 – 247]. The OA method arises when NLP subproblems 97 and MILP master problems 99 with J^k = J are solved successively in a cycle of iterations to generate the points (x^k, y^k). Since the master problem 99 requires the solution of all feasible discrete variables y^k, the following MILP relaxation is considered, assuming that the solution of K different NLP subproblems (K = |KFS ∪ KIS|) is available, where KFS is the set of solutions from problem 97 and KIS the set of solutions from problem 98:

Min Z_L^K = α
s.t. α ≥ f(x^k, y^k) + ∇f(x^k, y^k)^T (x − x^k, y − y^k)
gj(x^k, y^k) + ∇gj(x^k, y^k)^T (x − x^k, y − y^k) ≤ 0, j ∈ J,  k = 1,...,K        (100)
x ∈ X, y ∈ Y

Given the assumption on convexity of the functions f(x,y) and g(x,y), it can be proved that the solution of problem 100, Z_L^K, corresponds to a lower bound of the solution of problem 95. Note that this property can be verified in Figure 46. Also, since function linearizations are accumulated as iterations proceed, the master problems 100 yield a nondecreasing sequence of lower bounds, Z_L^1 ≤ ... ≤ Z_L^k ≤ ... ≤ Z_L^K.

The OA algorithm as proposed by Duran and Grossmann consists of performing a cycle of major iterations, k = 1,...,K, in which problem 97 is solved for the corresponding y^k, and the relaxed MILP master problem 100 is updated and solved with the corresponding function linearizations at the point (x^k, y^k) for which the corresponding subproblem 97 was solved. The cycle is initialized with the relaxed problem 96. If feasible, its solution is used to construct the first MILP master problem; otherwise a feasibility problem 98 is solved to generate the corresponding continuous point [247]. The initial MILP master problem 100 then generates a new vector of discrete variables. The subproblems 97 yield an upper bound that is used to define the best current solution, UB^K = min_k Z_U^k. The cycle of iterations is continued until this upper bound and the lower bound of the relaxed master problem, Z_L^K, are within a specified tolerance.

One way to avoid solving the feasibility problem 98 in the OA algorithm when the discrete variables in problem 95 are 0–1 is to introduce the following integer cut, whose objective is to make infeasible the choice of the previous 0–1 values generated at the K previous iterations [245]:

Σ_{i∈B^k} yi − Σ_{i∈N^k} yi ≤ |B^k| − 1, k = 1,...,K      (101)

where B^k = {i | yi^k = 1}, N^k = {i | yi^k = 0}, k = 1,...,K. This cut becomes very weak as the dimensionality of the 0–1 variables increases. However, it has the useful feature of ensuring that new 0–1 values are generated at each major iteration. In this way the algorithm will not return to a previous integer point when convergence is achieved. Using the above integer cut, termination takes place as soon as Z_L^K ≥ UB^K.

The OA algorithm trivially converges in one iteration if f(x,y) and g(x,y) are linear. This property simply follows from the fact that if f(x,y) and g(x,y) are linear in x and y, the MILP master problem 100 is identical to the original problem 95. It is also important to note that the MILP master problem need not be solved to optimality.

Generalized Benders Decomposition (GBD) [244]. The GBD method [250] is similar to the outer-approximation method. The difference arises in the definition of the MILP master problem 99. In the GBD method only active inequalities are considered, J^k = {j | gj(x^k, y^k) = 0}, and the set x ∈ X is disregarded. In particular, consider an outer approximation given at a point (x^k, y^k):

α ≥ f(x^k, y^k) + ∇f(x^k, y^k)^T (x − x^k, y − y^k)
g(x^k, y^k) + ∇g(x^k, y^k)^T (x − x^k, y − y^k) ≤ 0       (102)

where for a fixed y^k the point x^k corresponds to the optimal solution of problem 97. Making use of the Karush – Kuhn – Tucker conditions and eliminating the continuous variables x, the inequalities in 102 can be reduced as follows [248]:

α ≥ f(x^k, y^k) + ∇_y f(x^k, y^k)^T (y − y^k) + (μ^k)^T [g(x^k, y^k) + ∇_y g(x^k, y^k)^T (y − y^k)]      (103)

which is the Lagrangian cut projected in the y-space. This can be interpreted as a surrogate constraint of the equations in 102, because it is obtained as a linear combination of these.
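The effect of the integer cut 101 can be checked mechanically: for a visited binary vector y^k with B^k = {i : yi^k = 1} and N^k = {i : yi^k = 0}, the cut excludes exactly that vector and no other. A brute-force sketch over all 0–1 vectors of length four:

```python
from itertools import product

def integer_cut(y_prev):
    # returns a test of:  sum_{i in B} y_i - sum_{i in N} y_i <= |B| - 1
    B = [i for i, v in enumerate(y_prev) if v == 1]
    N = [i for i, v in enumerate(y_prev) if v == 0]
    def satisfied(y):
        return sum(y[i] for i in B) - sum(y[i] for i in N) <= len(B) - 1
    return satisfied

y_k = (1, 0, 1, 1)                 # a previously visited point (hypothetical)
cut = integer_cut(y_k)
excluded = [y for y in product((0, 1), repeat=4) if not cut(y)]
print(excluded)                    # [(1, 0, 1, 1)] -- only y_k is cut off
```

Any other vector either drops one of the ones in B^k or adds a one in N^k, and in both cases the left-hand side falls to |B^k| − 1 or less, which is why the cut becomes weak in high dimension: it removes a single point per iteration.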

For the case when there is no feasible solution to problem 97, if the point x^k is obtained from the feasibility subproblem 98, the following feasibility cut projected in y can be obtained using a similar procedure:

(λ^k)^T [g(x^k, y^k) + ∇_y g(x^k, y^k)^T (y − y^k)] ≤ 0    (104)

In this way, problem 99 reduces to a problem projected in the y-space:

Min Z_L^K = α
s.t. α ≥ f(x^k, y^k) + ∇_y f(x^k, y^k)^T (y − y^k) + (μ^k)^T [g(x^k, y^k) + ∇_y g(x^k, y^k)^T (y − y^k)], k ∈ KFS
(λ^k)^T [g(x^k, y^k) + ∇_y g(x^k, y^k)^T (y − y^k)] ≤ 0, k ∈ KIS      (105)
y ∈ Y, α ∈ R^1

where KFS is the set of feasible subproblems 97, and KIS the set of infeasible subproblems whose solution is given by problem 98. Also |KFS ∪ KIS| = K. Since master problem 105 can be derived from master problem 100, in the context of problem 95 GBD can be regarded as a particular case of the outer-approximation algorithm. In fact, one can prove that, given the same set of K subproblems, the lower bound predicted by the relaxed master problem 100 is greater than or equal to that predicted by the relaxed master problem 105 [245]. This proof follows from the fact that the Lagrangian and feasibility cuts 103 and 104 are surrogates of the outer approximations 102.

If problem 95 has zero integrality gap, the GBD algorithm converges in one iteration once the optimal (x*, y*) is found [252]. This property implies that the only case in which one can expect the GBD method to terminate in one iteration is that in which the initial discrete vector is the optimum, and when the objective value of the NLP relaxation of problem 95 is the same as the objective of the optimal mixed-integer solution.

Extended Cutting Plane (ECP) [249]. The ECP method, which is an extension of Kelley's cutting-plane algorithm for convex NLP [253], does not rely on the use of NLP subproblems and algorithms. It relies only on the iterative solution of problem 99 by successively adding a linearization of the most violated constraint at the predicted point (x^k, y^k):

J^k = { ĵ | ĵ ∈ arg max_{j∈J} gj(x^k, y^k) }

Convergence is achieved when the maximum constraint violation lies within the specified tolerance. The optimal objective value of problem 99 yields a nondecreasing sequence of lower bounds. It is of course also possible to add to problem 99 either linearizations of all the violated constraints in the set J^k, or linearizations of all the nonlinear constraints j ∈ J. In the ECP method the objective must be defined as a linear function, which can easily be accomplished by introducing a new variable to transfer nonlinearities in the objective as an inequality.
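The cutting-plane iteration that underlies the ECP method is Kelley's algorithm: minimize a linear objective α subject to an accumulating set of linearizations, adding at each step the cut at the new predicted point. The one-dimensional sketch below is hypothetical (f(x) = x^2 on an assumed box), recast in the epigraph form min α, α ≥ f(x) mentioned above; a coarse grid search stands in for the LP/MILP master solver.

```python
def f(x):  return x*x           # convex function being outer-approximated
def df(x): return 2.0*x

a, b = -1.0, 2.0                # assumed bounds on the single variable
cuts = []                       # cut k: alpha >= f(x_k) + f'(x_k)*(x - x_k)
x_k, gap = b, float("inf")      # arbitrary starting point
for _ in range(100):
    cuts.append((f(x_k), df(x_k), x_k))
    # master problem: minimize the piecewise-linear lower envelope of the
    # accumulated cuts over [a, b]; a grid replaces the LP solver here
    def envelope(x):
        return max(fk + gk*(x - xk) for fk, gk, xk in cuts)
    grid = [a + (b - a)*i/2000 for i in range(2001)]
    x_k = min(grid, key=envelope)
    alpha = envelope(x_k)       # valid, nondecreasing lower bound
    gap = f(x_k) - alpha        # maximum violation at the predicted point
    if gap <= 1e-5:             # ECP-style termination test
        break

print(x_k, f(x_k), gap)
```

Each new cut tightens the lower envelope, so the lower bounds α are nondecreasing, and the method stops when the worst violation at the predicted point falls below the tolerance; no NLP subproblem is ever solved, which is the distinguishing feature of ECP.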
Given the fact that the lower bounds of GBD are generally weaker, the GBD method commonly requires a larger number of cycles or major iterations. As the number of 0–1 variables increases, this difference becomes more pronounced. This is to be expected, since only one new cut is generated per iteration. Therefore, user-supplied constraints must often be added to the master problem to strengthen the bounds. Also, it is sometimes possible to generate multiple cuts from the solution of an NLP subproblem in order to strengthen the lower bound [251]. As for the OA algorithm, the trade-off is that while it generally predicts stronger lower bounds than GBD, the computational cost for solving the master problem (M-OA) is greater, since the number of constraints added per iteration is equal to the number of nonlinear constraints plus the nonlinear objective.

Note that since the discrete and continuous variables are converged simultaneously, the ECP method may require a large number of iterations. However, this method shares with the OA method the property of trivial convergence in the limiting case when all the functions are linear.

LP/NLP-Based Branch and Bound [248]. This method is similar in spirit to a branch-and-cut method, and avoids the complete solution of the MILP master problem (M-OA) at each major iteration. The method starts by solving an initial NLP subproblem, which is linearized as in (M-OA). The basic idea then consists of performing an LP-based branch-and-bound method for (M-OA), in which NLP subproblems 97 are solved at those nodes in which feasible integer solutions are found. By updating the representation of the master problem in the current open nodes of the tree with the addition of the corresponding linearizations, the need to restart the tree search is avoided. This method can also be applied to the GBD and ECP methods.

The LP/NLP method commonly reduces quite significantly the number of nodes to be enumerated. The trade-off, however, is that the number of NLP subproblems may increase. Computational experience has indicated that often the number of NLP subproblems remains unchanged. Therefore, this method is better suited for problems in which the bottleneck corresponds to the solution of the MILP master problem. Leyffer [254] has reported substantial savings with this method.

Example: To illustrate these methods, we consider the following MINLP problem, whose objective function and constraints contain nonlinear convex terms:

Min Z = y1 + 1.5 y2 + 0.5 y3 + x1^2 + x2^2
s.t. (x1 − 2)^2 − x2 ≤ 0
x1 − 2 y1 ≥ 0
x1 − x2 − 4 (1 − y2) ≥ 0
x2 − y2 ≥ 0                                               (106)
x1 + x2 ≥ 3 y3
y1 + y2 + y3 ≥ 1
0 ≤ x1 ≤ 4, 0 ≤ x2 ≤ 4
y1, y2, y3 ∈ {0, 1}

The optimal solution of this problem is given by y1 = 0, y2 = 1, y3 = 0, x1 = 1, x2 = 1, Z = 3.5. Figure 48 shows the progress of the iterations with the OA and GBD methods, while the table lists the number of NLP and MILP subproblems that are solved with each of the methods. For the case of the MILP problems, the total number of LPs solved at the branch-and-bound nodes is also reported.

Figure 48. Progress of iterations with OA and GBD methods, and number of subproblems for the BB, OA, GBD, and ECP methods

Extensions of MINLP Methods. Extensions of the methods described above include a quadratic approximation to (RM-OAF) [247] using an approximation of the Hessian matrix. The quadratic approximations can help to reduce the number of major iterations, since an improved representation of the continuous space is obtained. This, however, comes at the price of having to solve an MIQP instead of an MILP at each iteration.

The master problem 100 can involve a rather large number of constraints, due to the accumulation of linearizations. One option is to keep only the last linearization point, but this can lead to nonconvergence even in convex problems, since then the monotonic increase of the lower bound is not guaranteed. As shown in [248], linear approximations to the nonlinear objective and constraints can be aggregated in an MILP master problem that is a hybrid of the GBD and OA methods.

For the case when linear equalities of the form h(x, y) = 0 are added to problem 95 there is no major difficulty, since these are invariant to the linearization points. If the equations are nonlinear, however, there are two difficulties. First, it is not possible to enforce the linearized equalities at K points. Second, the nonlinear equations may generally introduce nonconvexities, unless they relax as convex inequalities [255]. Kocis and Grossmann [256] proposed an equality relaxation strategy in which the nonlinear equalities are replaced by the inequalities

T^k ∇h(x^k, y^k)^T (x − x^k, y − y^k) ≤ 0                 (107)

where T^k = {t_ii^k} is a diagonal matrix with t_ii^k = sign(λ_i^k), in which λ_i^k is the multiplier associated with the equation h_i(x, y) = 0. Note that if these equations relax as the inequalities h(x, y) ≤ 0 for all y, and h(x, y) is convex, this is a rigorous procedure. Otherwise, nonvalid supports may be generated. Also, in the master problem 105 of GBD, no special provision is required to handle equations, since these are simply included in the Lagrangian cuts. However, similar difficulties as in OA arise if the equations do not relax as convex inequalities.

When f(x,y) and g(x,y) are nonconvex in problem 95, or when nonlinear equalities h(x, y) = 0 are present, two difficulties arise. First, the NLP subproblems 96 – 98 may not have a unique local optimum solution. Second, the master problem (M-MIP) and its variants (e.g., M-MIPF, M-GBD, M-MIQP) do not guarantee a valid lower bound Z_L^K or a valid bounding representation with which the global optimum may be cut off.

Rigorous global optimization approaches for addressing nonconvexities in MINLP problems can be developed when special structures are assumed in the continuous terms (e.g., bilinear, linear fractional, concave separable). Specifically, the idea is to use convex envelopes or underestimators to formulate lower-bounding convex MINLP problems. These are then combined with global optimization techniques for continuous variables [206, 207, 209, 214, 216, 222, 257], which usually take the form of spatial branch-and-bound methods. The lower-bounding MINLP problem has the general form:

Min Z = f̄(x, y)
s.t. ḡj(x, y) ≤ 0, j ∈ J                                  (108)
x ∈ X, y ∈ Y

where f̄, ḡ are valid convex underestimators such that f̄(x, y) ≤ f(x, y) and the inequalities ḡ(x, y) ≤ 0 are satisfied if g(x, y) ≤ 0. A typical example of convex underestimators are the convex envelopes for bilinear terms [208].

Examples of global optimization methods for MINLP problems include the branch-and-reduce method [209, 210], the α-BB method [212], the reformulation/spatial branch-and-bound search method [258], the branch-and-cut method [259], and the disjunctive branch-and-bound method [260]. All these methods rely on a branch-and-bound procedure. The difference lies in how to perform the branching on the discrete and continuous variables. Some methods perform the spatial tree enumeration on both the discrete and continuous variables of problem 108. Other methods perform a spatial branch and bound on the continuous variables and solve the corresponding MINLP problem 108 at each node using any of the methods reviewed above. Finally, other methods branch on the discrete variables of problem 108, and switch to a spatial branch and bound on nodes where a feasible value for the discrete variables is found. The methods also rely on procedures for tightening the lower and upper bounds of the variables, since these have a great effect on the quality of the underestimators. Since the tree searches are not finite (except for ε convergence), these methods can be computationally expensive. However, their major advantage is that they can rigorously find the global optimum. Specific cases of nonconvex MINLP problems have also been handled. An example is the work of Pörn and Westerlund [261], who addressed the solution of MINLP problems with pseudoconvex objective function and convex inequalities through an extension of the ECP method.

The other option for handling nonconvexities is to apply a heuristic strategy to try to reduce as much as possible the effect of nonconvexities. While not rigorous, this requires much less computational effort. We describe here an approach for reducing the effect of nonconvexities at the level of the MILP master problem. Viswanathan and Grossmann [262] proposed to introduce slacks in the MILP master problem to reduce the likelihood of cutting off feasible solutions. This master problem (augmented penalty/equality relaxation) has the form:

Min Z^K = α + Σ_{k=1}^{K} (w_p^k p^k + w_q^k q^k)
s.t. α ≥ f(x^k, y^k) + ∇f(x^k, y^k)^T (x − x^k, y − y^k)
T^k ∇h(x^k, y^k)^T (x − x^k, y − y^k) ≤ p^k
g(x^k, y^k) + ∇g(x^k, y^k)^T (x − x^k, y − y^k) ≤ q^k,  k = 1,...,K      (109)
Σ_{i∈B^k} yi − Σ_{i∈N^k} yi ≤ |B^k| − 1, k = 1,...,K
x ∈ X, y ∈ Y, α ∈ R^1, p^k, q^k ≥ 0

where w_p^k, w_q^k are weights that are chosen sufficiently large (e.g., 1000 times the magnitude of the Lagrange multiplier). Note that if the functions are convex, then the MILP master problem 109 predicts rigorous lower bounds to problem 95, since all the slacks are set to zero.

Computer Codes for MINLP. Computer codes for solving MINLP problems include the following. The program DICOPT [262] is an MINLP solver that is available in the modeling system GAMS [263]. The code is based on the master problem 109 and the NLP subproblems 97. This code also uses the relaxed problem 96 to generate the first linearization for the above master problem, with which the user need not specify an initial integer value. Also, since bounding properties of problem 109 cannot be guaranteed, the search for nonconvex problems is terminated when there is no further improvement in the feasible NLP subproblems. This is a heuristic that works reasonably well in many problems. Codes that implement the branch-and-bound method using subproblems 96 include the code MINLP_BB, which is based on an SQP algorithm [243] and is available in AMPL; the code BARON [264], which also implements global optimization capabilities; and the code SBB, which is available in GAMS [263]. The code α-ECP implements the extended cutting-plane method [249], including the extension by Pörn and Westerlund [261]. Finally, the code MINOPT [265] also implements the OA and GBD methods, and applies them to mixed-integer dynamic optimization problems. It is difficult to derive general conclusions on the efficiency and reliability of all these codes and their corresponding methods, since no systematic comparison has been made. However, one might anticipate that branch-and-bound codes are likely to perform better if the relaxation of the MINLP is tight. Decomposition methods based on OA are likely to perform better if the NLP subproblems are relatively expensive to solve, while GBD can perform with some efficiency if the MINLP is tight and there are many discrete variables. ECP methods tend to perform well on mostly linear problems.

Logic-Based Optimization. Given difficulties in the modeling and solution of mixed integer problems, the following major approaches based on logic-based techniques have emerged: generalized disjunctive programming 110 [266], mixed-logic linear programming (MLLP) [267], and constraint programming (CP) [268]. The motivation for this logic-based modeling has been to facilitate the modeling, reduce the combinatorial search effort, and improve the handling of nonlinearities. In this section we mostly concentrate on generalized disjunctive programming and provide a brief reference to constraint programming. A general review of logic-based optimization can be found in [269, 309].

Generalized disjunctive programming 110 [266] is an extension of disjunctive programming [270] that provides an alternative way of modeling MILP and MINLP problems. The general formulation 110 is as follows:

Min Z = Σ_{k∈K} c_k + f(x)
s.t. g(x) ≤ 0
∨_{j∈J_k} [ Y_jk;  h_jk(x) ≤ 0;  c_k = γ_jk ],  k ∈ K      (110)
Ω(Y) = True
x ∈ R^n, c ∈ R^m, Y ∈ {true, false}^m

where Y_jk are the Boolean variables that decide whether a term j in a disjunction k ∈ K is true or false, and x are continuous variables. The objective function involves the term f(x) for the continuous variables and the charges c_k that depend on the discrete choices in each disjunction k ∈ K. The constraints g(x) ≤ 0 hold regardless of the discrete choice, and h_jk(x) ≤ 0 are conditional constraints that hold when Y_jk is true in the j-th term of the k-th disjunction.
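The semantics of the disjunctive model 110 can be illustrated with a small sketch: for a given assignment of the Boolean variables (one true term per disjunction), only the constraints of the selected terms are enforced and only their fixed charges γ_jk are paid. The two-disjunction instance below is hypothetical, and brute-force enumeration stands in for an optimizer.

```python
# A tiny hypothetical GDP instance in the form of problem 110:
#   min  c1 + c2 + f(x),  f(x) = x**2,  global constraint g(x) = x - 10 <= 0
#   disjunction 1: [Y11: x >= 2, gamma = 1.0]  v  [Y21: x <= 1, gamma = 3.0]
#   disjunction 2: [Y12: x >= 4, gamma = 0.5]  v  [Y22: x >= 0, gamma = 2.0]
disjunctions = [
    [(lambda x: 2.0 - x, 1.0), (lambda x: x - 1.0, 3.0)],
    [(lambda x: 4.0 - x, 0.5), (lambda x: -x, 2.0)],
]

def evaluate(x, selection):
    """Objective of 110 for a choice of true terms, or None if infeasible."""
    if x - 10.0 > 1e-9:                   # global constraint g(x) <= 0
        return None
    total = x**2                          # f(x)
    for terms, j in zip(disjunctions, selection):
        h, gamma = terms[j]
        if h(x) > 1e-9:                   # conditional constraint h_jk(x) <= 0
            return None
        total += gamma                    # fixed charge c_k = gamma_jk
    return total

# enumerate term choices and a coarse grid on x (stand-in for an optimizer)
best = min(
    (z, x, sel)
    for sel in [(0, 0), (0, 1), (1, 0), (1, 1)]
    for x in [i * 0.01 for i in range(1001)]
    if (z := evaluate(x, sel)) is not None
)
print(best)   # (5.0, 0.0, (1, 1)): choose x <= 1 and x >= 0, pay 3.0 + 2.0
```

Note how the combination (1, 0) is infeasible for every x (it demands x ≤ 1 and x ≥ 4 simultaneously), which is exactly the kind of inference that logic-based methods try to exploit before any relaxation is solved.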
The cost variables ck correspond to the fixed charges, and are equal However, one might anticipate that branch-and- γ Ω bound codes are likely to perform better if the to jk if the Boolean variable Y jk is true. (Y) relaxation of the MINLP is tight. Decomposi- are logical relations for the Boolean variables tion methods based on OA are likely to perform expressed as propositional logic. better if the NLP subproblems are relatively ex- Problem 110 can be reformulated as an pensive to solve, while GBD can perform with MINLP problem by replacing the Boolean vari- some efficiency if the MINLP is tight and there ables by binary variables yjk , are many discrete variables. ECP methods tend to perform well on mostly linear problems. 120 Mathematics in Chemical Engineering   Z= γjkyjk+f (x) Min k∈K j∈Jk λjk are the weight factors that determine the fea- s.t. g (x) ≤0 sibility of the disjunctive term. Note that when λjk is 1, then the j-th term in the k-th disjunc- hjk (x) ≤Mjk 1−yjk ,j∈Jk,k∈K (BM) tion is enforced and the other terms are ignored.  (111) jk y =1,k∈K The constraints λjkhjk v /λjk are convex if j∈Jk jk hjk (x) is convex [274, p. 160]. A formal proof Ay≤a can be found in [242]. Note that the convex hull U 0≤x≤x ,yjk∈{0,1} ,j∈Jk,k∈K 113 reduces to the result by Balas [275] if the constraints are linear. Based on the convex hull where the disjunctions are replaced by “Big- relaxation 113, Lee and Grossmann [273] pro- M” constraints which involve a parameter Mjk posed the following convex relaxation program and binary variables yjk . The propositional logic Ω of problem 110. statements (Y) = True are replaced by the lin-   ≤ ZL= γ λ +f (x) ear constraints Ay a [271] and [272]. Here we Min k∈K j∈Jk jk jk assume that x is a nonnegative variable with U s.t. g (x) ≤0 finite upper bound x . An important issue in   x= vjk, λ =1,k∈K ( ) model 111 is how to specify a valid value for the j∈Jk j∈Jk jk CRP Big-M parameter Mjk . 
If the value is too small, jk U 0≤x, v ≤x , 0≤λjk≤1,j∈Jk,k∈K (114) then feasible points may be cut off. If Mjk is too jk large, then the continuous relaxation might be λjkhjk v /λjk ≤0,j∈Jk,k∈K too loose and yield poor lower bounds. There- Aλ≤a fore, finding the smallest valid value for Mjk jk U is desired. For linear constraints, one can use 0≤x, v ≤x , 0≤λjk≤1,j∈Jk,k∈K the upper and lower bound of the variable x to U calculate the maximum value of each constraint, where x is a valid upper bound for x and v. Note which then can be used to calculate a valid value that the number of constraints and variables in- creases in problem 114 compared with problem of Mjk . For nonlinear constraints one can in prin- ciple maximize each constraint over the feasible 110. Problem 114 has a unique optimal solution region, which is a nontrivial calculation. and it yields a valid lower bound to the optimal solution of problem 110 [273]. Grossmann and Lee and Grossmann [273] have derived the convex hull relaxation of problem 110. The ba- Lee [276] proved that problem 114 has the use- sic idea is as follows. Consider a disjunction k ful property that the lower bound is greater than ∈ K that has convex constraints or equal to the lower bound predicted from the   relaxation of problem 111. Yjk Further description of algorithms for disjunc-   ∨  hjk (x) ≤0  tive programming can be found in [277]. j∈Jk c=γjk (112) 0≤x≤xU ,c≥0 Constraint Programming. Constraint pro- gramming (CP) [268, 269] is a relatively new where hjk (x) are assumed to be convex and modeling and solution paradigm that was orig- bounded over x. The convex hull relaxation of inally developed to solve feasibility problems, disjunction 112 [242] is given as follows: but it has been extended to solve optimiza-   jk tion problems as well. 
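For the linear case, the recipe above for the smallest valid M_jk can be made concrete. If a conditional constraint has the form aᵀx − b ≤ 0 and x lies in the box x^L ≤ x ≤ x^U, the tightest valid Big-M is the interval-arithmetic maximum of the left-hand side over the box. A minimal sketch (the coefficient data below are hypothetical):

```python
def big_m(a, b, x_lo, x_hi):
    """Tightest valid Big-M for a'x - b <= 0 over the box x_lo <= x <= x_hi.

    Each term a_i * x_i is maximized independently: take x_i at its upper
    bound when a_i > 0 and at its lower bound otherwise.
    """
    return sum(ai * (hi if ai > 0 else lo)
               for ai, lo, hi in zip(a, x_lo, x_hi)) - b

# Hypothetical constraint x1 - 2*x2 - 3 <= 0 with 0 <= x1 <= 4, 0 <= x2 <= 5:
M = big_m([1.0, -2.0], 3.0, [0.0, 0.0], [4.0, 5.0])   # max of LHS = 1*4 - 2*0 - 3
print(M)   # 1.0
```

With this value, the constraint h_jk(x) ≤ M_jk(1 − y_jk) is redundant whenever y_jk = 0 but cannot cut off any feasible point, which is exactly the requirement stated in the text.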
Constraint programming is very expressive, as continuous, integer, and Boolean variables are permitted and, moreover, variables can be indexed by other variables. Constraints can be expressed in algebraic form (e.g., h(x) ≤ 0), as disjunctions (e.g., [A₁x ≤ b₁] ∨ [A₂x ≤ b₂]), or as conditional logic statements (e.g., if g(x) ≤ 0 then r(x) ≤ 0). In addition, the language can support special implicit functions such as the all-different (x1, x2, ..., xn) constraint for assigning different values to the integer variables x1, x2, ..., xn. The language consists of C++ procedures, although the recent trend has been to provide higher level languages such as OPL. Other commercial CP software packages include ILOG Solver [278], CHIP [279], and ECLiPSe [280].

10.6. Dynamic Optimization

Interest in dynamic simulation and optimization of chemical processes has increased significantly during the last two decades. Chemical processes are modeled dynamically using differential-algebraic equations (DAEs), consisting of differential equations that describe the dynamic behavior of the system, such as mass and energy balances, and algebraic equations that ensure physical and thermodynamic relations. Typical applications include control and scheduling of batch processes; startup, upset, shut-down, and transient analysis; safety studies; and the evaluation of control schemes. We state a general differential-algebraic optimization problem 115 as follows:

$$\begin{aligned} \min\ & \Phi\left(z(t_f);y(t_f);u(t_f);t_f;p\right)\\ \text{s.t.}\ & F\left(\mathrm{d}z/\mathrm{d}t;z(t);y(t);u(t);t;p\right)=0,\quad z(0)=z_0\\ & G_s\left[z(t_s);y(t_s);u(t_s);t_s;p\right]=0\\ & z^L\le z(t)\le z^U\\ & y^L\le y(t)\le y^U\\ & u^L\le u(t)\le u^U\\ & p^L\le p\le p^U\\ & t_f^L\le t_f\le t_f^U \end{aligned} \tag{115}$$

where Φ is a scalar objective function at final time t_f; F are the DAE constraints; G_s are additional point conditions at times t_s; z(t) are differential state profile vectors; y(t) are algebraic state profile vectors; u(t) are control profile vectors; and p is a time-independent parameter vector.

We assume, without loss of generality, that the index of the DAE system is one, consistent initial conditions are available, and the objective function is in the above Mayer form. Otherwise, it is easy to reformulate problems to this form. Problem 115 can be solved either by the variational approach or by applying some level of discretization that converts the original continuous time problem into a discrete problem. Early solution strategies, known as indirect methods, were focused on solving the classical variational conditions for optimality. On the other hand, methods that discretize the original continuous time formulation can be divided into two categories, according to the level of discretization. Here we distinguish between the methods that discretize only the control profiles (partial discretization) and those that discretize the state and control profiles (full discretization). Basically, the partially discretized problem can be solved either by dynamic programming or by applying a nonlinear programming (NLP) strategy (direct-sequential). A basic characteristic of these methods is that a feasible solution of the DAE system, for given control values, is obtained by integration at every iteration of the NLP solver. The main advantage of these approaches is that, for the NLP solver, they generate smaller discrete problems than full discretization methods.

Methods that fully discretize the continuous time problem also apply NLP strategies to solve the discrete system and are known as direct-simultaneous methods. These methods can use different NLP and discretization techniques, but the basic characteristic is that they solve the DAE system only once, at the optimum. In addition, they have better stability properties than partial discretization methods, especially in the presence of unstable dynamic modes. On the other hand, the discretized optimization problem is larger and requires large-scale NLP solvers, such as SOCS, CONOPT, or IPOPT.

With this classification we take into account the degree of discretization used by the different methods. Below we briefly present the description of the variational methods, followed by methods that partially discretize the dynamic optimization problem, and finally we consider full discretization methods for problem 115.

Variational Methods. These methods are based on the solution of the first-order necessary conditions for optimality that are obtained from Pontryagin's maximum principle [281, 282]. If we consider a version of problem 115 without bounds, the optimality conditions are formulated as a set of DAEs:

$$\frac{\partial F(z,y,u,p,t)}{\partial \dot z}\,\dot\lambda=\frac{\partial H}{\partial z}=\frac{\partial F(z,y,u,p,t)}{\partial z}\,\lambda \tag{116a}$$
$$F(z,y,u,p,t)=0 \tag{116b}$$
$$G_f\left(z,y,u,p,t_f\right)=0 \tag{116c}$$
$$G_s\left(z,y,u,p,t_s\right)=0 \tag{116d}$$
$$\frac{\partial H}{\partial y}=\frac{\partial F(z,y,u,p,t)}{\partial y}\,\lambda=0 \tag{116e}$$
$$\frac{\partial H}{\partial u}=\frac{\partial F(z,y,u,p,t)}{\partial u}\,\lambda=0 \tag{116f}$$
$$\int_0^{t_f}\frac{\partial F(z,y,u,p,t)}{\partial p}\,\lambda\,\mathrm{d}t=0 \tag{116g}$$

where the Hamiltonian H is a scalar function of the form H(t) = F(z,y,u,p,t)ᵀ λ(t) and λ(t) is a vector of adjoint variables. Boundary and jump conditions for the adjoint variables are given by:

$$\begin{aligned} & \frac{\partial F}{\partial \dot z}\,\lambda(t_f)+\frac{\partial\Phi}{\partial z}+\frac{\partial G_f}{\partial z}\,v_f=0\\ & \frac{\partial F}{\partial \dot z}\,\lambda\!\left(t_s^-\right)+\frac{\partial G_s}{\partial z}\,v_s=\frac{\partial F}{\partial \dot z}\,\lambda\!\left(t_s^+\right) \end{aligned} \tag{117}$$

where v_f, v_s are the multipliers associated with the final time and point constraints, respectively. The most expensive step lies in obtaining a solution to this boundary value problem. Normally, the state variables are given as initial conditions, and the adjoint variables as final conditions. This formulation leads to boundary value problems (BVPs) that can be solved by a number of standard methods including single shooting, invariant embedding, multiple shooting, or some discretization method such as collocation on finite elements or finite differences. Also, the point conditions lead to an additional calculation loop to determine the multipliers v_f and v_s. On the other hand, when bound constraints are considered, the above conditions are augmented with additional multipliers and associated complementarity conditions. Solving the resulting system leads to a combinatorial problem that is prohibitively expensive except for small problems.

Partial Discretization. With partial discretization methods (also called sequential methods or control vector parametrization), only the control variables are discretized. Given the initial conditions and a given set of control parameters, the DAE system is solved with a differential-algebraic equation solver at each iteration. This produces the value of the objective function, which is used by a nonlinear programming solver to find the optimal parameters in the control parametrization ν. The sequential method is reliable when the system contains only stable modes. If this is not the case, finding a feasible solution for a given set of control parameters can be very difficult. The time horizon is divided into time stages and at each stage the control variables are represented with a piecewise constant, a piecewise linear, or a polynomial approximation [283, 284]. A common practice is to represent the controls as a set of Lagrange interpolation polynomials.

For the NLP solver, gradients of the objective and constraint functions with respect to the control parameters can be calculated with the sensitivity equations of the DAE system, given by:

$$\frac{\partial F}{\partial \dot z}^{T}\dot s_k+\frac{\partial F}{\partial z}^{T}s_k+\frac{\partial F}{\partial y}^{T}w_k+\frac{\partial F}{\partial q_k}=0,\qquad s_k(0)=\frac{\partial z(0)}{\partial q_k},\qquad k=1,\dots,N_q \tag{118}$$

where s_k(t) = ∂z(t)/∂q_k, w_k(t) = ∂y(t)/∂q_k, and q = [pᵀ, νᵀ]ᵀ. As can be inferred from Equation 118, the cost of obtaining these sensitivities is directly proportional to N_q, the number of decision variables in the NLP. Alternately, gradients can be obtained by integration of the adjoint Equations 116a, 116e, 116g [282, 285, 286] at a cost independent of the number of input variables and proportional to the number of constraints in the NLP.

Methods that are based on this approach cannot treat the bounds on state variables directly, because the state variables are not included in the nonlinear programming problem. Instead, most of the techniques for dealing with inequality path constraints rely on defining a measure of the constraint violation over the entire horizon, and then penalizing it in the objective function, or forcing it directly to zero through an end-point constraint [287]. Other techniques approximate the constraint satisfaction (constraint aggregation methods) by introducing an exact penalty function [286, 288] or a Kreisselmeier–Steinhauser function [288] into the problem. Finally, initial value solvers that handle path constraints directly have been developed [284].
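The sequential (control vector parametrization) loop can be illustrated with a toy problem: a single ODE dz/dt = −z + u, a piecewise-constant control on four stages, and finite-difference gradients standing in for the sensitivity system 118. Everything here (the model, horizon, integrator, and step length) is an illustrative assumption, not a method prescribed by the text:

```python
def simulate(u_stages, z0=0.0, t_final=2.0, n_steps=200):
    """Integrate dz/dt = -z + u by forward Euler for a piecewise-constant
    control, and return the tracking objective  sum (z - 1)^2 * dt."""
    dt = t_final / n_steps
    per_stage = n_steps // len(u_stages)
    z, phi = z0, 0.0
    for i in range(n_steps):
        u = u_stages[min(i // per_stage, len(u_stages) - 1)]
        z += dt * (-z + u)            # the "DAE solver" step
        phi += dt * (z - 1.0) ** 2    # track the setpoint z = 1
    return phi

def gradient(u, h=1e-6):
    """Finite-difference surrogate for the sensitivity equations 118."""
    base = simulate(u)
    g = []
    for k in range(len(u)):
        up = list(u)
        up[k] += h
        g.append((simulate(up) - base) / h)
    return g

# Outer NLP loop (plain gradient descent, for clarity only):
u = [0.0, 0.0, 0.0, 0.0]
for _ in range(150):
    u = [uk - 1.0 * gk for uk, gk in zip(u, gradient(u))]

print(round(simulate(u), 3))   # objective drops far below its initial value of 2.0
```

A real implementation would replace Euler and finite differences with a DAE integrator plus the sensitivity system 118 (cost proportional to N_q) or the adjoint system (cost proportional to the number of constraints) — exactly the trade-off described above.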

The main idea is to use an algorithm for constrained dynamic simulation, so that any admissible combination of the control parameters produces an initial value problem that is feasible with respect to the path constraints. The algorithm proceeds by detecting activation and deactivation of the constraints during the solution, and solving the resulting high-index DAE system and its related sensitivities.

Full Discretization. Full discretization methods explicitly discretize all the variables of the DAE system and generate a large-scale nonlinear programming problem that is usually solved with a successive quadratic programming (SQP) algorithm. These methods follow a simultaneous approach (or infeasible path approach); that is, the DAE system is not solved at each iteration; it is only solved at the optimum point. Because of the size of the problem, special decomposition strategies are used to solve the NLP efficiently. Despite this characteristic, the simultaneous approach has advantages for problems with state variable (or path) constraints and for systems where instabilities occur for a range of inputs. In addition, the simultaneous approach can avoid intermediate solutions that may not exist, are difficult to obtain, or require excessive computational effort. There are mainly two different approaches to discretize the state variables explicitly, multiple shooting [289, 290] and collocation on finite elements [181, 191, 291].

With multiple shooting, time is discretized into P stages and control variables are parametrized using a finite set of control parameters in each stage, as with partial discretization. The DAE system is solved on each stage, i = 1, ..., P, and the values of the state variables z(t_i) are chosen as additional unknowns. In this way a set of relaxed, decoupled initial value problems (IVPs) is obtained:

$$\begin{aligned} & F\left(\mathrm{d}z/\mathrm{d}t;z(t);y(t);\nu_i;p\right)=0,\quad t\in[t_{i-1},t_i],\quad z(t_{i-1})=z_i\\ & z_{i+1}-z\left(t_i;z_i;\nu_i;p\right)=0,\quad i=1,\dots,P-1 \end{aligned} \tag{119}$$

Note that continuity among stages is treated through equality constraints, so that the final solution satisfies the DAE system. With this approach, inequality constraints for states and controls can be imposed directly at the grid points, but path constraints for the states may not be satisfied between grid points. This problem can be avoided by applying penalty techniques to enforce feasibility, like the ones used in the sequential methods.

The resulting NLP is solved using SQP-type methods, as described above. At each SQP iteration, the DAEs are integrated in each stage and objective and constraint gradients with respect to p, z_i, and ν_i are obtained using sensitivity equations, as in problem 118. Compared to sequential methods, the NLP contains many more variables, but efficient decompositions have been proposed [290] and many of these calculations can be performed in parallel.

In collocation methods, the continuous time problem is transformed into an NLP by approximating the profiles as a family of polynomials on finite elements. Various polynomial representations are used in the literature, including Lagrange interpolation polynomials for the differential and algebraic profiles [291]. In [191] a Hermite–Simpson collocation form is used, while Cuthrell and Biegler [292] and Tanartkit and Biegler [293] use a monomial basis for the differential profiles. All of these representations stem from implicit Runge–Kutta formulae, and the monomial representation is recommended because of smaller condition numbers and smaller rounding errors. Control and algebraic profiles, on the other hand, are approximated using Lagrange polynomials.

Discretizations of problem 115 using collocation formulations lead to the largest NLP problems, but these can be solved efficiently using large-scale NLP solvers such as IPOPT and by exploiting the structure of the collocation equations. Biegler et al. [181] provide a review of dynamic optimization methods using simultaneous methods. These methods offer a number of advantages for challenging dynamic optimization problems, which include:

• Control variables can be discretized at the same level of accuracy as the differential and algebraic state variables. The KKT conditions of the discretized problem can be shown to be consistent with the variational conditions of problem 115. Finite elements allow for discontinuities in control profiles.
• Collocation formulations allow problems with unstable modes to be handled in an efficient and well-conditioned manner. The NLP formulation inherits stability properties of boundary value solvers. Moreover, an elementwise decomposition has been developed that pins down unstable modes in problem 115.
• Collocation formulations have been proposed with moving finite elements. This allows the placement of elements both for accurate breakpoint locations of control profiles as well as for accurate DAE solutions.

Dynamic optimization by collocation methods has been used for a wide variety of process applications including batch process optimization, batch distillation, crystallization, dynamic data reconciliation and parameter estimation, nonlinear model predictive control, polymer grade transitions and process changeovers, and reactor design and synthesis. A review of this approach can be found in [294].

10.7. Development of Optimization Models

The most important aspect of a successful optimization study is the formulation of the optimization model. These models must reflect the real-world problem so that meaningful optimization results are obtained, and they also must satisfy the properties of the problem class. For instance, NLPs addressed by gradient-based methods require functions that are defined in the variable domain and have bounded and continuous first and second derivatives. In mixed integer problems, proper formulations are also needed to yield good lower bounds for efficient search. With increased understanding of optimization methods and the development of efficient and reliable optimization codes, optimization practitioners now focus on the formulation of optimization models that are realistic, well-posed, and inexpensive to solve. Finally, convergence properties of NLP, MILP, and MINLP solvers require accurate first (and often second) derivatives from the optimization model. If these contain numerical errors (say, through finite difference approximations), then the performance of these solvers can deteriorate considerably. As a result of these characteristics, modeling platforms are essential for the formulation task. These are classified into two broad areas: optimization modeling platforms and simulation platforms with optimization.

Optimization modeling platforms provide general purpose interfaces for optimization algorithms and remove the need for the user to interface to the solver directly. These platforms allow the general formulation for all problem classes discussed above with direct interfaces to state-of-the-art optimization codes. Three representative platforms are GAMS (General Algebraic Modeling Systems), AMPL (A Mathematical Programming Language), and AIMMS (Advanced Integrated Multidimensional Modeling Software). All three require problem-model input via a declarative modeling language and provide exact gradient and Hessian information through automatic differentiation strategies. Although possible, these platforms were not designed to handle externally added procedural models. As a result, these platforms are best applied to optimization models that can be developed entirely within their modeling framework. Nevertheless, these platforms are widely used for large-scale research and industrial applications. In addition, the MATLAB platform allows the flexible formulation of optimization models as well, although it currently has only limited capabilities for automatic differentiation and limited optimization solvers. More information on these and other modeling platforms can be found on the NEOS server, www-neos.mcs.anl.gov.

Simulation platforms with optimization are often dedicated, application-specific modeling tools to which optimization solvers have been interfaced. These lead to very useful optimization studies, but because they were not originally designed for optimization models, they need to be used with some caution. In particular, most of these platforms do not provide exact derivatives to the optimization solver; often they are approximated through finite differences. In addition, the models themselves are constructed and calculated through numerical procedures, instead of through an open declarative language. Examples of these include widely used process simulators such as Aspen/Plus, PRO/II, and Hysys. More recent platforms such as Aspen Custom Modeler and gPROMS include declarative models and exact derivatives.

For optimization tools linked to procedural models, reliable and efficient automatic differentiation tools are available that link to models written, say, in FORTRAN and C, and calculate exact first (and often second) derivatives. Examples of these include ADIFOR, ADOL-C, GRESS, Odyssee, and PADRE. When used with care, these can be applied to existing procedural models and, when linked to modern NLP and MINLP algorithms, can lead to powerful optimization capabilities. More information on these and other automatic differentiation tools can be found on: http://www-unix.mcs.anl.gov/autodiff/AD Tools/.

Finally, the availability of automatic differentiation and related sensitivity tools for differential equation models allows for considerable flexibility in the formulation of optimization models. In [295] a seven-level modeling hierarchy is proposed that matches optimization algorithms with models that range from completely open (fully declarative model) to fully closed (entirely procedural without sensitivities). At the lowest, fully procedural level, only derivative-free optimization methods are applied, while the highest, declarative level allows the application of an efficient large-scale solver that uses first and second derivatives. Depending on the modeling level, optimization solver performance can vary by several orders of magnitude.

11. Probability and Statistics

[15, 296 – 303]

The models treated thus far have been deterministic, that is, if the parameters are known, the outcome is determined. In many situations, all the factors cannot be controlled and the outcome may vary randomly about some average value. Then a range of outcomes has a certain probability of occurring, and statistical methods must be used. This is especially true in quality control of production processes and experimental measurements. This chapter presents standard statistical concepts, sampling theory and statistical decisions, and factorial design of experiments or analysis of variances. Multivariant linear and nonlinear regression is treated in Chapter 2.

11.1. Concepts

Suppose N values of a variable y, called y1, y2, ..., yN; they might represent N measurements of the same quantity. The arithmetic mean E(y) is

$$E(y)=\frac{\sum_{i=1}^N y_i}{N}$$

The median is the middle value (or average of the two middle values) when the set of numbers is arranged in increasing (or decreasing) order. The geometric mean ȳ_G is

$$\bar y_G=\left(y_1y_2\cdots y_N\right)^{1/N}$$

The root-mean-square or quadratic mean is

$$\text{Root mean square}=\sqrt{E\left(y^2\right)}=\sqrt{\sum_{i=1}^N y_i^2/N}$$

The range of a set of numbers is the difference between the largest and the smallest members in the set. The mean deviation is the mean of the deviation from the mean:

$$\text{Mean deviation}=\frac{\sum_{i=1}^N\left|y_i-E(y)\right|}{N}$$

The variance is

$$\mathrm{var}(y)=\sigma^2=\frac{\sum_{i=1}^N\left(y_i-E(y)\right)^2}{N}$$

and the standard deviation σ is the square root of the variance:

$$\sigma=\sqrt{\frac{\sum_{i=1}^N\left(y_i-E(y)\right)^2}{N}}$$

If the set of numbers {y_i} is a small sample from a larger set, then the sample average

$$\bar y=\frac{\sum_{i=1}^n y_i}{n}$$

is used in calculating the sample variance

$$s^2=\frac{\sum_{i=1}^n\left(y_i-\bar y\right)^2}{n-1}$$

and the sample standard deviation

$$s=\sqrt{\frac{\sum_{i=1}^n\left(y_i-\bar y\right)^2}{n-1}}$$
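These summary statistics are available directly in Python's standard library; the short data set below is made up for illustration. Note the population variance uses the divisor N, while the sample variance uses N − 1:

```python
import math
import statistics

y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical measurements
N = len(y)

mean   = sum(y) / N                                   # arithmetic mean E(y)
median = statistics.median(y)                         # middle value(s)
gmean  = math.prod(y) ** (1.0 / N)                    # geometric mean
rms    = math.sqrt(sum(v * v for v in y) / N)         # root-mean-square
mdev   = sum(abs(v - mean) for v in y) / N            # mean deviation
var_N  = sum((v - mean) ** 2 for v in y) / N          # variance (divisor N)
var_s  = sum((v - mean) ** 2 for v in y) / (N - 1)    # sample variance (N - 1)

print(mean, median, math.sqrt(var_N))   # 5.0 4.5 2.0
```

The library functions `statistics.pvariance` and `statistics.variance` implement the two divisors directly.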

The value n − 1 is used in the denominator because the deviations from the sample average must total zero:

$$\sum_{i=1}^n\left(y_i-\bar y\right)=0$$

Thus, knowing n − 1 values of y_i − ȳ and the fact that there are n values automatically gives the n-th value. Thus, only n − 1 degrees of freedom ν exist. This occurs because the unknown mean E(y) is replaced by the sample mean ȳ derived from the data.

If data are taken consecutively, running totals can be kept to permit calculation of the mean and variance without retaining all the data:

$$\sum_{i=1}^n\left(y_i-\bar y\right)^2=\sum_{i=1}^n y_i^2-2\bar y\sum_{i=1}^n y_i+n\bar y^2,\qquad \bar y=\sum_{i=1}^n y_i/n$$

Thus, n, $\sum_{i=1}^n y_i^2$, and $\sum_{i=1}^n y_i$ are retained, and the mean and variance are computed when needed.

Repeated observations that differ because of experimental error often vary about some central value in a roughly symmetrical distribution in which small deviations occur more frequently than large deviations. In plotting the number of times a discrete event occurs, a typical curve is obtained, which is shown in Figure 49. Then the probability p of an event (score) occurring can be thought of as the ratio of the number of times it was observed divided by the total number of events. A continuous representation of this probability density function is given by the normal distribution

$$p(y)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-[y-E(y)]^2/2\sigma^2} \tag{120}$$

Figure 49. Frequency of occurrence of different scores

This is called a normal probability distribution function. It is important because many results are insensitive to deviations from a normal distribution. Also, the central limit theorem says that if an overall error is a linear combination of component errors, then the distribution of errors tends to be normal as the number of components increases, almost regardless of the distribution of the component errors (i.e., they need not be normally distributed). Naturally, several sources of error must be present and one error cannot predominate (unless it is normally distributed). The normal distribution function is calculated easily; of more value are integrals of the function, which are given in Table 12; the region of interest is illustrated in Figure 50.

Figure 50. Area under normal curve

For a small sample, the variance can only be estimated with the sample variance s². Thus, the normal distribution cannot be used because σ is not known. In such cases Student's t-distribution, shown in Figure 51 [303, p. 70], is used:

$$p(y)=\frac{y_0}{\left(1+\dfrac{t^2}{n-1}\right)^{n/2}},\qquad t=\frac{\bar y-E(y)}{s/\sqrt{n}}$$

where y_0 is chosen such that the area under the curve is one. The number ν = n − 1 is the degrees of freedom, and as ν increases, Student's t-distribution approaches the normal distribution. The normal distribution is adequate (rather than the t-distribution) when ν > 15, except for the tails of the curve, which require larger ν. Integrals of the t-distribution are given in Table 13; the region of interest is shown in Figure 52.
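The running-totals recipe (retain n, Σy_i, and Σy_i²) can be wrapped in a small accumulator. This is a sketch of the identity above, not library code:

```python
class RunningStats:
    """Accumulate n, sum(y) and sum(y^2) so the mean and sample variance
    can be computed at any time without storing the data."""
    def __init__(self):
        self.n = 0
        self.sum_y = 0.0
        self.sum_y2 = 0.0

    def add(self, y):
        self.n += 1
        self.sum_y += y
        self.sum_y2 += y * y

    def mean(self):
        return self.sum_y / self.n

    def sample_variance(self):
        # sum (y_i - ybar)^2 = sum y_i^2 - n * ybar^2
        return (self.sum_y2 - self.n * self.mean() ** 2) / (self.n - 1)

rs = RunningStats()
for v in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    rs.add(v)
print(rs.mean(), rs.sample_variance())   # 5.0 and 32/7 (about 4.571)
```

One caveat: this textbook identity is numerically delicate when the mean is large relative to the spread, so production code often uses a one-pass update (Welford's method) instead.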

Table 12. Area under normal curve*

$$F(z)=\frac{1}{\sqrt{2\pi}}\int_0^z e^{-z^2/2}\,\mathrm{d}z$$

z     F(z)       z     F(z)
0.0   0.0000     1.5   0.4332
0.1   0.0398     1.6   0.4452
0.2   0.0793     1.7   0.4554
0.3   0.1179     1.8   0.4641
0.4   0.1554     1.9   0.4713
0.5   0.1915     2.0   0.4772
0.6   0.2257     2.1   0.4821
0.7   0.2580     2.2   0.4861
0.8   0.2881     2.3   0.4893
0.9   0.3159     2.4   0.4918
1.0   0.3413     2.5   0.4938
1.1   0.3643     2.7   0.4965
1.2   0.3849     3.0   0.4987
1.3   0.4032     4.0   0.499968
1.4   0.4192     5.0   0.4999997

* Table gives the probability F that a random variable will fall in the shaded region of Figure 50. For a more complete table (in slightly different form), see [23, Table 26.1]. This table is obtained in Microsoft Excel with the function NORMDIST(z,0,1,1)−0.5.
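The entries of Table 12 follow from the error function found in any standard math library, since F(z) = ½ erf(z/√2); a quick check:

```python
from math import erf, sqrt

def F(z):
    """Area under the standard normal curve between 0 and z (Table 12)."""
    return 0.5 * erf(z / sqrt(2.0))

for z in (0.5, 1.0, 2.0, 3.0):
    print(f"{z:4.2f}  {F(z):.4f}")
# 0.50  0.1915
# 1.00  0.3413
# 2.00  0.4772
# 3.00  0.4987
```

The value F = 0.475 used later for two-sided tests corresponds to z = 1.96, which this function reproduces to four decimals.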

Table 13. Percentage points of area under Student's t-distribution*

ν      α = 0.10   α = 0.05   α = 0.01   α = 0.001
1      6.314      12.706     63.657     636.619
2      2.920      4.303      9.925      31.598
3      2.353      3.182      5.841      12.941
4      2.132      2.776      4.604      8.610
5      2.015      2.571      4.032      6.859
6      1.943      2.447      3.707      5.959
7      1.895      2.365      3.499      5.408
8      1.860      2.306      3.355      5.041
9      1.833      2.262      3.250      4.781
10     1.812      2.228      3.169      4.587
15     1.753      2.131      2.947      4.073
20     1.725      2.086      2.845      3.850
25     1.708      2.060      2.787      3.725
30     1.697      2.042      2.750      3.646
∞      1.645      1.960      2.576      3.291

* Table gives t values such that a random variable will fall in the shaded region of Figure 52 with probability α. For a one-sided test the confidence limits are obtained for α/2. For a more complete table (in slightly different form), see [23, Table 26.10]. This table is obtained in Microsoft Excel with the function TINV(α,ν).
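A Table 13 entry can be verified from the t density alone: the density for ν degrees of freedom needs only the gamma function, and Simpson's rule gives the central probability 1 − α. The check below confirms that |t| ≤ 2.228 carries probability 0.95 for ν = 10:

```python
from math import gamma, pi, sqrt

def t_pdf(t, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    c = gamma((nu + 1) / 2.0) / (sqrt(nu * pi) * gamma(nu / 2.0))
    return c * (1.0 + t * t / nu) ** (-(nu + 1) / 2.0)

def central_prob(tc, nu, n=2000):
    """P(-tc <= T <= tc) by Simpson's rule with n (even) subintervals."""
    h = 2.0 * tc / n
    s = t_pdf(-tc, nu) + t_pdf(tc, nu)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * t_pdf(-tc + i * h, nu)
    return s * h / 3.0

print(round(central_prob(2.228, 10), 4))    # 0.95  (Table 13: nu = 10, alpha = 0.05)
print(round(central_prob(12.706, 1), 4))    # 0.95  (nu = 1 row)
```

The same routine reproduces any row of the table; for very large ν the gamma function overflows, which is one reason the normal distribution is substituted when ν > 15, as noted above.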

Other probability distribution functions are useful. Any distribution function must satisfy the following conditions:

$$0\le F(x)\le 1$$
$$F(-\infty)=0,\qquad F(+\infty)=1$$
$$F(x)\le F(y)\ \text{when}\ x\le y$$

The probability density function is

$$p(x)=\frac{\mathrm{d}F(x)}{\mathrm{d}x}$$

where dF = p dx is the probability of x being between x and x + dx. The probability density function satisfies

$$p(x)\ge 0,\qquad \int_{-\infty}^{\infty}p(x)\,\mathrm{d}x=1$$

The Bernoulli distribution applies when the outcome can take only two values, such as heads or tails, or 0 or 1. The probability distribution function is

$$p(x=k)=p^k(1-p)^{1-k},\qquad k=0\ \text{or}\ 1$$

and the mean of a function g(x) depending on x is

$$E[g(x)]=g(1)\,p+g(0)\,(1-p)$$
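The Bernoulli expectation formula is just a two-term weighted sum; a tiny sketch (the payoff function g below is made up):

```python
def bernoulli_mean(g, p):
    """E[g(x)] for a Bernoulli variable: g(1)*p + g(0)*(1 - p)."""
    return g(1) * p + g(0) * (1.0 - p)

# Hypothetical payoff: win 10 on success (x = 1), lose 2 on failure (x = 0).
expected = bernoulli_mean(lambda k: 10.0 if k == 1 else -2.0, p=0.25)
print(expected)   # 10*0.25 - 2*0.75 = 1.0
```

Choosing g(x) = x recovers the mean p itself, and g(x) = (x − p)² recovers the variance p(1 − p).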

Figure 51. Student's t-distribution. For explanation of ν see text

Figure 52. Percentage points of area under Student's t-distribution

The binomial distribution function applies when there are n trials of a Bernoulli event; it gives the probability of k occurrences of the event, which occurs with probability p on each trial:

$$p(x=k)=\frac{n!}{k!(n-k)!}\,p^k(1-p)^{n-k}$$

The mean and variance are

$$E(x)=np,\qquad \mathrm{var}(x)=np(1-p)$$

The hypergeometric distribution function applies when there are N objects, of which M are of one kind and N − M are of another kind. The objects are drawn one by one, without replacing the last draw. If the last draw had been replaced, the distribution would be the binomial distribution. If x is the number of objects of type M drawn in a sample of size n, then the probability of x = k is

$$p(x=k)=\frac{M!\,(N-M)!\,n!\,(N-n)!}{k!\,(M-k)!\,(n-k)!\,(N-M-n+k)!\,N!}$$

The mean and variance are (with p = M/N)

$$E(x)=\frac{nM}{N},\qquad \mathrm{var}(x)=np(1-p)\,\frac{N-n}{N-1}$$

The Poisson distribution is

$$p(x=k)=e^{-\lambda}\,\frac{\lambda^k}{k!}$$

with a parameter λ. The mean and variance are

$$E(x)=\lambda,\qquad \mathrm{var}(x)=\lambda$$

The simplest continuous distribution is the uniform distribution. The probability density function is
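The three discrete distributions can be checked numerically with the standard library; the helper below also verifies the stated means by direct enumeration (the parameter values are arbitrary examples). Note the hypergeometric factorial formula equals the more familiar C(M,k)·C(N−M,n−k)/C(N,n):

```python
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

def hypergeom_pmf(k, N, M, n):
    # Same value as M!(N-M)!n!(N-n)! / [k!(M-k)!(n-k)!(N-M-n+k)!N!]
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def poisson_pmf(k, lam):
    return exp(-lam) * lam**k / factorial(k)

# Means by enumeration match the closed forms E(x) = np, nM/N, and lambda:
mean_b = sum(k * binom_pmf(k, 10, 0.3) for k in range(11))        # 3.0
mean_h = sum(k * hypergeom_pmf(k, 20, 7, 5) for k in range(6))    # 5*7/20 = 1.75
mean_p = sum(k * poisson_pmf(k, 2.5) for k in range(100))         # ~2.5
print(round(mean_b, 6), round(mean_h, 6), round(mean_p, 6))
```

The variances can be checked the same way, e.g., Σ(k − np)²·p(x = k) = np(1 − p) for the binomial case.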

if they are statistically dependent, and

1 ab if they are statistically independent. If a set of and the probability distribution function is variables yA, yB,...isindependent and identi-   0 x

var (x) = exp σ2−1 exp 2µ+σ2 relation coefficient is [15, p. 484] n (yAi−yA)(yBi−yB ) r (y ,y )= i=1 A B (n−1) s s 11.2. Sampling and Statistical Decisions A B If measurements are for independent, identi- Two variables can be statistically dependent or cally distributed observations, the errors are in- independent. For example, the height and diam- dependent and uncorrelated. Then y varies about eter of all distillation towers are statistically in- E ( y) with variance σ2/n, where n is the number dependent, because the distribution of diameters of observations in y. Thus if something is mea- of all columns 10 m high is different from that sured several times today and every day, and the of columns 30 m high. If yB is the diameter and measurements have the same distribution, the yA the height, the distribution is written as variance of the means decreases with the num- ber of samples in each day’s measurement n.Of p (yB|yA = constant) , or here course, other factors (weather, weekends) may

p (yB|yA = 10) =p (yB|yA = 30) cause the observations on different days to be distributed nonidentically. A third variable yC, could be the age of the oper- Suppose Y, which is the sum or difference of ator on the third shift. This variable is probably two variables, is of interest: unrelated to the diameter of the column, and for the distribution of ages is Y = yA±yB p (yC|yA)=p (yC) Then the mean value of Y is E (Y )=E (y ) ±E (y ) Thus, variables yA and yC are distributed inde- A B pendently. The joint distribution for two vari- and the variance of Y is ables is 2 2 2 σ (Y )=σ (yA)+σ (yB) p (yA,yB)=p (yA) p (yB|yA) 130 Mathematics in Chemical Engineering

More generally, consider the random variables y1, y2, . . . with means E(y1), E(y2), . . . , variances σ²(y1), σ²(y2), . . . , and correlation coefficients ϱij. The variable

Y = α1 y1 + α2 y2 + . . .

has a mean

E(Y) = α1 E(y1) + α2 E(y2) + . . .

and variance [303, p. 87]

σ²(Y) = Σ_{i=1}^{n} αi² σ²(yi) + 2 Σ_{i=1}^{n} Σ_{j=i+1}^{n} αi αj σ(yi) σ(yj) ϱij

or

σ²(Y) = Σ_{i=1}^{n} αi² σ²(yi) + 2 Σ_{i=1}^{n} Σ_{j=i+1}^{n} αi αj Cov(yi, yj)    (120)

If the variables are uncorrelated and have the same variance, then

σ²(Y) = σ² Σ_{i=1}^{n} αi²

This fact can be used to obtain more accurate cost estimates for the purchased cost of a chemical plant than is true for any one piece of equipment. Suppose the plant is composed of a number of heat exchangers, pumps, towers, etc., and that the cost estimate of each device is ±40 % of its cost (the sample standard deviation is 20 % of its cost). In this case the αi are the numbers of each type of unit. Under special conditions, such as equal numbers of all types of units and comparable cost, the standard deviation of the plant cost is

σ(Y) = σ/√n

and is then ±(40/√n) %. Thus the standard deviation of the cost for the entire plant is the standard deviation of each piece of equipment divided by the square root of the number of units. Under less restrictive conditions the actual numbers change according to the above equations, but the principle is the same.

Suppose modifications are introduced into the manufacturing process. To determine if the modification causes a significant change, the mean of some property could be measured before and after the change; if these differ, does it mean the process modification caused it, or could the change have happened by chance? This is a statistical decision. A hypothesis H0 is defined; if it is true, action A must be taken. The reverse hypothesis is H1; if this is true, action B must be taken. A correct decision is made if action A is taken when H0 is true or action B is taken when H1 is true. Taking action B when H0 is true is called a type I error, whereas taking action A when H1 is true is called a type II error.

The following test of hypothesis or test of significance must be defined to determine if the hypothesis is true. The level of significance is the maximum probability that an error would be accepted in the decision (i.e., rejecting the hypothesis when it is actually true). Common levels of significance are 0.05 and 0.01, and the test of significance can be either one- or two-sided. If a sampled distribution is normal, then the probability that the z score

z = (y − ȳ)/sy

is in the unshaded region is 0.95. Because a two-sided test is desired, F = 0.95/2 = 0.475. The value given in Table 12 for F = 0.475 is z = 1.96. If the test is one-sided, at the 5 % level of significance, 0.95 = 0.5 (for negative z) + F (for positive z); thus, F = 0.45 and z = 1.645. In the two-sided test (see Fig. 53), if a single sample is chosen and z < −1.96 or z > 1.96, this could happen with probability 0.05 if the hypothesis were true. This z would be significantly different from the expected value (based on the chosen level of significance), and the tendency would be to reject the hypothesis. If the value of z were between −1.96 and 1.96, the hypothesis would be accepted.

Figure 53. Two-sided statistical decision

The same type of decision can be made for other distributions. Consider Student's t-distribution. At a 95 % level of confidence, with ν = 10 degrees of freedom, the t values are ±2.228. Thus, the sample mean would be expected to be between

ȳ ± tc s/√n

with 95 % confidence. If the mean were outside this interval, the hypothesis would be rejected.

Tests are available to decide if two distributions that have the same variance have different means [15, p. 465]. Let one distribution be called x, with N1 samples, and the other be called y, with N2 samples. First, compute the standard error of the difference of the means:

sD = sqrt{ [Σ_{i=1}^{N1} (xi − x̄)² + Σ_{i=1}^{N2} (yi − ȳ)²] / (N1 + N2 − 2) × (1/N1 + 1/N2) }

Next, compute the value of t

t = (x̄ − ȳ)/sD

and evaluate the significance of t using Student's t-distribution for N1 + N2 − 2 degrees of freedom.
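These two decisions can be sketched in Python as follows; the sample values are illustrative, not from the text, and 1.96 is the two-sided 5 % critical value quoted above.

```python
import math
from statistics import mean

def two_sided_z_decision(z, z_crit=1.96):
    """Accept H0 when the z score falls inside (-z_crit, z_crit)."""
    return abs(z) < z_crit

def pooled_t(x, y):
    """t = (xbar - ybar)/sD with the pooled standard error sD,
    for N1 + N2 - 2 degrees of freedom."""
    n1, n2 = len(x), len(y)
    ssx = sum((xi - mean(x))**2 for xi in x)
    ssy = sum((yi - mean(y))**2 for yi in y)
    sD = math.sqrt((ssx + ssy) / (n1 + n2 - 2) * (1/n1 + 1/n2))
    return (mean(x) - mean(y)) / sD

# Illustrative samples before and after a process modification:
t = pooled_t([10.2, 10.4, 10.3, 10.5], [10.0, 10.1, 9.9, 10.2])
```

The resulting t is then compared with the tabulated Student's t value for N1 + N2 − 2 degrees of freedom at the chosen level of significance.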

If the samples have different variances, the relevant statistic for the t-test is

t = (x̄ − ȳ)/sqrt{ var(x)/N1 + var(y)/N2 }

The number of degrees of freedom is now taken approximately as

ν = [var(x)/N1 + var(y)/N2]² / { [var(x)/N1]²/(N1 − 1) + [var(y)/N2]²/(N2 − 1) }

There is also an F-test to decide if two distributions have significantly different variances. In this case, the ratio of variances is calculated:

F = var(x)/var(y)

where the variance of x is assumed to be larger. Then, a table of values is used to determine the significance of the ratio. The table [23, Table 26.9] is derived from the formula [15, p. 169]

Q(F|ν1, ν2) = I_{ν2/(ν2+ν1F)}(ν2/2, ν1/2)

where the right-hand side is an incomplete beta function. The F table is given by the Microsoft Excel function FINV(fraction, νx, νy), where fraction is the fractional value (≤ 1) representing the upper percentage and νx and νy are the degrees of freedom of the numerator and denominator, respectively.

Example. For two sample variances with 8 degrees of freedom each, what limits will bracket their ratio with a midarea probability of 90 %? FINV(0.05, 8, 8) = 3.44; an upper-tail area of 0.05 is used so that the two tails total 10 %. Then

P[1/3.44 ≤ var(x)/var(y) ≤ 3.44] = 0.90

The chi-square distribution is useful for examining the variance or standard deviation. The statistic is defined as

χ² = n s²/σ² = [(y1 − ȳ)² + (y2 − ȳ)² + . . . + (yn − ȳ)²]/σ²

and the chi-square distribution is

p(y) = y0 χ^(ν−2) e^(−χ²/2)

where ν = n − 1 is the number of degrees of freedom and y0 is chosen so that the integral of p(y) over all y is 1. The probability of a deviation larger than χ² is given in Table 14; the area in question, in Figure 54. For example, for 10 degrees of freedom and a 95 % confidence level, the critical values of χ² are 3.25 (at α = 0.025) and 20.5 (at α = 0.975). Then

s√n/χ0.975 < σ < s√n/χ0.025

or

s√n/√20.5 < σ < s√n/√3.25

with 95 % confidence.
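A small sketch of these interval estimates in Python; the tabulated values 20.5, 3.25, and 3.44 are the ones quoted above (Table 14 and the FINV example), while the sample values s and n are illustrative.

```python
import math

def sigma_interval(s, n, chi2_hi, chi2_lo):
    """95 % confidence interval for sigma from chi^2 = n*s^2/sigma^2,
    using tabulated percentage points chi2_hi (alpha = 0.975) and
    chi2_lo (alpha = 0.025) for nu = n - 1 degrees of freedom."""
    return (s * math.sqrt(n / chi2_hi), s * math.sqrt(n / chi2_lo))

# nu = 10 requires n = 11; Table 14 gives 3.25 and 20.5.
lo, hi = sigma_interval(s=2.0, n=11, chi2_hi=20.5, chi2_lo=3.25)

# F-test bracket from the worked example: FINV(0.05, 8, 8) = 3.44,
# so the variance ratio lies in [1/3.44, 3.44] with probability 0.90.
f_lo, f_hi = 1 / 3.44, 3.44
```

The interval (lo, hi) brackets the true σ with 95 % confidence; note that it is not symmetric about s.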

Figure 54. Percentage points of area under chi-squared distribution with ν degrees of freedom

Table 14. Percentage points of area under chi-square distribution with ν degrees of freedom *

ν     α=0.995  α=0.99   α=0.975  α=0.95   α=0.5   α=0.05   α=0.025  α=0.01   α=0.005
1     7.88     6.63     5.02     3.84     0.455   0.0039   0.0010   0.0002   0.00004
2     10.6     9.21     7.38     5.99     1.39    0.103    0.0506   0.0201   0.0100
3     12.8     11.3     9.35     7.81     2.37    0.352    0.216    0.115    0.072
4     14.9     13.3     11.1     9.49     3.36    0.711    0.484    0.297    0.207
5     16.7     15.1     12.8     11.1     4.35    1.15     0.831    0.554    0.412
6     18.5     16.8     14.4     12.6     5.35    1.64     1.24     0.872    0.676
7     20.3     18.5     16.0     14.1     6.35    2.17     1.69     1.24     0.989
8     22.0     20.1     17.5     15.5     7.34    2.73     2.18     1.65     1.34
9     23.6     21.7     19.0     16.9     8.34    3.33     2.70     2.09     1.73
10    25.2     23.2     20.5     18.3     9.34    3.94     3.25     2.56     2.16
12    28.3     26.2     23.3     21.0     11.3    5.23     4.40     3.57     3.07
15    32.8     30.6     27.5     25.0     14.3    7.26     6.26     5.23     4.60
17    35.7     33.4     30.2     27.6     16.3    8.67     7.56     6.41     5.70
20    40.0     37.6     34.2     31.4     19.3    10.9     9.59     8.26     7.43
25    46.9     44.3     40.6     37.7     24.3    14.6     13.1     11.5     10.5
30    53.7     50.9     47.0     43.8     29.3    18.5     16.8     15.0     13.8
40    66.8     63.7     59.3     55.8     39.3    26.5     24.4     22.2     20.7
50    79.5     76.2     71.4     67.5     49.3    34.8     32.4     29.7     28.0
60    92.0     88.4     83.3     79.1     59.3    43.2     40.5     37.5     35.5
70    104.2    100.4    95.0     90.5     69.3    51.7     48.8     45.4     43.3
80    116.3    112.3    106.6    101.9    79.3    60.4     57.2     53.5     51.2
90    128.3    124.1    118.1    113.1    89.3    69.1     65.6     61.8     59.2
100   140.2    135.8    129.6    124.3    99.3    77.9     74.2     70.1     67.3

* Table value is χ²α; χ² < χ²α with probability α. For a more complete table (in slightly different form), see [23, Table 26.8]. The Microsoft Excel function CHIINV(1 − α, ν) gives the table value.

11.3. Error Analysis in Experiments

Suppose a measurement of several quantities is made and a formula or mathematical model is used to deduce some property of interest. For example, to measure the thermal conductivity of a solid k, the heat flux q, the thickness of the sample d, and the temperature difference across the sample ∆T must be measured. Each measurement has some error. The heat flux q may be the rate of electrical heat input Q̇ divided by the area A, and both quantities are measured to some tolerance. The thickness of the sample is measured with some accuracy, and the temperatures are probably measured with a thermocouple, to some accuracy. These measurements are combined, however, to obtain the thermal conductivity, and the error in the thermal conductivity must be determined. The formula is

k = Q̇ d/(A ∆T)

If each measured quantity has some variance, what is the variance in the thermal conductivity?

Suppose a model for Y depends on various measurable quantities, y1, y2, . . . Suppose several measurements are made of y1, y2, . . . under seemingly identical conditions and several different values are obtained, with means E(y1), E(y2), . . . and variances σ1², σ2², . . . Next suppose the errors are small and independent of one another. Then a change in Y is related to changes in yi by

dY = (∂Y/∂y1) dy1 + (∂Y/∂y2) dy2 + . . .

If the changes are indeed small, the partial derivatives are constant among all the samples. Then the expected value of the change is

E(dY) = Σ_{i=1}^{N} (∂Y/∂yi) E(dyi)

Naturally E(dyi) = 0 by definition, so that E(dY) = 0, too. However, since the errors are independent of each other and the partial derivatives are assumed constant because the errors are small, the variances are given by Equation 121 [296, p. 550]:

σ²(dY) = Σ_{i=1}^{N} (∂Y/∂yi)² σi²    (121)

Thus, the variance of the desired quantity Y can be found. This gives an independent estimate of the errors in measuring the quantity Y from the errors in measuring each variable it depends upon.
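As a sketch of Equation 121 applied to the thermal-conductivity formula k = Q̇d/(A ∆T); the partial derivatives are analytic, and the numerical values below are illustrative, not from the text.

```python
def k_variance(Q, d, A, dT, var_Q, var_d, var_A, var_dT):
    """Variance of k = Q*d/(A*dT) from Equation (121), using the
    analytic partial derivatives of k with respect to each input."""
    k = Q * d / (A * dT)
    return ((d / (A * dT))**2 * var_Q      # (dk/dQ)^2  * var(Q)
            + (Q / (A * dT))**2 * var_d    # (dk/dd)^2  * var(d)
            + (k / A)**2 * var_A           # (dk/dA)^2  * var(A)
            + (k / dT)**2 * var_dT)        # (dk/ddT)^2 * var(dT)

# A 1 % standard deviation on each of the four measured quantities
# gives a 2 % standard deviation on k, because the relative
# variances add for a product/quotient formula.
var_k = k_variance(10.0, 0.01, 0.1, 5.0,
                   0.1**2, 0.0001**2, 0.001**2, 0.05**2)
```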

11.4. Factorial Design of Experiments and Analysis of Variance

Statistically designed experiments consider, of course, the effect of primary variables, but they also consider the effect of extraneous variables, the interactions among variables, and a measure of the random error. Primary variables are those whose effect must be determined. These variables can be quantitative or qualitative. Quantitative variables are ones that may be fit to a model to determine the model parameters. Curve fitting of this type is discussed in Chapter 2. Qualitative variables are ones whose effect needs to be known; no attempt is made to quantify that effect other than to assign possible errors or magnitudes. Qualitative variables can be further subdivided into type I variables, whose effect is determined directly, and type II variables, which contribute to performance variability and whose effect is averaged out. For example, in studying the effect of several catalysts on yield in a chemical reactor, each different type of catalyst would be a type I variable, because its effect should be known. However, each time the catalyst is prepared, the results are slightly different, because of random variations; thus, several batches may exist of what purports to be the same catalyst. The variability between batches is a type II variable. Because the ultimate use will require using different batches, the overall effect including that variation should be known, because knowing the results from one batch of one catalyst precisely might not be representative of the results obtained from all batches of the same catalyst. A randomized block design, incomplete block design, or Latin square design, for example, all keep the effect of experimental error in the blocked variables from influencing the effect of the primary variables. Other uncontrolled variables are accounted for by introducing randomization in parts of the experimental design. To study all variables and their interaction requires a factorial design, involving all possible combinations of each variable, or a fractional factorial design, involving only a selected set. Statistical techniques are then used to determine the important variables, the important interactions, and the error in estimating these effects. The discussion here is a brief overview of [303].

If only two methods exist for preparing some product, to see which treatment is best, the sampling analysis discussed in Section 11.2 can be used to deduce if the means of the two treatments differ significantly. With more treatments, the analysis is more detailed. Suppose the experimental results are arranged as shown in Table 15, i.e., several measurements for each treatment. The objective is to see if the treatments differ significantly from each other, that is, whether their means are different. The samples are assumed to have the same variance. The null hypothesis is that the treatments are all the same; the alternative is that they differ. Deducing the statistical validity of the hypothesis is done by an analysis of variance.

Table 15. Estimating the effect of four treatments

Treatment               1    2    3    4
                        −    −    −    −
                        −    −    −    −
                        −    −    −    −
                        −    −    −    −
                        −    −         −
Treatment average, ȳt   −    −    −    −
Grand average, ȳ        −

The data for k = 4 treatments are arranged in Table 15. Each treatment has nt experiments, and the outcome of the i-th experiment with treatment t is called yti. The treatment average is

ȳt = ( Σ_{i=1}^{nt} yti ) / nt

and the grand average is

ȳ = ( Σ_{t=1}^{k} nt ȳt ) / N,    N = Σ_{t=1}^{k} nt

Next, the sum of squares of deviations is computed from the average within the t-th treatment:

St = Σ_{i=1}^{nt} (yti − ȳt)²

Since each treatment has nt experiments, the number of degrees of freedom is nt − 1. Then the sample variances are

st² = St/(nt − 1)
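The averages and sums of squares just defined can be sketched as follows, carried through to the ratio of between-treatment to within-treatment variance that the analysis of variance compares with the F distribution. The data are illustrative, not from the text.

```python
from statistics import mean

def anova_oneway(groups):
    """One-way analysis of variance for a list of treatment sample
    lists; returns the variance ratio sT^2/sR^2."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / N
    # Within-treatment sum of squares S_R = sum over t of S_t
    SR = sum(sum((y - mean(g))**2 for y in g) for g in groups)
    # Between-treatment sum of squares S_T
    ST = sum(len(g) * (mean(g) - grand)**2 for g in groups)
    sR2 = SR / (N - k)     # nu_R = N - k degrees of freedom
    sT2 = ST / (k - 1)     # nu_T = k - 1 degrees of freedom
    return sT2 / sR2

# Three treatments, three replicates each:
F = anova_oneway([[10, 12, 11], [14, 15, 16], [10, 11, 12]])
```

A large ratio (relative to the tabulated F value for νT and νR degrees of freedom) indicates that the treatment means differ significantly.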

The within-treatment sum of squares is

SR = Σ_{t=1}^{k} St

and the within-treatment sample variance is

sR² = SR/(N − k)

Now, if no difference exists between treatments, a second estimate of σ² could be obtained by calculating the variation of the treatment averages about the grand average. Thus, the between-treatment mean square is computed:

sT² = ST/(k − 1),    ST = Σ_{t=1}^{k} nt (ȳt − ȳ)²

Basically the test for whether the hypothesis is true or not hinges on a comparison between the within-treatment estimate sR² (with νR = N − k degrees of freedom) and the between-treatment estimate sT² (with νT = k − 1 degrees of freedom). The test is made based on the F distribution for νR and νT degrees of freedom [23, Table 26.9], [303, p. 636].

Next consider the case in which randomized blocking is used to eliminate the effect of some variable whose effect is of no interest, such as the batch-to-batch variation of the catalysts in the chemical reactor example. With k treatments and n experiments in each treatment, the results from nk experiments can be arranged as shown in Table 16; within each block, various treatments are applied in a random order. The block average, the treatment average, and the grand average are computed as before. The following quantities are also computed for the analysis of variance table:

Name         Formula                                             Degrees of freedom
Average      SA = nk ȳ²                                          1
Blocks       SB = k Σ_{i=1}^{n} (ȳi − ȳ)²                        n − 1
Treatments   ST = n Σ_{t=1}^{k} (ȳt − ȳ)²                        k − 1
Residuals    SR = Σ_{t=1}^{k} Σ_{i=1}^{n} (yti − ȳi − ȳt + ȳ)²   (n − 1)(k − 1)
Total        S = Σ_{t=1}^{k} Σ_{i=1}^{n} yti²                    N = nk

The key test is again a statistical one, based on the value of

sT²/sR²,   where sT² = ST/(k − 1) and sR² = SR/[(n − 1)(k − 1)]

and the F distribution for νR and νT degrees of freedom [303, p. 636]. The assumption behind the analysis is that the variations are linear [303, p. 218]. Ways to test this assumption as well as transformations to make if it is not true are provided in [303], where an example is given of how the observations are broken down into a grand average, a block deviation, a treatment deviation, and a residual. For two-way factorial design, in which the second variable is a real one rather than one you would like to block out, see [303, p. 228].

Table 16. Block design with four treatments and five blocks

                    Treatment
                    1    2    3    4    Block average
Block 1             −    −    −    −    −
Block 2             −    −    −    −    −
Block 3             −    −    −    −    −
Block 4             −    −    −    −    −
Block 5             −    −    −    −    −
Treatment average   −    −    −    −    grand average

To measure the effects of variables on a single outcome, a factorial design is appropriate. In a two-level factorial design, each variable is considered at two levels only, a high and low value, often designated as a + and a −. The two-level factorial design is useful for indicating trends and showing interactions; it is also the basis for a fractional factorial design. As an example, consider a 2³ factorial design, with 3 variables and 2 levels for each. The experiments are indicated in Table 17.

Table 17. Two-level factorial design with three variables

Run   Variable 1   Variable 2   Variable 3
1     −            −            −
2     +            −            −
3     −            +            −
4     +            +            −
5     −            −            +
6     +            −            +
7     −            +            +
8     +            +            +

The main effects are calculated by determining the difference between results from all high values of a variable and all low values of a variable; the result is divided by the number of experiments at each level. For example, for the first variable, calculate

Effect of variable 1 = [(y2 + y4 + y6 + y8) − (y1 + y3 + y5 + y7)]/4

Note that all observations are being used to supply information on each of the main effects and each effect is determined with the precision of a fourfold replicated difference. The advantage over a one-at-a-time experiment is the gain in precision if the variables are additive and the measure of nonadditivity if it occurs [303, p. 313].

Interaction effects between variables 1 and 2 are obtained by comparing the difference between the results obtained with the high and low value of 1 at the low value of 2 with the difference between the results obtained with the high and low value of 1 at the high value of 2. The 12-interaction is

12-interaction = [(y4 − y3 + y8 − y7) − (y2 − y1 + y6 − y5)]/2

The key step is to determine the errors associated with the effect of each variable and each interaction so that the significance can be determined. Thus, standard errors need to be assigned. This can be done by repeating the experiments, but it can also be done by using higher order interactions (such as 123 interactions in a 2⁴ factorial design). These are assumed negligible in their effect on the mean but can be used to estimate the standard error [303, pp. 319 – 328].
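The main-effect and 12-interaction formulas above can be sketched for the 2³ design of Table 17 as follows; the y values are illustrative, not from the text.

```python
# Runs of the 2^3 design in the order of Table 17; levels are -1/+1.
DESIGN = [(-1, -1, -1), (+1, -1, -1), (-1, +1, -1), (+1, +1, -1),
          (-1, -1, +1), (+1, -1, +1), (-1, +1, +1), (+1, +1, +1)]

def main_effect(y, var):
    """[(sum of y at high level) - (sum of y at low level)] / 4."""
    hi = sum(yi for yi, run in zip(y, DESIGN) if run[var] > 0)
    lo = sum(yi for yi, run in zip(y, DESIGN) if run[var] < 0)
    return (hi - lo) / 4

def interaction_12(y):
    """[(y4 - y3 + y8 - y7) - (y2 - y1 + y6 - y5)] / 2, with runs
    numbered as in Table 17 (here y[0]..y[7])."""
    return ((y[3] - y[2] + y[7] - y[6]) - (y[1] - y[0] + y[5] - y[4])) / 2

y = [60, 72, 54, 68, 52, 83, 45, 80]
e1 = main_effect(y, 0)
```

For these data the effect of variable 1 is 23.0, a fourfold replicated difference as noted in the text.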
Then calculated effects that are large compared to the standard error are considered important, whereas those that are small compared to the standard error are considered due to random variations and are unimportant.

In a fractional factorial design, only part of the possible experiments is performed. With k variables, a factorial design requires 2^k experiments. When k is large, the number of experiments can be large; for k = 5, 2⁵ = 32. For k this large, Box et al. [296, p. 235] do a fractional factorial design. In the fractional factorial design with k = 5, only 8 experiments are chosen. Cropley [298] gives an example of how to combine heuristics and statistical arguments in application to kinetics mechanisms in chemical engineering.

12. Multivariable Calculus Applied to Thermodynamics

Many of the functional relationships required in thermodynamics are direct applications of the rules of multivariable calculus. In this short chapter, those rules are reviewed in the context of the needs of thermodynamics. These ideas were expounded in one of the classic books on chemical engineering thermodynamics [299].

12.1. State Functions

State functions depend only on the state of the system, not on its past history or how one got there. If z is a function of two variables x and y, then z(x, y) is a state function, because z is known once x and y are specified. The differential of z is

dz = M dx + N dy

The line integral

∫_C (M dx + N dy)

is independent of the path in x–y space if and only if

∂M/∂y = ∂N/∂x    (122)

Because the total differential can be written as

dz = (∂z/∂x)y dx + (∂z/∂y)x dy    (123)

for path independence

∂/∂y [(∂z/∂x)y] = ∂/∂x [(∂z/∂y)x]

or

∂²z/∂y∂x = ∂²z/∂x∂y    (124)

is needed.

Various relationships can be derived from Equation 123. If z is constant,

0 = (∂z/∂x)y dx + (∂z/∂y)x dy

Rearrangement gives

(∂z/∂x)y = −(∂z/∂y)x (∂y/∂x)z = −(∂y/∂x)z/(∂y/∂z)x    (125)

Alternatively, if Equation 123 is divided by dy while some other variable w is held constant,

(∂z/∂y)w = (∂z/∂x)y (∂x/∂y)w + (∂z/∂y)x    (126)

Dividing both the numerator and the denominator of a partial derivative by dw while holding a variable y constant yields

(∂z/∂x)y = (∂z/∂w)y/(∂x/∂w)y = (∂z/∂w)y (∂w/∂x)y    (127)

In thermodynamics the state functions include the internal energy U, the enthalpy H, and the Helmholtz and Gibbs free energies A and G, respectively, which are defined as follows:

H = U + pV
A = U − TS
G = H − TS = U + pV − TS = A + pV

where S is the entropy, T the absolute temperature, p the pressure, and V the volume. These are also state functions, in that the entropy is specified once two variables (e.g., T and p) are specified. Likewise V is specified once T and p are specified, and so forth.

12.2. Applications to Thermodynamics

All of the following applications are for closed systems with constant mass. If a process is reversible and only p–V work is done, one form of the first law states that changes in the internal energy are given by the following expression

dU = T dS − p dV    (128)

If the internal energy is considered a function of S and V, then

dU = (∂U/∂S)V dS + (∂U/∂V)S dV

This is the equivalent of Equation 123 and

T = (∂U/∂S)V,   p = −(∂U/∂V)S

Because the internal energy is a state function, Equation 124 is required:

∂²U/∂V∂S = ∂²U/∂S∂V

which here is

(∂T/∂V)S = −(∂p/∂S)V    (129)

This is one of the Maxwell relations and is merely an expression of Equation 124.

The differentials of the other energies are

dH = T dS + V dp    (130)
dA = −S dT − p dV    (131)
dG = −S dT + V dp    (132)

From these differentials, other Maxwell relations can be derived in a similar fashion by applying Equation 124:

(∂T/∂p)S = (∂V/∂S)p    (133)
(∂S/∂V)T = (∂p/∂T)V    (134)
(∂S/∂p)T = −(∂V/∂T)p    (135)

The heat capacity at constant pressure is defined as

Cp = (∂H/∂T)p

If entropy and enthalpy are taken as functions of T and p, the total differentials are

dS = (∂S/∂T)p dT + (∂S/∂p)T dp
dH = (∂H/∂T)p dT + (∂H/∂p)T dp = Cp dT + (∂H/∂p)T dp

If the pressure is constant,

dS = (∂S/∂T)p dT   and   dH = Cp dT

When enthalpy is considered a function of S and p, the total differential is

dH = T dS + V dp

When the pressure is constant, this is

dH = T dS
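Equation 135 can be checked numerically for a simple case. This is an assumption-laden sketch: it assumes an ideal gas, for which S(T, p) = Cp ln T − R ln p up to an additive constant and V = RT/p, and it uses central finite differences for the partial derivatives.

```python
import math

R = 8.314   # gas constant, J/(mol K)
CP = 29.1   # illustrative constant-pressure heat capacity, J/(mol K)

def S(T, p):
    """Ideal-gas entropy up to a constant: Cp*ln(T) - R*ln(p)."""
    return CP * math.log(T) - R * math.log(p)

def V(T, p):
    """Ideal-gas molar volume V = R*T/p."""
    return R * T / p

def ddp(f, T, p, h=1e-3):   # central difference in p at constant T
    return (f(T, p + h) - f(T, p - h)) / (2 * h)

def ddT(f, T, p, h=1e-3):   # central difference in T at constant p
    return (f(T + h, p) - f(T - h, p)) / (2 * h)

# Maxwell relation (135): (dS/dp)_T = -(dV/dT)_p; both sides are -R/p.
lhs = ddp(S, 300.0, 1.0e5)
rhs = -ddT(V, 300.0, 1.0e5)
```

Both sides evaluate to −R/p, confirming the relation for this equation of state.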

Thus, at constant pressure

dH = Cp dT = T dS = T (∂S/∂T)p dT

which gives

(∂S/∂T)p = Cp/T

When p is not constant, using the last Maxwell relation gives

dS = (Cp/T) dT − (∂V/∂T)p dp    (136)

Then the total differential for H is

dH = T dS + V dp = Cp dT − T (∂V/∂T)p dp + V dp

Rearranging this, when H(T, p), yields

dH = Cp dT + [V − T (∂V/∂T)p] dp    (137)

This equation can be used to evaluate enthalpy differences by using information on the equation of state and the heat capacity:

H(T2, p2) − H(T1, p1) = ∫_{T1}^{T2} Cp(T, p1) dT + ∫_{p1}^{p2} [V − T (∂V/∂T)p]|_{T2,p} dp    (138)

The same manipulations can be done for internal energy:

(∂S/∂T)V = Cv/T    (139)

dS = −[(∂V/∂T)p/(∂V/∂p)T] dV + (Cv/T) dT    (140)

dU = Cv dT − [p + T (∂V/∂T)p/(∂V/∂p)T] dV

12.3. Partial Derivatives of All Thermodynamic Functions

The various partial derivatives of the thermodynamic functions can be classified into six groups. In the general formulas below, the variables U, H, A, G, or S are denoted by Greek letters, whereas the variables V, T, or p are denoted by Latin letters.

Type 1 (3 possibilities plus reciprocals).

General: (∂a/∂b)c,   Specific: (∂p/∂T)V

Equation 125 yields

(∂p/∂T)V = −(∂V/∂T)p/(∂V/∂p)T    (141)

This relates all three partial derivatives of this type.

Type 2 (30 possibilities plus reciprocals).

General: (∂α/∂b)c,   Specific: (∂G/∂T)V

Using Equation 132 gives

(∂G/∂T)V = −S + V (∂p/∂T)V

Using the other equations for U, H, A, or S gives the other possibilities.

Type 3 (15 possibilities plus reciprocals).

General: (∂a/∂b)α,   Specific: (∂V/∂T)S

First the derivative is expanded by using Equation 125, which is called expansion without introducing a new variable:

(∂V/∂T)S = −(∂S/∂T)V/(∂S/∂V)T

Then the numerator and denominator are evaluated as type 2 derivatives, or by using Equations (99) and (100):

(∂V/∂T)S = −(Cv/T)/[−(∂V/∂T)p (∂p/∂V)T] = Cv (∂V/∂p)T/[T (∂V/∂T)p]    (142)

These derivatives are important for reversible, adiabatic processes (e.g., in an ideal turbine or compressor) because the entropy is constant. Similar derivatives can be obtained for isenthalpic processes, such as a pressure reduction at a valve. In that case, the Joule – Thomson coefficient is obtained for constant H:

(∂T/∂p)H = (1/Cp) [−V + T (∂V/∂T)p]
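The bracketed term in Equation 137, V − T(∂V/∂T)p, which also sets the Joule – Thomson coefficient above, can be evaluated from an equation of state. A minimal sketch, assuming an ideal gas (V = RT/p) and a central finite difference for the derivative:

```python
R = 8.314  # gas constant, J/(mol K)

def V_ideal(T, p):
    """Molar volume of an ideal gas, V = R*T/p."""
    return R * T / p

def bracket_137(V, T, p, h=1e-4):
    """Evaluate V - T*(dV/dT)_p, the coefficient of dp in
    Equation (137), with a central difference for (dV/dT)_p."""
    dVdT = (V(T + h, p) - V(T - h, p)) / (2 * h)
    return V(T, p) - T * dVdT

val = bracket_137(V_ideal, T=300.0, p=1.0e5)
```

For an ideal gas the bracket vanishes, so the enthalpy is independent of pressure and the Joule – Thomson coefficient (1/Cp)[−V + T(∂V/∂T)p] is zero; a real-gas equation of state gives a nonzero value.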

Type 4 (30 possibilities plus reciprocals).

General: (∂α/∂β)c,   Specific: (∂G/∂A)p

Now, expand through the introduction of a new variable using Equation 127:

(∂G/∂A)p = (∂G/∂T)p (∂T/∂A)p = (∂G/∂T)p/(∂A/∂T)p

This operation has created two type 2 derivatives. Substitution yields

(∂G/∂A)p = S/[S + p (∂V/∂T)p]

Type 5 (60 possibilities plus reciprocals).

General: (∂α/∂b)β,   Specific: (∂G/∂p)A

Starting from Equation 132 for dG gives

(∂G/∂p)A = −S (∂T/∂p)A + V

The derivative is a type 3 derivative and can be evaluated by using Equation 125:

(∂G/∂p)A = S (∂A/∂p)T/(∂A/∂T)p + V

The two type 2 derivatives are then evaluated:

(∂G/∂p)A = S p (∂V/∂p)T/[S + p (∂V/∂T)p] + V

These derivatives are also of interest for free expansions or isentropic changes. Entropy is a variable in at least one of the partial derivatives.

Type 6 (30 possibilities plus reciprocals).

General: (∂α/∂β)γ,   Specific: (∂G/∂A)H

Equation 127 is used to obtain two type 5 derivatives:

(∂G/∂A)H = (∂G/∂T)H/(∂A/∂T)H

These can then be evaluated by using the procedures for type 5 derivatives.

The difference in molar heat capacities (Cp − Cv) can be derived in similar fashion. Using Equation 139 for Cv yields

Cv = T (∂S/∂T)V

To evaluate the derivative, Equation 126 is used to express dS in terms of p and T:

(∂S/∂T)V = −(∂V/∂T)p (∂p/∂T)V + Cp/T

Substitution for (∂p/∂T)V and rearrangement give

Cp − Cv = T (∂V/∂T)p (∂p/∂T)V = −T (∂V/∂T)p² (∂p/∂V)T

Use of this equation permits the rearrangement of Equation 142 into

(∂V/∂T)S = [(∂V/∂T)p² + (Cp/T) (∂V/∂p)T]/(∂V/∂T)p

The ratio of heat capacities is

Cp/Cv = T (∂S/∂T)p/[T (∂S/∂T)V]

Expansion by using Equation 125 gives

Cp/Cv = [−(∂p/∂T)S (∂S/∂p)T]/[−(∂V/∂T)S (∂S/∂V)T]

and the ratios are then

Cp/Cv = (∂p/∂V)S (∂V/∂p)T

Using Equation 125 gives

Cp/Cv = −(∂p/∂V)S (∂T/∂p)V (∂V/∂T)p

13. References

Specific References
1. D. E. Seborg, T. F. Edgar, D. A. Mellichamp: Process Dynamics and Control, 2nd ed., John Wiley & Sons, New York 2004.
2. G. Forsythe, C. B. Moler: Computer Solution of Linear Algebraic Systems, Prentice-Hall, Englewood Cliffs 1967.
3. B. A. Finlayson: Nonlinear Analysis in Chemical Engineering, McGraw-Hill, New York 1980; reprinted, Ravenna Park, Seattle 2003.

4. S. C. Eisenstat, M. H. Schultz, A. H. Sherman: "Algorithms and Data Structures for Sparse Symmetric Gaussian Elimination," SIAM J. Sci. Stat. Comput. 2 (1981) 225 – 237.
5. I. S. Duff: Direct Methods for Sparse Matrices, Clarendon Press, Oxford 1986.
6. H. S. Price, K. H. Coats: "Direct Methods in Reservoir Simulation," Soc. Pet. Eng. J. 14 (1974) 295 – 308.
7. A. Bykat: "A Note on an Element Re-Ordering Scheme," Int. J. Num. Methods Eng. 11 (1977) 194 – 198.
8. M. R. Hestenes, E. Stiefel: "Methods of Conjugate Gradients for Solving Linear Systems," J. Res. Nat. Bur. Stand. 49 (1952) 409 – 436.
9. Y. Saad: Iterative Methods for Sparse Linear Systems, 2nd ed., Soc. Ind. Appl. Math., Philadelphia 2003.
10. Y. Saad, M. Schultz: "GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems," SIAM J. Sci. Statist. Comput. 7 (1986) 856 – 869.
11. http://mathworld.wolfram.com/GeneralizedMinimalResidualMethod.html.
12. http://www.netlib.org/linalg/html_templates/Templates.html.
13. http://software.sandia.gov/.
14. E. Isaacson, H. B. Keller: Analysis of Numerical Methods, J. Wiley and Sons, New York 1966.
15. W. H. Press, B. P. Flannery, S. A. Teukolsky, W. T. Vetterling: Numerical Recipes, Cambridge University Press, Cambridge 1986.
16. R. W. H. Sargent: "A Review of Methods for Solving Non-linear Algebraic Equations," in R. S. H. Mah, W. D. Seider (eds.): Foundations of Computer-Aided Chemical Process Design, American Institute of Chemical Engineers, New York 1981.
17. J. D. Seader: "Computer Modeling of Chemical Processes," AIChE Monogr. Ser. 81 (1985) no. 15.
18. http://software.sandia.gov/trilinos/packages/nox/loca_user.html.
19. N. R. Amundson: Mathematical Methods in Chemical Engineering, Prentice-Hall, Englewood Cliffs, N.J. 1966.
20. R. H. Perry, D. W. Green: Perry's Chemical Engineers' Handbook, 7th ed., McGraw-Hill, New York 1997.
21. D. S. Watkins: "Understanding the QR Algorithm," SIAM Rev. 24 (1982) 427 – 440.
22. G. F. Carey, K. Sepehrnoori: "Gershgorin Theory for Stiffness and Stability of Evolution Systems and Convection-Diffusion," Comp. Meth. Appl. Mech. 22 (1980) 23 – 48.
23. M. Abramowitz, I. A. Stegun: Handbook of Mathematical Functions, National Bureau of Standards, Washington, D.C. 1972.
24. J. C. Daubisse: "Some Results about Approximation of Functions of One or Two Variables by Sums of Exponentials," Int. J. Num. Meth. Eng. 23 (1986) 1959 – 1967.
25. O. C. McGehee: An Introduction to Complex Analysis, John Wiley & Sons, New York 2000.
26. H. A. Priestley: Introduction to Complex Analysis, Oxford University Press, New York 2003.
27. Y. K. Kwok: Applied Complex Variables for Scientists and Engineers, Cambridge University Press, New York 2002.
28. N. Asmar, G. C. Jones: Applied Complex Analysis with Partial Differential Equations, Prentice Hall, Upper Saddle River, NJ 2002.
29. M. J. Ablowitz, A. S. Fokas: Complex Variables: Introduction and Applications, Cambridge University Press, New York 2003.
30. J. W. Brown, R. V. Churchill: Complex Variables and Applications, 6th ed., McGraw-Hill, New York 1996; 7th ed. 2003.
31. W. Kaplan: Advanced Calculus, 5th ed., Addison-Wesley, Redwood City, Calif. 2003.
32. E. Hille: Analytic Function Theory, Ginn and Co., Boston 1959.
33. R. V. Churchill: Operational Mathematics, McGraw-Hill, New York 1958.
34. J. W. Brown, R. V. Churchill: Fourier Series and Boundary Value Problems, 6th ed., McGraw-Hill, New York 2000.
35. R. V. Churchill: Operational Mathematics, 3rd ed., McGraw-Hill, New York 1972.
36. B. Davies: Integral Transforms and Their Applications, 3rd ed., Springer, Heidelberg 2002.
37. D. G. Duffy: Transform Methods for Solving Partial Differential Equations, Chapman & Hall/CRC, New York 2004.
38. A. Varma, M. Morbidelli: Mathematical Methods in Chemical Engineering, Oxford, New York 1997.
39. H. Bateman: Tables of Integral Transforms, vol. I, McGraw-Hill, New York 1954.
40. H. F. Weinberger: A First Course in Partial Differential Equations, Blaisdell, Waltham, Mass. 1965.
41. R. V. Churchill: Operational Mathematics, McGraw-Hill, New York 1958.

42. J. T. Hsu, J. S. Dranoff: "Numerical Inversion of Certain Laplace Transforms by the Direct Application of Fast Fourier Transform (FFT) Algorithm," Comput. Chem. Eng. 11 (1987) 101 – 110.
43. E. Kreyszig: Advanced Engineering Mathematics, 9th ed., John Wiley & Sons, New York 2006.
44. H. S. Carslaw, J. C. Jaeger: Conduction of Heat in Solids, 2nd ed., Clarendon Press, Oxford – London 1959.
45. R. B. Bird, R. C. Armstrong, O. Hassager: Dynamics of Polymeric Liquids, 2nd ed., Appendix A, Wiley-Interscience, New York 1987.
46. R. S. Rivlin, J. Rat. Mech. Anal. 4 (1955) 681 – 702.
47. N. R. Amundson: Mathematical Methods in Chemical Engineering; Matrices and Their Application, Prentice-Hall, Englewood Cliffs, N.J. 1966.
48. I. S. Sokolnikoff, E. S. Sokolnikoff: Higher Mathematics for Engineers and Physicists, McGraw-Hill, New York 1941.
49. R. C. Wrede, M. R. Spiegel: Schaum's Outline of Theory and Problems of Advanced Calculus, 2nd ed., McGraw-Hill, New York 2006.
50. P. M. Morse, H. Feshbach: Methods of Theoretical Physics, McGraw-Hill, New York 1953.
51. B. A. Finlayson: The Method of Weighted Residuals and Variational Principles, Academic Press, New York 1972.
52. G. Forsythe, M. Malcolm, C. Moler: Computer Methods for Mathematical Computation, Prentice-Hall, Englewood Cliffs, N.J. 1977.
53. J. D. Lambert: Computational Methods in Ordinary Differential Equations, J. Wiley and Sons, New York 1973.
54. N. R. Amundson: Mathematical Methods in Chemical Engineering, Prentice-Hall, Englewood Cliffs, N.J. 1966.
55. J. R. Rice: Numerical Methods, Software, and Analysis, McGraw-Hill, New York 1983.
56. M. B. Bogacki, K. Alejski, J. Szymanowski: "The Fast Method of the Solution of Reacting Distillation Problem," Comput. Chem. Eng. 13 (1989) 1081 – 1085.
57. C. W. Gear: Numerical Initial-Value Problems in Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, N.J. 1971.
58. N. B. Ferguson, B. A. Finlayson: "Transient Modeling of a Catalytic Converter to Reduce Nitric Oxide in Automobile Exhaust," AIChE J. 20 (1974) 539 – 550.
59. W. F. Ramirez: Computational Methods for Process Simulations, 2nd ed., Butterworth-Heinemann, Boston 1997.
60. U. M. Ascher, L. R. Petzold: Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations, SIAM, Philadelphia 1998.
61. K. E. Brenan, S. L. Campbell, L. R. Petzold: Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations, Elsevier, Amsterdam 1989.
62. C. C. Pontelides, D. Gritsis, K. R. Morison, R. W. H. Sargent: "The Mathematical Modelling of Transient Systems Using Differential-Algebraic Equations," Comput. Chem. Eng. 12 (1988) 449 – 454.
63. G. A. Byrne, P. R. Ponzi: "Differential-Algebraic Systems, Their Applications and Solutions," Comput. Chem. Eng. 12 (1988) 377 – 382.
64. L. F. Shampine, M. W. Reichelt: "The MATLAB ODE Suite," SIAM J. Sci. Comp. 18 (1997) 122.
65. M. Kubicek, M. Marek: Computational Methods in Bifurcation Theory and Dissipative Structures, Springer Verlag, Berlin – Heidelberg – New York – Tokyo 1983.
66. T. F. C. Chan, H. B. Keller: "Arc-Length Continuation and Multi-Grid Techniques for Nonlinear Elliptic Eigenvalue Problems," SIAM J. Sci. Stat. Comput. 3 (1982) 173 – 194.
67. M. F. Doherty, J. M. Ottino: "Chaos in Deterministic Systems: Strange Attractors, Turbulence and Applications in Chemical Engineering," Chem. Eng. Sci. 43 (1988) 139 – 183.
68. M. P. Allen, D. J. Tildesley: Computer Simulation of Liquids, Clarendon Press, Oxford 1989.
69. D. Frenkel, B. Smit: Understanding Molecular Simulation, Academic Press, San Diego 2002.
70. J. M. Haile: Molecular Dynamics Simulation, John Wiley & Sons, New York 1992.
71. A. R. Leach: Molecular Modelling: Principles and Applications, Prentice Hall, Englewood Cliffs, NJ 2001.
72. T. Schlick: Molecular Modeling and Simulations, Springer, New York 2002.
73. R. B. Bird, W. E. Stewart, E. N. Lightfoot: Transport Phenomena, 2nd ed., John Wiley & Sons, New York 2002.
74. R. B. Bird, R. C. Armstrong, O. Hassager: Dynamics of Polymeric Liquids, 2nd ed., Wiley-Interscience, New York 1987.
75. P. V. Danckwerts: "Continuous Flow Systems," Chem. Eng. Sci. 2 (1953) 1 – 13.
76. J. F. Wehner, R. Wilhelm: "Boundary Conditions of Flow Reactor," Chem. Eng. Sci. 6 (1956) 89 – 93.
77. V. Hlaváček, H. Hofmann: "Modeling of Chemical Reactors – XVI – Steady-State Axial Heat and Mass Transfer in Tubular Reactors. An Analysis of the Uniqueness of Solutions," Chem. Eng. Sci. 25 (1970) 173 – 185.
78. B. A. Finlayson: Numerical Methods for Problems with Moving Fronts, Ravenna Park Publishing Inc., Seattle 1990.
79. E. Isaacson, H. B. Keller: Analysis of Numerical Methods, J. Wiley and Sons, New York 1966.
80. C. Lanczos: "Trigonometric Interpolation of Empirical and Analytical Functions," J. Math. Phys. (Cambridge Mass.) 17 (1938) 123 – 199.
81. C. Lanczos: Applied Analysis, Prentice-Hall, Englewood Cliffs, N.J. 1956.
82. J. Villadsen, W. E. Stewart: "Solution of Boundary-Value Problems by Orthogonal Collocation," Chem. Eng. Sci. 22 (1967) 1483 – 1501.
83. M. L. Michelsen, J. Villadsen: "Polynomial Solution of Differential Equations" pp.
89. J. Wang, R. G. Anthony, A. Akgerman: "Mathematical Simulations of the Performance of Trickle Bed and Slurry Reactors for Methanol Synthesis," Comp. Chem. Eng. 29 (2005) 2474 – 2484.
90. V. K. C. Lee, J. F. Porter, G. McKay, A. P. Mathews: "Application of Solid-Phase Concentration-Dependent HSDM to the Acid Dye Adsorption System," AIChE J. 51 (2005) 323 – 332.
91. C. deBoor, B. Swartz: "Collocation at Gaussian Points," SIAM J. Num. Anal. 10 (1973) 582 – 606.
92. R. F. Sincovec: "On the Solution of the Equations Arising From Collocation With Cubic B-Splines," Math. Comp. 26 (1972) 893 – 895.
93. U. Ascher, J. Christiansen, R. D. Russell: "A Collocation Solver for Mixed-Order Systems of Boundary-Value Problems," Math. Comp. 33 (1979) 659 – 679.
94. R. D. Russell, J. Christiansen: "Adaptive Mesh Selection Strategies for Solving Boundary-Value Problems," SIAM J. Num. Anal. 15 (1978) 59 – 80.
95. P. G. Ciarlet, M. H. Schultz, R. S. Varga:

75. P. V. Danckwerts: “Continuous Flow Systems,” Chem. Eng. Sci. 2 (1953) 1–13.
76. J. F. Wehner, R. Wilhelm: “Boundary Conditions of Flow Reactor,” Chem. Eng. Sci. 6 (1956) 89–93.
77. V. Hlaváček, H. Hofmann: “Modeling of Chemical Reactors-XVI-Steady-State Axial Heat and Mass Transfer in Tubular Reactors. An Analysis of the Uniqueness of Solutions,” Chem. Eng. Sci. 25 (1970) 173–185.
78. B. A. Finlayson: Numerical Methods for Problems with Moving Fronts, Ravenna Park Publishing Inc., Seattle 1990.
79. E. Isaacson, H. B. Keller: Analysis of Numerical Methods, J. Wiley and Sons, New York 1966.
80. C. Lanczos: “Trigonometric Interpolation of Empirical and Analytical Functions,” J. Math. Phys. (Cambridge Mass.) 17 (1938) 123–199.
81. C. Lanczos: Applied Analysis, Prentice-Hall, Englewood Cliffs, N.J. 1956.
82. J. Villadsen, W. E. Stewart: “Solution of Boundary-Value Problems by Orthogonal Collocation,” Chem. Eng. Sci. 22 (1967) 1483–1501.
83. M. L. Michelsen, J. Villadsen: “Polynomial Solution of Differential Equations,” pp. 341–368 in R. S. H. Mah, W. D. Seider (eds.): Foundations of Computer-Aided Chemical Process Design, Engineering Foundation, New York 1981.
84. W. E. Stewart, K. L. Levien, M. Morari: “Collocation Methods in Distillation,” in A. W. Westerberg, H. H. Chien (eds.): Proceedings of the Second Int. Conf. on Foundations of Computer-Aided Process Design, Computer Aids for Chemical Engineering Education (CACHE), Austin, Texas 1984, pp. 535–569.
85. W. E. Stewart, K. L. Levien, M. Morari: “Simulation of Fractionation by Orthogonal Collocation,” Chem. Eng. Sci. 40 (1985) 409–421.
86. C. L. E. Swartz, W. E. Stewart: “Finite-Element Steady State Simulation of Multiphase Distillation,” AIChE J. 33 (1987) 1977–1985.
87. K. Alhumaizi, R. Henda, M. Soliman: “Numerical Analysis of a reaction-diffusion-convection system,” Comp. Chem. Eng. 27 (2003) 579–594.
88. E. F. Costa, P. L. C. Lage, E. C. Biscaia, Jr.: “On the numerical solution and optimization of styrene polymerization in tubular reactors,” Comp. Chem. Eng. 27 (2003) 1591–1604.
89. J. Wang, R. G. Anthony, A. Akgerman: “Mathematical simulations of the performance of trickle bed and slurry reactors for methanol synthesis,” Comp. Chem. Eng. 29 (2005) 2474–2484.
90. V. K. C. Lee, J. F. Porter, G. McKay, A. P. Mathews: “Application of solid-phase concentration-dependent HSDM to the acid dye adsorption system,” AIChE J. 51 (2005) 323–332.
91. C. de Boor, B. Swartz: “Collocation at Gaussian Points,” SIAM J. Num. Anal. 10 (1973) 582–606.
92. R. F. Sincovec: “On the Solution of the Equations Arising From Collocation With Cubic B-Splines,” Math. Comp. 26 (1972) 893–895.
93. U. Ascher, J. Christiansen, R. D. Russell: “A Collocation Solver for Mixed-Order Systems of Boundary-Value Problems,” Math. Comp. 33 (1979) 659–679.
94. R. D. Russell, J. Christiansen: “Adaptive Mesh Selection Strategies for Solving Boundary-Value Problems,” SIAM J. Num. Anal. 15 (1978) 59–80.
95. P. G. Ciarlet, M. H. Schultz, R. S. Varga: “Nonlinear Boundary-Value Problems I. One Dimensional Problem,” Num. Math. 9 (1967) 394–430.
96. W. F. Ames: Numerical Methods for Partial Differential Equations, 2nd ed., Academic Press, New York 1977.
97. J. F. Botha, G. F. Pinder: Fundamental Concepts in the Numerical Solution of Differential Equations, Wiley-Interscience, New York 1983.
98. W. H. Press, B. P. Flannery, S. A. Teukolsky, W. T. Vetterling: Numerical Recipes, Cambridge Univ. Press, Cambridge 1986.
99. H. Schlichting: Boundary Layer Theory, 4th ed., McGraw-Hill, New York 1960.
100. R. Aris, N. R. Amundson: Mathematical Methods in Chemical Engineering, vol. 2, First-Order Partial Differential Equations with Applications, Prentice-Hall, Englewood Cliffs, NJ 1973.
101. R. Courant, D. Hilbert: Methods of Mathematical Physics, vol. I and II, Interscience, New York 1953, 1962.
102. P. M. Morse, H. Feshbach: Methods of Theoretical Physics, vol. I and II, McGraw-Hill, New York 1953.
103. A. D. Polyanin: Handbook of Linear Partial Differential Equations for Engineers and Scientists, Chapman and Hall/CRC, Boca Raton, FL 2002.
104. D. Ramkrishna, N. R. Amundson: Linear Operator Methods in Chemical Engineering with Applications to Transport and Chemical Reaction Systems, Prentice Hall, Englewood Cliffs, NJ 1985.
105. D. D. Joseph, M. Renardy, J. C. Saut: “Hyperbolicity and Change of Type in the Flow of Viscoelastic Fluids,” Arch. Rational Mech. Anal. 87 (1985) 213–251.
106. H. K. Rhee, R. Aris, N. R. Amundson: First-Order Partial Differential Equations, Prentice-Hall, Englewood Cliffs, N.J. 1986.
107. B. A. Finlayson: Numerical Methods for Problems with Moving Fronts, Ravenna Park Publishing Inc., Seattle 1990.
108. D. L. Book: Finite-Difference Techniques for Vectorized Fluid Dynamics Calculations, Springer Verlag, Berlin – Heidelberg – New York – Tokyo 1981.
109. G. A. Sod: Numerical Methods in Fluid Dynamics, Cambridge University Press, Cambridge 1985.
110. G. H. Xiu, J. L. Soares, P. Li, A. E. Rodrigues: “Simulation of five-step one-bed sorption-enhanced reaction process,” AIChE J. 48 (2002) 2817–2832.
111. A. Malek, S. Farooq: “Study of a six-bed pressure swing adsorption process,” AIChE J. 43 (1997) 2509–2523.
112. R. J. LeVeque: Numerical Methods for Conservation Laws, Birkhäuser, Basel 1992.
113. W. F. Ames: “Recent Developments in the Nonlinear Equations of Transport Processes,” Ind. Eng. Chem. Fundam. 8 (1969) 522–536.
114. W. F. Ames: Nonlinear Partial Differential Equations in Engineering, Academic Press, New York 1965.
115. W. E. Schiesser: The Numerical Method of Lines, Academic Press, San Diego 1991.
116. D. Gottlieb, S. A. Orszag: Numerical Analysis of Spectral Methods: Theory and Applications, SIAM, Philadelphia, PA 1977.
117. D. W. Peaceman: Fundamentals of Numerical Reservoir Simulation, Elsevier, Amsterdam 1977.
118. G. Juncu, R. Mihail: “Multigrid Solution of the Diffusion-Convection-Reaction Equations which Describe the Mass and/or Heat Transfer from or to a Spherical Particle,” Comput. Chem. Eng. 13 (1989) 259–270.
119. G. R. Buchanan: Schaum’s Outline of Finite Element Analysis, McGraw-Hill, New York 1995.
120. M. D. Gunzburger: Finite Element Methods for Viscous Incompressible Flows, Academic Press, San Diego 1989.
121. H. Kardestuncer, D. H. Norrie: Finite Element Handbook, McGraw-Hill, New York 1987.
122. J. N. Reddy, D. K. Gartling: The Finite Element Method in Heat Transfer and Fluid Dynamics, 2nd ed., CRC Press, Boca Raton, FL 2000.
123. O. C. Zienkiewicz, R. L. Taylor, J. Z. Zhu: The Finite Element Method: Its Basis & Fundamentals, vol. 1, 6th ed., Elsevier Butterworth-Heinemann, Burlington, MA 2005.
124. O. C. Zienkiewicz, R. L. Taylor: The Finite Element Method, Solid and Structural Mechanics, vol. 2, 5th ed., Butterworth-Heinemann, Burlington, MA 2000.
125. O. C. Zienkiewicz, R. L. Taylor: The Finite Element Method, Fluid Dynamics, vol. 3, 5th ed., Butterworth-Heinemann, Burlington, MA 2000.
126. Z. C. Li: Combined Methods for Elliptic Equations with Singularities, Interfaces and Infinities, Kluwer Academic Publishers, Boston, MA 1998.
127. G. J. Fix, S. Gulati, G. I. Wakoff: “On the use of singular functions with finite element approximations,” J. Comp. Phys. 13 (1973) 209–238.
128. N. M. Wigley: “On a method to subtract off a singularity at a corner for the Dirichlet or Neumann problem,” Math. Comp. 23 (1968) 395–401.
129. H. Y. Hu, Z. C. Li, A. H. D. Cheng: “Radial Basis Collocation Methods for Elliptic Boundary Value Problems,” Comp. Math. Applic. 50 (2005) 289–320.
130. S. J. Osher, R. P. Fedkiw: Level Set Methods and Dynamic Implicit Surfaces, Springer, New York 2002.
131. M. W. Chang, B. A. Finlayson: “On the Proper Boundary Condition for the Thermal Entry Problem,” Int. J. Num. Methods Eng. 15 (1980) 935–942.
132. R. Peyret, T. D. Taylor: Computational Methods for Fluid Flow, Springer Verlag, Berlin – Heidelberg – New York – Tokyo 1983.
133. P. M. Gresho, S. T. Chan, C. Upson, R. L. Lee: “A Modified Finite Element Method for Solving the Time-Dependent, Incompressible Navier – Stokes Equations,” Int. J. Num. Methods Fluids (1984) “Part 1: Theory” 557–589; “Part 2: Applications” 619–640.
134. P. M. Gresho, R. L. Sani: Incompressible Flow and the Finite Element Method, Advection-Diffusion, vol. 1, John Wiley & Sons, New York 1998.
135. P. M. Gresho, R. L. Sani: Incompressible Flow and the Finite Element Method, Isothermal Laminar Flow, vol. 2, John Wiley & Sons, New York 1998.
136. J. A. Sethian: Level Set Methods, Cambridge University Press, Cambridge 1996.
137. C. C. Lin, H. Lee, T. Lee, L. J. Weber: “A level set characteristic Galerkin finite element method for free surface flows,” Int. J. Num. Methods Fluids 49 (2005) 521–547.
138. S. Succi: The Lattice Boltzmann Equation for Fluid Dynamics and Beyond, Oxford University Press, Oxford 2001.
139. M. C. Sukop, D. T. Thorne, Jr.: Lattice Boltzmann Modeling: An Introduction for Geoscientists and Engineers, Springer, New York 2006.
140. S. Chen, G. D. Doolen: “Lattice Boltzmann method for fluid flows,” Annu. Rev. Fluid Mech. 30 (1998) 329–364.
141. H. T. Lau: Numerical Library in C for Scientists and Engineers, CRC Press, Boca Raton, FL 1994.
142. Y. Y. Al-Jaymany, G. Brenner, P. O. Brum: “Comparative study of lattice-Boltzmann and finite volume methods for the simulation of laminar flow through a 4:1 contraction,” Int. J. Num. Methods Fluids 46 (2004) 903–920.
143. R. L. Burden, J. D. Faires, A. C. Reynolds: Numerical Analysis, 8th ed., Brooks Cole, Pacific Grove, CA 2005.
144. S. C. Chapra, R. P. Canale: Numerical Methods for Engineers, 5th ed., McGraw-Hill, New York 2006.
145. H. T. Lau: Numerical Library in Java for Scientists and Engineers, CRC Press, Boca Raton, FL 2004.
146. K. W. Morton, D. F. Mayers: Numerical Solution of Partial Differential Equations, Cambridge University Press, New York 1994.
147. A. Quarteroni, A. Valli: Numerical Approximation of Partial Differential Equations, Springer, Heidelberg 1997.
148. F. Scheid: Schaum’s Outline of Numerical Analysis, 2nd ed., McGraw-Hill, New York 1989.
149. C. T. H. Baker: The Numerical Treatment of Integral Equations, Clarendon Press, Oxford 1977.
150. L. M. Delves, J. Walsh (eds.): Numerical Solution of Integral Equations, Clarendon Press, Oxford 1974.
151. P. Linz: Analytical and Numerical Methods for Volterra Equations, SIAM Publications, Philadelphia 1985.
152. M. A. Golberg (ed.): Numerical Solution of Integral Equations, Plenum Press, New York 1990.
153. C. Corduneanu: Integral Equations and Applications, Cambridge Univ. Press, Cambridge 1991.
154. R. Kress: Linear Integral Equations, 2nd ed., Springer, Heidelberg 1999.
155. P. K. Kythe, P. Puri: Computational Methods for Linear Integral Equations, Birkhäuser, Basel 2002.
156. G. Nagel, G. Kluge: “Non-Isothermal Multicomponent Adsorption Processes and Their Numerical Treatment by Means of Integro-Differential Equations,” Comput. Chem. Eng. 13 (1989) 1025–1030.
157. P. L. Mills, S. Lai, M. P. Duduković, P. A. Ramachandran: “A Numerical Study of Approximation Methods for Solution of Linear and Nonlinear Diffusion-Reaction Equations with Discontinuous Boundary Conditions,” Comput. Chem. Eng. 12 (1988) 37–53.
158. D. Duffy: Green’s Functions with Applications, Chapman and Hall/CRC, New York 2001.
159. I. Stakgold: Green’s Functions and Boundary Value Problems, 2nd ed., Interscience, New York 1997.
160. B. Davies: Integral Transforms and Their Applications, 3rd ed., Springer, New York 2002.
161. J. M. Bownds: “Theory and performance of a subroutine for solving Volterra integral equations,” J. Comput. 28 (1982) 317–332.
162. J. G. Blom, H. Brunner: “Algorithm 689: discretized collocation and iterated collocation for nonlinear Volterra integral equations of the second kind,” ACM Trans. Math. Software 17 (1991) 167–177.
163. N. B. Ferguson, B. A. Finlayson: “Error Bounds for Approximate Solutions to Nonlinear Ordinary Differential Equations,” AIChE J. 18 (1972) 1053–1059.
164. R. S. Dixit, L. L. Tavlarides: “Integral Method of Analysis of Fischer – Tropsch Synthesis Reactions in a Catalyst Pellet,” Chem. Eng. Sci. 37 (1982) 539–544.

165. L. Lapidus, G. F. Pinder: Numerical Solution of Partial Differential Equations in Science and Engineering, Wiley-Interscience, New York 1982.
166. C. K. Hsieh, H. Shang: “A Boundary Condition Dissection Method for the Solution of Boundary-Value Heat Conduction Problems with Position-Dependent Convective Coefficients,” Num. Heat Trans. Part B: Fund. 16 (1989) 245–255.
167. C. A. Brebbia, J. Dominguez: Boundary Elements – An Introductory Course, 2nd ed., Computational Mechanics Publications, Southampton 1992.
168. J. Mackerle, C. A. Brebbia (eds.): Boundary Element Reference Book, Springer Verlag, Berlin – Heidelberg – New York – Tokyo 1988.
169. O. L. Mangasarian: Nonlinear Programming, McGraw-Hill, New York 1969.
170. G. B. Dantzig: Linear Programming and Extensions, Princeton University Press, Princeton, N.J. 1963.
171. S. J. Wright: Primal-Dual Interior Point Methods, SIAM, Philadelphia 1996.
172. F. Hillier, G. J. Lieberman: Introduction to Operations Research, Holden-Day, San Francisco 1974.
173. T. F. Edgar, D. M. Himmelblau, L. S. Lasdon: Optimization of Chemical Processes, McGraw-Hill Inc., New York 2002.
174. K. Schittkowski: More Test Examples for Nonlinear Programming Codes, Lecture Notes in Economics and Mathematical Systems no. 282, Springer-Verlag, Berlin 1987.
175. J. Nocedal, S. J. Wright: Numerical Optimization, Springer, New York 1999.
176. R. Fletcher: Practical Methods of Optimization, John Wiley & Sons, Ltd., Chichester, UK 1987.
177. A. Wächter, L. T. Biegler: “On the Implementation of an Interior Point Filter Line Search Algorithm for Large-Scale Nonlinear Programming,” Mathematical Programming 106 (2006) no. 1, 25–57.
178. C. V. Rao, J. B. Rawlings, S. Wright: “On the Application of Interior Point Methods to Model Predictive Control,” J. Optim. Theory Appl. 99 (1998) 723.
179. J. Albuquerque, V. Gopal, G. Staus, L. T. Biegler, B. E. Ydstie: “Interior Point SQP Strategies for Large-scale Structured Process Optimization Problems,” Comp. Chem. Eng. 23 (1997) 283.
180. T. Jockenhoevel, L. T. Biegler, A. Wächter: “Dynamic Optimization of the Tennessee Eastman Process Using the OptControlCentre,” Comp. Chem. Eng. 27 (2003) no. 11, 1513–1531.
181. L. T. Biegler, A. M. Cervantes, A. Wächter: “Advances in Simultaneous Strategies for Dynamic Process Optimization,” Chemical Engineering Science 57 (2002) no. 4, 575–593.
182. C. D. Laird, L. T. Biegler, B. van Bloemen Waanders, R. A. Bartlett: “Time Dependent Contaminant Source Determination for Municipal Water Networks Using Large Scale Optimization,” ASCE Journal of Water Resource Management and Planning 131 (2005) no. 2, 125.
183. A. R. Conn, N. Gould, P. Toint: Trust Region Methods, SIAM, Philadelphia 2000.
184. A. Drud: “CONOPT – A Large Scale GRG Code,” ORSA Journal on Computing 6 (1994) 207–216.
185. B. A. Murtagh, M. A. Saunders: MINOS 5.1 User’s Guide, Technical Report SOL 83-20R, Stanford University, Palo Alto 1987.
186. R. H. Byrd, M. E. Hribar, J. Nocedal: “An Interior Point Algorithm for Large Scale Nonlinear Programming,” SIAM J. Opt. 9 (1999) no. 4, 877.
187. R. Fletcher, N. I. M. Gould, S. Leyffer, Ph. L. Toint, A. Wächter: “Global Convergence of a Trust-region SQP-filter Algorithm for General Nonlinear Programming,” SIAM J. Opt. 13 (2002) no. 3, 635–659.
188. R. J. Vanderbei, D. F. Shanno: An Interior Point Algorithm for Non-convex Nonlinear Programming, Technical Report SOR-97-21, CEOR, Princeton University, Princeton, NJ 1997.
189. P. E. Gill, W. Murray, M. Wright: Practical Optimization, Academic Press, New York 1981.
190. P. E. Gill, W. Murray, M. A. Saunders: User’s Guide for SNOPT: A FORTRAN Package for Large-scale Nonlinear Programming, Technical Report, Department of Mathematics, University of California, San Diego 1998.
191. J. T. Betts: Practical Methods for Optimal Control Using Nonlinear Programming, Advances in Design and Control 3, SIAM, Philadelphia 2001.
192. gPROMS User’s Guide, PSE Ltd., London 2002.
193. G. E. P. Box: “Evolutionary Operation: A Method for Increasing Industrial Productivity,” Applied Statistics 6 (1957) 81–101.
194. R. Hooke, T. A. Jeeves: “Direct Search Solution of Numerical and Statistical Problems,” J. ACM 8 (1961) 212.

195. M. J. D. Powell: “An Efficient Method for Finding the Minimum of a Function of Several Variables without Calculating Derivatives,” Comput. J. 7 (1964) 155.
196. J. A. Nelder, R. Mead: “A Simplex Method for Function Minimization,” Computer Journal 7 (1965) 308.
197. R. Luus, T. H. I. Jaakola: “Direct Search for Complex Systems,” AIChE J. 19 (1973) 645–646.
198. R. Goulcher, J. C. Long: “The Solution of Steady State Chemical Engineering Optimization Problems using a Random Search Algorithm,” Comp. Chem. Eng. 2 (1978) 23.
199. J. R. Banga, W. D. Seider: “Global Optimization of Chemical Processes using Stochastic Algorithms,” in C. Floudas, P. Pardalos (eds.): State of the Art in Global Optimization, Kluwer, Dordrecht 1996, p. 563.
200. P. J. M. van Laarhoven, E. H. L. Aarts: Simulated Annealing: Theory and Applications, Reidel Publishing, Dordrecht 1987.
201. J. H. Holland: Adaptations in Natural and Artificial Systems, University of Michigan Press, Ann Arbor 1975.
202. J. E. Dennis, V. Torczon: “Direct Search Methods on Parallel Machines,” SIAM J. Opt. 1 (1991) 448.
203. A. R. Conn, K. Scheinberg, P. Toint: “Recent Progress in Unconstrained Nonlinear Optimization without Derivatives,” Mathematical Programming, Series B, 79 (1997) no. 3, 397.
204. M. Eldred: “DAKOTA: A Multilevel Parallel Object-Oriented Framework for Design Optimization, Parameter Estimation, Uncertainty Quantification, and Sensitivity Analysis,” 2002, http://endo.sandia.gov/DAKOTA/software.html
205. A. J. Booker et al.: “A Rigorous Framework for Optimization of Expensive Functions by Surrogates,” CRPC Technical Report 98739, Rice University, Houston TX, February 1998.
206. R. Horst, H. Tuy: Global Optimization: Deterministic Approaches, 3rd ed., Springer Verlag, Berlin 1996.
207. I. Quesada, I. E. Grossmann: “A Global Optimization Algorithm for Linear Fractional and Bilinear Programs,” Journal of Global Optimization 6 (1995) no. 1, 39–76.
208. G. P. McCormick: “Computability of Global Solutions to Factorable Nonconvex Programs: Part I – Convex Underestimating Problems,” Mathematical Programming 10 (1976) 147–175.
209. H. S. Ryoo, N. V. Sahinidis: “Global Optimization of Nonconvex NLPs and MINLPs with Applications in Process Design,” Comp. Chem. Eng. 19 (1995) no. 5, 551–566.
210. M. Tawarmalani, N. V. Sahinidis: “Global Optimization of Mixed Integer Nonlinear Programs: A Theoretical and Computational Study,” Mathematical Programming 99 (2004) no. 3, 563–591.
211. C. S. Adjiman, I. P. Androulakis, C. A. Floudas: “Global Optimization of MINLP Problems in Process Synthesis and Design,” Comp. Chem. Eng. 21 (1997) Suppl., S445–S450.
212. C. S. Adjiman, I. P. Androulakis, C. A. Floudas: “Global Optimization of Mixed-Integer Nonlinear Problems,” AIChE Journal 46 (2000) no. 9, 1769–1797.
213. C. S. Adjiman, I. P. Androulakis, C. D. Maranas, C. A. Floudas: “A Global Optimization Method, αBB, for Process Design,” Comp. Chem. Eng. 20 (1996) Suppl., S419–S424.
214. J. M. Zamora, I. E. Grossmann: “A Branch and Contract Algorithm for Problems with Concave Univariate, Bilinear and Linear Fractional Terms,” Journal of Global Optimization 14 (1999) no. 3, 217–249.
215. E. M. B. Smith, C. C. Pantelides: “Global Optimization of Nonconvex NLPs and MINLPs with Applications in Process Design,” Comp. Chem. Eng. 21 (1997) Suppl., S791–S796.
216. J. E. Falk, R. M. Soland: “An Algorithm for Separable Nonconvex Programming Problems,” Management Science 15 (1969) 550–569.
217. F. A. Al-Khayyal, J. E. Falk: “Jointly Constrained Biconvex Programming,” Mathematics of Operations Research 8 (1983) 273–286.
218. F. A. Al-Khayyal: “Jointly Constrained Bilinear Programs and Related Problems: An Overview,” Computers and Mathematics with Applications 19 (1990) 53–62.
219. I. Quesada, I. E. Grossmann: “A Global Optimization Algorithm for Heat Exchanger Networks,” Ind. Eng. Chem. Res. 32 (1993) 487–499.
220. H. D. Sherali, A. Alameddine: “A New Reformulation-Linearization Technique for Bilinear Programming Problems,” Journal of Global Optimization 2 (1992) 379–410.
221. H. D. Sherali, C. H. Tuncbilek: “A Reformulation-Convexification Approach for Solving Nonconvex Quadratic Programming Problems,” Journal of Global Optimization 7 (1995) 1–31.
222. C. A. Floudas: Deterministic Global Optimization: Theory, Methods and Applications, Kluwer Academic Publishers, Dordrecht, The Netherlands 2000.
223. M. Tawarmalani, N. Sahinidis: Convexification and Global Optimization in Continuous and Mixed-Integer Nonlinear Programming: Theory, Algorithms, Software, and Applications, Kluwer Academic Publishers, Dordrecht 2002.
224. R. Horst, H. Tuy: Global Optimization: Deterministic Approaches, Springer-Verlag, Berlin 1993.
225. A. Neumaier, O. Shcherbina, W. Huyer, T. Vinko: “A Comparison of Complete Global Optimization Solvers,” Math. Programming B 103 (2005) 335–356.
226. L. Lovász, A. Schrijver: “Cones of Matrices and Set-functions and 0-1 Optimization,” SIAM J. Opt. 1 (1991) 166–190.
227. H. D. Sherali, W. P. Adams: “A Hierarchy of Relaxations Between the Continuous and Convex Hull Representations for Zero-One Programming Problems,” SIAM J. Discrete Math. 3 (1990) no. 3, 411–430.
228. E. Balas, S. Ceria, G. Cornuejols: “A Lift-and-Project Cutting Plane Algorithm for Mixed 0-1 Programs,” Mathematical Programming 58 (1993) 295–324.
229. A. H. Land, A. G. Doig: “An Automatic Method for Solving Discrete Programming Problems,” Econometrica 28 (1960) 497–520.
230. E. Balas: “An Additive Algorithm for Solving Linear Programs with Zero-One Variables,” Operations Research 13 (1965) 517–546.
231. R. J. Dakin: “A Tree Search Algorithm for Mixed-Integer Programming Problems,” Computer Journal 8 (1965) 250–255.
232. R. E. Gomory: “Outline of an Algorithm for Integer Solutions to Linear Programs,” Bulletin of the American Mathematics Society 64 (1958) 275–278.
233. H. P. Crowder, E. L. Johnson, M. W. Padberg: “Solving Large-Scale Zero-One Linear Programming Problems,” Operations Research 31 (1983) 803–834.
234. T. J. Van Roy, L. A. Wolsey: “Valid Inequalities for Mixed 0-1 Programs,” Discrete Applied Mathematics 14 (1986) 199–213.
235. E. L. Johnson, G. L. Nemhauser, M. W. P. Savelsbergh: “Progress in Linear Programming Based Branch-and-Bound Algorithms: An Exposition,” INFORMS Journal of Computing 12 (2000) 2–23.
236. J. F. Benders: “Partitioning Procedures for Solving Mixed-variables Programming Problems,” Numer. Math. 4 (1962) 238–252.
237. G. L. Nemhauser, L. A. Wolsey: Integer and Combinatorial Optimization, Wiley-Interscience, New York 1988.
238. C. Barnhart, E. L. Johnson, G. L. Nemhauser, M. W. P. Savelsbergh, P. H. Vance: “Branch-and-price: Column Generation for Solving Huge Integer Programs,” Operations Research 46 (1998) 316–329.
239. O. K. Gupta, V. Ravindran: “Branch and Bound Experiments in Convex Nonlinear Integer Programming,” Management Science 31 (1985) no. 12, 1533–1546.
240. S. Nabar, L. Schrage: “Modeling and Solving Nonlinear Integer Programming Problems,” presented at Annual AIChE Meeting, Chicago 1991.
241. B. Borchers, J. E. Mitchell: “An Improved Branch and Bound Algorithm for Mixed Integer Nonlinear Programming,” Computers and Operations Research 21 (1994) 359–367.
242. R. Stubbs, S. Mehrotra: “A Branch-and-Cut Method for 0-1 Mixed Convex Programming,” Mathematical Programming 86 (1999) no. 3, 515–532.
243. S. Leyffer: “Integrating SQP and Branch and Bound for Mixed Integer Nonlinear Programming,” Computational Optimization and Applications 18 (2001) 295–309.
244. A. M. Geoffrion: “Generalized Benders Decomposition,” Journal of Optimization Theory and Applications 10 (1972) no. 4, 237–260.
245. M. A. Duran, I. E. Grossmann: “An Outer-Approximation Algorithm for a Class of Mixed-integer Nonlinear Programs,” Math Programming 36 (1986) 307.
246. X. Yuan, S. Zhang, L. Piboleau, S. Domenech: “Une Methode d’optimisation Nonlineare en Variables Mixtes pour la Conception de Procedes,” RAIRO 22 (1988) 331.
247. R. Fletcher, S. Leyffer: “Solving Mixed Integer Nonlinear Programs by Outer Approximation,” Math Programming 66 (1994) 327.

248. I. Quesada, I. E. Grossmann: “An LP/NLP Based Branch and Bound Algorithm for Convex MINLP Optimization Problems,” Comp. Chem. Eng. 16 (1992) 937–947.
249. T. Westerlund, F. Pettersson: “A Cutting Plane Method for Solving Convex MINLP Problems,” Comp. Chem. Eng. 19 (1995) S131–S136.
250. O. E. Flippo, A. H. G. R. Kan: “Decomposition in General Mathematical Programming,” Mathematical Programming 60 (1993) 361–382.
251. T. L. Magnanti, R. T. Wong: “Accelerated Benders Decomposition: Algorithmic Enhancement and Model Selection Criteria,” Operations Research 29 (1981) 464–484.
252. N. V. Sahinidis, I. E. Grossmann: “Convergence Properties of Generalized Benders Decomposition,” Comp. Chem. Eng. 15 (1991) 481.
253. J. E. Kelley, Jr.: “The Cutting-Plane Method for Solving Convex Programs,” J. SIAM 8 (1960) 703–712.
254. S. Leyffer: “Deterministic Methods for Mixed-Integer Nonlinear Programming,” Ph.D. thesis, Department of Mathematics and Computer Science, University of Dundee, Dundee 1993.
255. M. S. Bazaraa, H. D. Sherali, C. M. Shetty: Nonlinear Programming, John Wiley & Sons, Inc., New York 1994.
256. G. R. Kocis, I. E. Grossmann: “Relaxation Strategy for the Structural Optimization of Process Flowsheets,” Ind. Eng. Chem. Res. 26 (1987) 1869.
257. I. E. Grossmann: “Mixed-Integer Optimization Techniques for Algorithmic Process Synthesis,” Advances in Chemical Engineering, vol. 23, Process Synthesis, Academic Press, London 1996, pp. 171–246.
258. E. M. B. Smith, C. C. Pantelides: “A Symbolic Reformulation/Spatial Branch and Bound Algorithm for the Global Optimization of Nonconvex MINLPs,” Comp. Chem. Eng. 23 (1999) 457–478.
259. P. Kesavan, P. I. Barton: “Generalized Branch-and-cut Framework for Mixed-integer Nonlinear Optimization Problems,” Comp. Chem. Eng. 24 (2000) 1361–1366.
260. S. Lee, I. E. Grossmann: “A Global Optimization Algorithm for Nonconvex Generalized Disjunctive Programming and Applications to Process Systems,” Comp. Chem. Eng. 25 (2001) 1675–1697.
261. R. Pörn, T. Westerlund: “A Cutting Plane Method for Minimizing Pseudo-convex Functions in the Mixed-integer Case,” Comp. Chem. Eng. 24 (2000) 2655–2665.
262. J. Viswanathan, I. E. Grossmann: “A Combined Penalty Function and Outer-Approximation Method for MINLP Optimization,” Comp. Chem. Eng. 14 (1990) 769.
263. A. Brooke, D. Kendrick, A. Meeraus, R. Raman: GAMS – A User’s Guide, www.gams.com, 1998.
264. N. V. Sahinidis: “BARON: A General Purpose Global Optimization Software Package,” Journal of Global Optimization 8 (1996) no. 2, 201–205.
265. C. A. Schweiger, C. A. Floudas: “Process Synthesis, Design and Control: A Mixed Integer Optimal Control Framework,” Proceedings of DYCOPS-5 on Dynamics and Control of Process Systems, Corfu, Greece 1998, pp. 189–194.
266. R. Raman, I. E. Grossmann: “Modelling and Computational Techniques for Logic Based Integer Programming,” Comp. Chem. Eng. 18 (1994) no. 7, 563.
267. J. N. Hooker, M. A. Osorio: “Mixed Logical/Linear Programming,” Discrete Applied Mathematics 96-97 (1999) 395–442.
268. P. V. Hentenryck: Constraint Satisfaction in Logic Programming, MIT Press, Cambridge, MA 1989.
269. J. N. Hooker: Logic-Based Methods for Optimization: Combining Optimization and Constraint Satisfaction, John Wiley & Sons, New York 2000.
270. E. Balas: “Disjunctive Programming,” Annals of Discrete Mathematics 5 (1979) 3–51.
271. H. P. Williams: Model Building in Mathematical Programming, John Wiley, Chichester 1985.
272. R. Raman, I. E. Grossmann: “Relation between MILP Modelling and Logical Inference for Chemical Process Synthesis,” Comp. Chem. Eng. 15 (1991) no. 2, 73.
273. S. Lee, I. E. Grossmann: “New Algorithms for Nonlinear Generalized Disjunctive Programming,” Comp. Chem. Eng. 24 (2000) no. 9-10, 2125–2141.
274. J. Hiriart-Urruty, C. Lemaréchal: Convex Analysis and Minimization Algorithms, Springer-Verlag, Berlin – New York 1993.
275. E. Balas: “Disjunctive Programming and a Hierarchy of Relaxations for Discrete Optimization Problems,” SIAM J. Alg. Disc. Meth. 6 (1985) 466–486.
276. I. E. Grossmann, S. Lee: “Generalized Disjunctive Programming: Nonlinear Convex Hull Relaxation and Algorithms,” Computational Optimization and Applications 26 (2003) 83–100.
277. S. Lee, I. E. Grossmann: “Logic-based Modeling and Solution of Nonlinear Discrete/Continuous Optimization Problems,” Annals of Operations Research 139 (2005) 267–288.
278. ILOG, Gentilly Cedex, France 1999, www.ilog.com
279. M. Dincbas, P. Van Hentenryck, H. Simonis, A. Aggoun, T. Graf, F. Berthier: “The Constraint Logic Programming Language CHIP,” FGCS-88: Proceedings of the International Conference on Fifth Generation Computer Systems, Tokyo, pp. 693–702.
280. M. Wallace, S. Novello, J. Schimpf: “ECLiPSe: A Platform for Constraint Logic Programming,” ICL Systems Journal 12 (1997) no. 1, 159–200.
281. V. Pontryagin, V. Boltyanskii, R. Gamkrelidze, E. Mishchenko: The Mathematical Theory of Optimal Processes, Interscience Publishers Inc., New York, NY 1962.
282. A. E. Bryson, Y. C. Ho: Applied Optimal Control: Optimization, Estimation, and Control, Ginn and Company, Waltham, MA 1969.
283. V. Vassiliadis: PhD Thesis, Imperial College, University of London 1993.
284. W. F. Feehery, P. I. Barton: “Dynamic Optimization with State Variable Path Constraints,” Comp. Chem. Eng. 22 (1998) 1241–1256.
285. L. Hasdorff: Gradient Optimization and Nonlinear Control, Wiley-Interscience, New York, NY 1976.
286. R. W. H. Sargent, G. R. Sullivan: “Development of Feed Changeover Policies for Refinery Distillation Units,” Ind. Eng. Chem. Process Des. Dev. 18 (1979) 113–124.
287. V. Vassiliadis, R. W. H. Sargent, C. Pantelides: “Solution of a Class of Multistage Dynamic Optimization Problems,” I & EC Research 33 (1994) 2123.
288. K. F. Bloss, L. T. Biegler, W. E. Schiesser: “Dynamic Process Optimization through Adjoint Formulations and Constraint Aggregation,” Ind. Eng. Chem. Res. 38 (1999) 421–432.
289. H. G. Bock, K. J. Plitt: “A Multiple Shooting Algorithm for Direct Solution of Optimal Control Problems,” 9th IFAC World Congress, Budapest 1984.
290. D. B. Leineweber, H. G. Bock, J. P. Schlöder, J. V. Gallitzendörfer, A. Schäfer, P. Jansohn: “A Boundary Value Problem Approach to the Optimization of Chemical Processes Described by DAE Models,” Computers & Chemical Engineering, April 1997 (IWR-Preprint 97-14, Universität Heidelberg, March 1997).
291. J. E. Cuthrell, L. T. Biegler: “On the Optimization of Differential-algebraic Process Systems,” AIChE Journal 33 (1987) 1257–1270.
292. A. Cervantes, L. T. Biegler: “Large-scale DAE Optimization Using Simultaneous Nonlinear Programming Formulations,” AIChE Journal 44 (1998) 1038.
293. P. Tanartkit, L. T. Biegler: “Stable Decomposition for Dynamic Optimization,” Ind. Eng. Chem. Res. 34 (1995) 1253–1266.
294. S. Kameswaran, L. T. Biegler: “Simultaneous Dynamic Optimization Strategies: Recent Advances and Challenges,” Chemical Process Control – 7, to appear 2006.
295. B. van Bloemen Waanders, R. Bartlett, K. Long, P. Boggs, A. Salinger: “Large Scale Non-Linear Programming for PDE Constrained Optimization,” Sandia Technical Report SAND2002-3198, October 2002.
296. G. P. Box, J. S. Hunter, W. G. Hunter: Statistics for Experimenters: Design, Innovation, and Discovery, 2nd ed., John Wiley & Sons, New York 2005.
297. D. C. Baird: Experimentation: An Introduction to Measurement Theory and Experiment Design, 3rd ed., Prentice Hall, Englewood Cliffs, NJ 1995.
298. J. B. Cropley: “Heuristic Approach to Complex Kinetics,” ACS Symp. Ser. 65 (1978) 292–302.
299. S. Lipschutz, J. J. Schiller, Jr.: Schaum’s Outline of Theory and Problems of Introduction to Probability and Statistics, McGraw-Hill, New York 1988.
300. D. S. Moore, G. P. McCabe: Introduction to the Practice of Statistics, 4th ed., Freeman, New York 2003.
301. D. C. Montgomery, G. C. Runger: Applied Statistics and Probability for Engineers, 3rd ed., John Wiley & Sons, New York 2002.
302. D. C. Montgomery, G. C. Runger, N. F. Hubele: Engineering Statistics, 3rd ed., John Wiley & Sons, New York 2004.

303. G. E. P. Box, W. G. Hunter, J. S. Hunter: Statistics for Experimenters, John Wiley & Sons, New York 1978.
304. B. W. Lindgren: Statistical Theory, Macmillan, New York 1962.
305. O. A. Hougen, K. M. Watson, R. A. Ragatz: Chemical Process Principles, 2nd ed., part II, “Thermodynamics,” J. Wiley & Sons, New York 1959.
306. L. T. Biegler, I. E. Grossmann, A. W. Westerberg: Systematic Methods for Chemical Process Design, Prentice-Hall, Englewood Cliffs, NJ 1997.
307. I. E. Grossmann (ed.): Global Optimization in Engineering Design, Kluwer, Dordrecht 1996.
308. J. Kallrath: “Mixed Integer Optimization in the Chemical Process Industry: Experience, Potential and Future,” Trans. I. Chem. E. 78 (2000) Part A, 809–822.
309. J. N. Hooker: Logic-Based Methods for Optimization, John Wiley & Sons, New York 1999.