Energy Conversion and Management 49 (2008) 3654–3665

Multiple faults diagnosis for sensors in air handling unit using Fisher discriminant analysis

Zhimin Du *, Xinqiao Jin

School of Mechanical Engineering, Shanghai Jiao Tong University, 800, Dongchuan Road, Shanghai 200240, China

Article history: Received 26 November 2007; Accepted 29 June 2008; Available online 15 August 2008

Keywords: Multiple faults; Principal component analysis; Fisher discriminant analysis; Air handling unit; Detection and diagnosis

Abstract

This paper presents a data-driven method based on principal component analysis (PCA) and Fisher discriminant analysis (FDA) to detect and diagnose multiple faults in air handling units, including fixed bias, drifting bias and complete failure of sensors, a stuck air damper and a stuck water valve. Multi-level strategies are developed to improve the diagnosis efficiency. Firstly, the system-level PCA model I, based on the energy balance, is used to detect abnormity at the system level. Then the local-level PCA models A and B, based on the supply air temperature and outdoor air flow rate control loops, are used to further detect the occurrence of faults and pre-diagnose them into various locations. Fisher discriminant analysis, a linear dimensionality reduction technique, is then used to diagnose the fault source after pre-diagnosis. With the Fisher transformation, all of the data classes, including normal and faulty operation, can be re-arrayed in a transformed data space and, as a result, separated. Comparing the Mahalanobis distances (MDs) of all the candidates, the one with the least MD can be identified as the fault source. © 2008 Elsevier Ltd. All rights reserved.

* Corresponding author. Tel./fax: +86 21 34206774. E-mail addresses: [email protected] (Z. Du), [email protected] (X. Jin).

1. Introduction

To satisfy the increasing demand for indoor air quality (IAQ) and energy conservation, the optimal control strategies of the air handling unit (AHU) are becoming more and more complex. In a system composed of the AHU and its affiliated facilities, the measurements of the temperature, pressure and flow rate sensors not only indicate the operation condition but also play an essential role in the different feedback control loops. Without accurate sensor measurements, the controllers may be misled and take incorrect actions. Consequently, performance degradation, component damage, wasted energy and decreased IAQ may result. In a system used over the long term, one or more sensor faults, including complete failure, fixed bias, drifting bias and precision degradation, will inevitably occur. The complete failure of a sensor may lead to faulty or dangerous controller actions, shorten the life of the facility or even damage it. A fixed or drifting bias may decrease the control efficiency of the controller, invalidating the advanced optimal strategies. Therefore, it is necessary to develop suitable methods to detect and diagnose sensor faults in the AHU system. Recently, the study of sensor fault detection and diagnosis (FDD) has indeed become more active.

Based on Annex25 [1] and Annex34 [2], many studies [3–13] concerning various faults of facilities and sensors in heating, ventilation and air conditioning (HVAC) systems have developed fault detection and diagnosis methods or strategies that can be summarized into two classes: the model-based method and the data-driven method.

The model-based method is the most widely used and developed. Stylianou and Nikanour [14] used a first-order model to detect faults of temperature sensors by comparing the actual temperature decay with the model output using hypothesis testing. Wang and Wang [15] developed a model-based sensor fault diagnosis strategy that took all the commonly used temperature and flow rate sensors in a chilling plant into account at the same time. The model-based method is efficient at detecting the complete failure of sensors by monitoring and analyzing the change of operation condition after an abrupt change of sensor measurement. It is insensitive, however, to the fixed and drifting bias of sensors, because these kinds of faults result not in an abrupt hard fault but in a slow degradation of the operation condition and control efficiency.

The data-driven approaches, such as the methods in [16–19], were presented for HVAC systems recently. With the process data collected from both normal and abnormal conditions, the correlation among variables can be analysed. Accordingly, the intrinsic relationship among those variables can be obtained. This relationship actually reflects the corresponding physical models, which are usually difficult to build for HVAC systems. Obviously, the data-driven method relies heavily on the quantity and quality of the data obtained. Fortunately, with the popularity of building automatic (BA) and energy management and control systems (EMCS), the operation data, including measurements and control signals, can be collected easily. Wang and Xiao [16]

0196-8904/$ - see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.enconman.2008.06.032

Nomenclature

x      measure vector
L      loading matrix
P      model projection matrix
G      data class
Sw     within-class scatter matrix
Sb     between-class scatter matrix
u      Fisher optimal discriminant direction
μ      vector
y      discriminant function
T      temperature (°C)
M      flow rate (kg/s)
h      relative humidity (%)
CX     control variable
RX     related variable in a control loop
C      control signal
PCA    principal component analysis
SPE    square prediction error
FDA    Fisher discriminant analysis
MD     Mahalanobis distance
AHU    air handling unit
IAQ    indoor air quality
HVAC   heating, ventilation and air conditioning
FDD    fault detection and diagnosis

Greek symbols
Qα     threshold for SPE
λ      eigen value
μ      eigen vector

Subscripts and superscripts
^      modelled part of a vector
~      un-modelled part of a vector
–      mean of samples
k      number of the classes
set    set point
sup    supply air
fre    outdoor air
rtn    return air
w      water
ws     supply water
wr     return water

presented principal component analysis (PCA) to detect single sensor faults occurring in the air handling unit. Subsequently, PCA-based strategies were applied in variable air volume (VAV) [17,20] and centrifugal chilling systems [18,21]. As a statistical method, a PCA-based strategy can quickly discover an abnormity in the system after learning the normal condition. However, its isolation ability is unsatisfactory, although contribution plots [16], joint angle analysis [20] and knowledge-based analysis [22] have been incorporated. Also a statistical method, Fisher discriminant analysis (FDA) was originally widely used in pattern classification [23]. It is a linear dimensionality reduction technique that can optimize the separation among different data classes. Because of this characteristic, it can be used to isolate different fault classes so as to diagnose the fault source. Although it is a promising diagnosis method, FDA has not yet been used in the HVAC field. Moreover, the multiple faults issue, which was seldom addressed in past studies, needs more attention, since the occurrence of several faults is inevitable in a system after long-term operation.

Therefore, FDA incorporated with multi-level strategies is presented in this paper to diagnose multiple faults in the AHU after they are detected using the PCA method. Firstly, multiple PCA models, including system-level and local-level models, are used to detect whether there is any abnormity at the different levels. Multiple detection models can not only confirm the occurrence of faults but also pre-diagnose them into related locations. In addition, FDA is developed to diagnose the corresponding fault source in the different local control loops. With a series of Fisher transformations, FDA can separate different fault classes optimally by maximizing the scatter between classes while minimizing the scatter within classes. The faulty sensor can then be isolated by comparing the Mahalanobis distance (MD) for all of the candidates.

2. Fault diagnostics methodology

2.1. Overview of principal component analysis

According to the PCA method [24–28], a measurement matrix x, which describes a series of operation conditions of the system, can be decomposed into two orthogonal parts,

$$x = \hat{x} + \tilde{x} \qquad (1)$$

where

$$\hat{x} = P L^{T} \qquad (2)$$

is the modelled part that represents the projection on the principal component subspace showing the normal conditions, while

$$\tilde{x} = \tilde{P} \tilde{L}^{T} \qquad (3)$$

is the un-modelled part on the residual subspace indicating the abnormities or faults. With this decomposition, the measurement space can be divided into two orthogonal subspaces: the principal component subspace and the residual subspace. The former captures normal data variation, while the latter captures abnormal variation or noise. Under normal operation conditions, most of the projection of x lies in the principal component subspace, while the projection on the residual subspace is very small. When a fault occurs, however, the projection on the residual subspace can increase greatly.

Normal and faulty operation can be distinguished using the squared prediction error (SPE). If Eq. (4) is satisfied, it indicates normal operation. Otherwise, it indicates faulty operation or abnormity.

$$SPE(x) = \|\tilde{x}\|^{2} \le Q_{\alpha} \qquad (4)$$

where $Q_{\alpha}$ denotes a confidence limit or threshold for the SPE.

2.2. Geometric interpretation of Fisher discriminant analysis

Fisher discriminant analysis [29,30] is a linear dimensionality reduction technique, optimal in terms of maximizing the separation among different classes. Through a series of linear transformations, the FDA technique can maximize the scatter between the classes and minimize the scatter within the classes. Consequently, the various classes can be re-arrayed and separated in the transformed data space. This property of FDA can be used to isolate the fault source.

2.2.1. Fisher discriminant analysis

All the operation data from the measurement system, including normal operation and faulty operation, can be classified into different data classes

$$G_i:\; G_1, \ldots, G_k \quad (i = 1, \ldots, k) \qquad (5)$$

where $G_1$ refers to the normal operation data class and $G_2, \ldots, G_k$ refer to the various faulty data classes.

With the $n_i$ rows of samples from class $G_i$, the m-dimensional sample mean of class i, $\bar{x}_i$, can be denoted as

$$\bar{x}_i = \frac{1}{n_i} \sum_{x \in G_i} x \qquad (6)$$

The mean of all the samples is

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{k} \sum_{x \in G_i} x \qquad (7)$$

where n is the total number of samples. The within-class scatter matrix can then be given by

$$S_w = \sum_{i=1}^{k} \sum_{x \in G_i} (x - \bar{x}_i)(x - \bar{x}_i)^{T} \qquad (8)$$

and the between-class scatter matrix can be given by

$$S_b = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})^{T} \qquad (9)$$

Therefore the optimal discriminant direction is obtained by maximizing the Fisher criterion:

$$J(u) = \frac{u^{T} S_b u}{u^{T} S_w u} \qquad (10)$$

where u is the Fisher optimal discriminant direction, which maximizes the between-class scatter while minimizing the within-class scatter. If $S_w$ is nonsingular, the optimization problem can be transformed into a conventional eigenvalue problem by writing

$$S_w^{-1} S_b u = \lambda u \qquad (11)$$

where $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_r)$ are the eigenvalues of $S_w^{-1} S_b$, and the corresponding eigenvectors can be denoted as $\mu = (\mu_1, \mu_2, \ldots, \mu_r)$.

So the discriminant function can be given by

$$y = \mu^{T} x \qquad (12)$$

With the Fisher transformation shown in Fig. 1, the various classes from the measurement space can be re-arrayed and separated in another space.

2.2.2. Mahalanobis distance

With the discriminant function, the MD can be used to identify which class $G_i$ the new measured data belong to. If a new sample is denoted as $x = (x_1, x_2, \ldots, x_m)^{T}$, then

$$MD_i = (x - \mu_i)^{T} \Sigma^{-1} (x - \mu_i) \qquad (13)$$

where $\mu_i = (\mu_{i1}, \mu_{i2}, \ldots, \mu_{im})^{T}$ is the mean vector of $G_i$.

Therefore, the fault diagnosis problem is to compare the MDs of all the candidates and select the minimum (Fig. 1).

$$\min_{i = 1, \ldots, k} \{MD_i\} \Rightarrow i \qquad (14)$$

If i = 1, the new data belong to $G_1$, indicating normal operation. If i = 2, ..., k, on the other hand, the new data belong to $G_2, \ldots, G_k$, indicating the various fault conditions.

[Figure: classes G1 and G2 in the measurement space (X1, X2) and, after the Fisher transformation, in the transformed space (Y1, Y2); a point with MD1 < MD2 is assigned to class G1.]

Fig. 1. Classes separation based on Fisher transformation.

3. Multi-level FDD models for AHU

3.1. Description of AHU

A typical AHU and its affiliated facilities are shown in Fig. 2. They comprise the air handling unit, supply fan, return fan, air ducts, air dampers, water valve, controllers and sensors. The supply air, a mixture of outdoor air and recycle air, is circulated to the AHU by the variable-speed supply fan and exchanges heat with the chilled water. After being cooled (in summer conditions) by the chilled water, it is circulated to the terminals to meet the indoor requirement. Finally, with the variable-speed return fan, the return air is divided into exhaust air, which is discharged outside, and recycle air, which is reused in another air cycle.

In addition, four controllers are included to ensure the efficient operation of the AHU. The first is the Tsup controller, which ensures proper supply air temperature by adjusting the chilled water valve located at the inlet of the AHU. The second is the Mfre controller, which maintains the outdoor air requirement by adjusting the outdoor, recycle and exhaust air dampers. The third is the supply air static pressure controller, which ensures air circulation by modulating the speed of the supply fan. The last is the return fan controller, which ensures a certain indoor positive pressure. For these controllers to work normally, the accuracy of the relevant sensors is essential.

[Figure: schematic of the AHU with supply fan, return fan, outdoor, recycle, exhaust, supply and return air paths, VAV terminals, the Tsup and Mfre controllers and further controllers, and flow (F), pressure (P) and temperature (T) sensors.]

Fig. 2. AHU and affiliated facilities.

3.2. Multi-level diagnosis models

3.2.1. System-level model based on energy balance

As the crucial balance in the AHU, the energy balance combining each part of the system represents not only the air mixing process but also the heat exchange process between the air and the water in the air handling unit. It describes the underlying relations among the variables, which can be expressed as a function (Eq. (15)) of Tfre, Tsup, Trtn, Tws, Twr, Mfre, Msup, Mrtn, Mw, hfre and hrtn [20].

$$\text{BalanceEnergy} = F(T_{fre}, T_{sup}, T_{rtn}, T_{ws}, T_{wr}, M_{fre}, M_{sup}, M_{rtn}, M_{w}, h_{fre}, h_{rtn}) \qquad (15)$$

where T refers to temperature, M refers to flow rate, h refers to relative humidity, and the subscripts fre, sup and rtn refer to outdoor, supply and return air, respectively.

In other words, these measured variables are related to one another through the energy balance of the AHU. Therefore, the system-level PCA model I can be set up using these eleven measured variables. The corresponding training matrix is denoted as

$$I = \begin{bmatrix}
T_{sup}^{1} & M_{w}^{1} & T_{ws}^{1} & T_{wr}^{1} & M_{fre}^{1} & M_{sup}^{1} & M_{rtn}^{1} & T_{fre}^{1} & T_{rtn}^{1} & h_{fre}^{1} & h_{rtn}^{1} \\
T_{sup}^{2} & M_{w}^{2} & T_{ws}^{2} & T_{wr}^{2} & M_{fre}^{2} & M_{sup}^{2} & M_{rtn}^{2} & T_{fre}^{2} & T_{rtn}^{2} & h_{fre}^{2} & h_{rtn}^{2} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
T_{sup}^{n} & M_{w}^{n} & T_{ws}^{n} & T_{wr}^{n} & M_{fre}^{n} & M_{sup}^{n} & M_{rtn}^{n} & T_{fre}^{n} & T_{rtn}^{n} & h_{fre}^{n} & h_{rtn}^{n}
\end{bmatrix}_{n \times 11} \qquad (16)$$

3.2.2. Local-level models based on controllers

Besides the energy balance, the local control loops in the system can be used to build the local-level models. Fig. 3 shows a typical feedback control loop in the AHU that includes sensors, controller, actuator and facility. Comparing the measurement of the control variable (CX) from the sensor with its setpoint (CXset), the proportional–integral–differential (PID) controller calculates the difference and gives the control command (CCX) to the actuator. With this series of control signals, the actuator gradually modulates and adjusts the facility to maintain the control variable (CX) at the setpoint.

The main relevant variables in this loop include the control variable CX, its setpoint CXset, the control signal CCX, and the related variables RXi (i = 1, ..., k). Since these variables are strongly related through the feedback control loop, the local-level PCA models can be built and trained using the measurements of these sensors. Therefore, the training matrix of the local-level PCA models based on control loops can be illustrated as

$$L = \begin{bmatrix}
CX^{1} & RX_{1}^{1} & RX_{2}^{1} & \cdots & RX_{k}^{1} & C_{CX}^{1} \\
CX^{2} & RX_{1}^{2} & RX_{2}^{2} & \cdots & RX_{k}^{2} & C_{CX}^{2} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
CX^{n} & RX_{1}^{n} & RX_{2}^{n} & \cdots & RX_{k}^{n} & C_{CX}^{n}
\end{bmatrix}_{n \times (k+2)} \qquad (17)$$

As to the supply air temperature control loop, Tsup, Tsup,set, Cw (water valve control signal), Mw (chilled water flow rate), Tws (supply water temperature) and Twr (return water temperature) are the relevant variables. Consequently, the training matrix of the local-level detection model based on the Tsup control loop can be built as

$$A = \begin{bmatrix}
T_{sup}^{1} & M_{w}^{1} & T_{ws}^{1} & T_{wr}^{1} & T_{sup,set}^{1} & C_{w}^{1} \\
T_{sup}^{2} & M_{w}^{2} & T_{ws}^{2} & T_{wr}^{2} & T_{sup,set}^{2} & C_{w}^{2} \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
T_{sup}^{n} & M_{w}^{n} & T_{ws}^{n} & T_{wr}^{n} & T_{sup,set}^{n} & C_{w}^{n}
\end{bmatrix}_{n \times 6} \qquad (18)$$

Similarly, for the outdoor air flow rate control loop, since Mfre, Mfre,set, Cfre (outdoor air damper control signal), Msup and Mrtn are relevant,

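[Figure: block diagram of a feedback control loop: the setpoint CXset and the sensor measurement CX enter the controller, which sends the control signal CCX to the actuator acting on the facility; related variables RX1 … RXk are measured by further sensors; possible fault locations are marked on the loop.]

Fig. 3. Typical feedback control loop.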

the training matrix of the local-level detection model based on the Mfre control loop can be described as

$$B = \begin{bmatrix}
M_{fre}^{1} & M_{sup}^{1} & M_{rtn}^{1} & M_{fre,set}^{1} & C_{fre}^{1} \\
M_{fre}^{2} & M_{sup}^{2} & M_{rtn}^{2} & M_{fre,set}^{2} & C_{fre}^{2} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
M_{fre}^{n} & M_{sup}^{n} & M_{rtn}^{n} & M_{fre,set}^{n} & C_{fre}^{n}
\end{bmatrix}_{n \times 5} \qquad (19)$$

Therefore, the multiple models developed, comprising the system-level and local-level models, can be used to detect multiple faults occurring in the AHU. As initial detection, the system-level model is used to discover faults or abnormities at the level of the whole system, while the two local-level models are used not only to confirm the occurrence of faults but also to pre-diagnose the faults into different locations so as to improve the diagnosing process.

3.3. Strategies of multiple faults detection and diagnosis

The logic and strategies of detection and diagnosis for multiple faults in the AHU are illustrated in Fig. 4.

Firstly, multi-level PCA models, which include one system-level model and two local-level models, are built through the following steps:

(1) After being divided into three groups according to matrices I, A and B, the historical normal operation data are scaled to zero mean and unit variance;
(2) With the number of principal components optimized, the correlation matrices of models I, A and B are obtained under normal operation. With the eigenvalues and eigenvectors calculated, PCA models I, A and B are set up by partitioning the measurements into two orthogonal subspaces: the principal component subspace and the residual subspace.

With the new measurements, the multiple models are used to detect whether there is any abnormity in the system by comparing their SPE values with the thresholds. If any abnormity is discovered, simple rules are used to pre-diagnose the fault source. Because of the multi-level models, four simple rules (YA & YB, YA & NB, NA & YB and NA & NB) are summarized in Fig. 4. With these rules, the fault sources can be pre-isolated into the different local controllers (Tsup or Mfre controller).

Secondly, for the different locations (based on model A or B), the different fault classes are optimally separated in a new data space through the Fisher transformation to improve further diagnosis:

Fig. 4. Multiple faults detection and diagnosis logic.

(1) With historical data, the different classes are identified, each referring to a different faulty operation condition of the system. The within-class scatter matrix and the between-class scatter matrix are both obtained.
(2) In order to maximize the between-class scatter while minimizing the within-class scatter, the eigenvalues and eigenvectors of $S_w^{-1} S_b$ are calculated.
(3) With the eigenvalues and eigenvectors optimally selected, the Fisher discriminant function is obtained and the corresponding Fisher transformation is carried out.

Finally, the FDA-based Mahalanobis distance is used to isolate the faulty sensor, since the various fault classes have been separated in the new transformed space. A series of MDs for all of the candidate sources is calculated and compared. The sensor with the least MD is identified as the fault source.

4. Cases and validation

The detection and diagnosis strategies for multiple faults in the AHU are tested in the simulator developed in [31]. Typical weather data are preprocessed and used to build the training models. A fault generator, which can generate different kinds of faults, has been incorporated into the simulator of the system. Multiple faults, including fixed bias, drifting bias and complete failure of sensors, a stuck air damper and a stuck water valve, are tested to validate the detection and diagnosis efficiency.

4.1. Complete failure of sensors

Case 1. Tsup and Mfre sensors completely failed at 12PM and 1PM respectively;

As to the complete failure of sensors, firstly, the abnormities can be discovered through system-level detection based on PCA model I because the SPE values exceed the threshold (3.1713) in Fig. 5a. Moreover, local-level detection with PCA model A confirms the occurrence of a fault in the Tsup control loop by comparing the SPE values with its threshold (1.1418) in Fig. 5b. Local-level detection with PCA model B likewise confirms that a fault occurred in the Mfre control loop by making the similar comparison in Fig. 5c.

4.2. Damper/valve faults

Case 2. Outdoor air damper stuck closed at 12PM and coil water valve stuck at 40% at 10AM;

[Figure: SPE versus time (9:00 AM to 4:00 PM) for two fault scenarios: cooling coil valve stuck at 40% at 10AM with outdoor air damper stuck at close at 12PM, and complete failure of Tsup (12PM) and Mfre (1PM). Thresholds as labelled in the panels: 3.1713 (a), 1.1418 (b), 0.0780 (c).]

Fig. 5. Fault detection for case 1 and 2 (a) fault detection based on PCA model I, (b) fault detection based on PCA model A and (c) fault detection based on PCA model B.

When the outdoor air damper and coil water valve are both stuck, the system-level detection module can quickly discover the abnormity at the system level (Fig. 5a). Subsequently, the local-level detection modules can separately confirm the faults in the corresponding loops (Fig. 5b and c).

The results of Cases 1 and 2 validate the detection efficiency of the multi-level strategies presented in this paper for hard faults. The detection and diagnosis of fixed and drifting biases of sensors are discussed in the following sections.

4.3. Fixed biases of sensors

4.3.1. Single fault

Case 3. Twr sensor biased with 7% at 12PM

Case 4. Msup sensor biased with -20% at 12PM

Single sensor fault detection results for Twr and Msup are illustrated in Fig. 6. Under the normal condition, the SPE values are all less than the corresponding thresholds of the system-level and local-level PCA models (I, A and B), indicating that no fault exists in the system.

When the Twr sensor is biased with 7% at 12PM, the SPE values of the system exceed the threshold of PCA model I after 12PM (Fig. 6a), indicating the abnormity of the system. Furthermore, local-level detection with PCA model A confirms the occurrence of a fault in the Tsup control loop by comparing the SPE values with its threshold in Fig. 6b. On the other hand, local-level detection with PCA model B confirms that no fault occurred in the Mfre control loop by making the similar comparison in Fig. 6c.

As to Case 4, in which the Msup sensor is biased with -20% at 12PM, similar analysis and comparison can be made. By comparing the SPE values with the thresholds of PCA models I, A and B, the conclusion can be drawn that the Tsup control loop was normal while a fault occurred in the Mfre control loop (Fig. 6).

4.3.2. Multiple faults

As to the multiple fixed biases of sensors, Cases 5 and 6 are tested and discussed.

Case 5. Tsup and Mfre sensors biased with 8% and 20% at 12:30PM respectively;

When the Tsup and Mfre sensors are biased with 8% and 20%, respectively, at 12:30PM, system-level detection based on PCA model I indicates the abnormity of the system because the SPE values exceed the threshold (3.1713) after 12:30PM (Fig. 7a). On the other hand, the local-level detection (Fig. 7b, c) based on PCA models A and B shows that faults occurred in both the Tsup and Mfre control loops because the SPE values of the two loops go beyond their thresholds (1.1418 and 2.8197).

[Figure: SPE versus time (9:00 AM to 4:00 PM) in three panels, (a) fault detection based on PCA model I, (b) fault detection based on PCA model A and (c) fault detection based on PCA model B. Curves: Twr sensor biased with 7% at 12PM, Msup sensor biased with -20% at 12PM, and normal operation. Normal operation spans 9AM-12PM and faulty operation 12PM-4PM. Thresholds: 3.1713 (a), 1.1418 (b), 2.8197 (c).]

Fig. 6. Fault detection for Cases 3 and 4.
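The SPE test used in these cases (Eq. (4)), together with the model-building steps of Section 3.3, can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the names `train_pca`, `spe` and `detect` are hypothetical, the data are synthetic, and the threshold is an empirical quantile of the training SPE, which is only an assumed stand-in for the confidence limit Qα.

```python
import numpy as np

def spe(Z, P):
    """Squared prediction error of scaled samples Z (rows) given loadings P."""
    residual = Z - Z @ P @ P.T            # part of Z outside the PC subspace
    return np.sum(residual ** 2, axis=1)

def train_pca(X, n_pc, quantile=0.99):
    """Fit a PCA detection model on normal-operation data X (n samples x m sensors)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                  # scale to zero mean and unit variance
    R = np.corrcoef(Z, rowvar=False)      # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order
    P = eigvecs[:, ::-1][:, :n_pc]        # loadings of the n_pc largest eigenvalues
    # Assumption: an empirical quantile replaces the analytical limit Q_alpha.
    threshold = np.quantile(spe(Z, P), quantile)
    return {"mean": mean, "std": std, "P": P, "threshold": threshold}

def detect(model, X_new):
    """Return SPE values and a boolean flag (True = SPE above threshold)."""
    Z = (np.atleast_2d(X_new) - model["mean"]) / model["std"]
    values = spe(Z, model["P"])
    return values, values > model["threshold"]

# Synthetic stand-in for normal data: 5 "sensors" driven by 2 latent factors.
rng = np.random.default_rng(0)
A = np.array([[1.0, 1.0, 1.0, 0.5, 0.2],
              [0.2, -0.5, 1.0, 1.0, -1.0]])

def normal_data(n):
    return rng.normal(size=(n, 2)) @ A + 0.1 * rng.normal(size=(n, 5))

model = train_pca(normal_data(2000), n_pc=2)
healthy = normal_data(500)
faulty = healthy.copy()
faulty[:, 0] += 5.0                       # hypothetical fixed bias on the first sensor
_, healthy_flags = detect(model, healthy)
_, faulty_flags = detect(model, faulty)
```

In this sketch nearly all healthy samples stay below the threshold, while the biased samples exceed it, because the bias breaks the correlation structure learned from normal operation.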

[Figure: SPE versus time (9:00 AM to 4:00 PM) in three panels, (a) fault detection based on PCA model I, (b) fault detection based on PCA model A and (c) fault detection based on PCA model B, for the Tsup and Mfre sensor biases introduced at 12:30PM. Thresholds: 3.1713 (a), 1.1418 (b), 2.8197 (c).]

Fig. 7. Fault detection for Case 5.
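The pre-diagnosis step that follows detection can be written as a small decision function. The sketch below is an assumed reading of the four rules (YA & YB, YA & NB, NA & YB, NA & NB), in which "Y" means that the SPE of the local model exceeds its threshold. The function name `pre_diagnose` and the example SPE values are hypothetical; the default thresholds are the ones quoted in the text for models I, A and B.

```python
def pre_diagnose(spe_i, spe_a, spe_b,
                 thr_i=3.1713, thr_a=1.1418, thr_b=2.8197):
    """Return (system_abnormal, loops) from the SPE values of models I, A and B."""
    system_abnormal = spe_i > thr_i       # system-level detection (model I)
    loops = []
    if spe_a > thr_a:                     # "YA": model A reports an abnormity
        loops.append("Tsup control loop")
    if spe_b > thr_b:                     # "YB": model B reports an abnormity
        loops.append("Mfre control loop")
    return system_abnormal, loops

# Hypothetical SPE readings resembling the pattern of Case 3 (YA & NB).
case3_like = pre_diagnose(8.0, 5.0, 1.0)
# Hypothetical SPE readings resembling the pattern of Case 5 (YA & YB).
case5_like = pre_diagnose(20.0, 9.0, 9.0)
```

An empty list with `system_abnormal` set to True corresponds to NA & NB. How that rule is resolved is defined in Fig. 4 and is not modelled here.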

Since the multiple faults are pre-isolated into the corresponding local control loops, FDA can be used to diagnose the fault sources. With the Fisher transformation optimizing the discriminant direction, the fault classes in each control loop can be re-arrayed in a transformed data space and thus be separated in this new space. As shown in Figs. 8 and 9, five classes are separated in the Fisher transformation space for the Tsup control loop, while three classes are separated for the Mfre control loop. Finally, the fault sources can be isolated by comparing the MDs of the candidates in each control loop. The Tsup and Mfre sensors can be confirmed to be the fault sources because their MDs are the least in the corresponding loops (Fig. 10).
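The Fisher transformation of Eqs. (8)-(12) can be illustrated with a short NumPy sketch. It is a simplified illustration on synthetic three-class data, not the authors' code: the function names `scatter_matrices` and `fisher_directions` are hypothetical, and Sw is assumed to be nonsingular so that the eigenvalue problem of Eq. (11) can be solved directly.

```python
import numpy as np

def scatter_matrices(classes):
    """Within-class (Sw) and between-class (Sb) scatter, Eqs. (8) and (9)."""
    means = [c.mean(axis=0) for c in classes]
    grand_mean = np.vstack(classes).mean(axis=0)
    m = grand_mean.size
    Sw = np.zeros((m, m))
    Sb = np.zeros((m, m))
    for c, mean in zip(classes, means):
        d = c - mean
        Sw += d.T @ d
        g = (mean - grand_mean).reshape(-1, 1)
        Sb += len(c) * (g @ g.T)
    return Sw, Sb

def fisher_directions(classes, n_dirs):
    """Leading eigenpairs of Sw^{-1} Sb, Eq. (11); Sw assumed nonsingular."""
    Sw, Sb = scatter_matrices(classes)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(eigvals.real)[::-1][:n_dirs]
    return eigvals.real[order], eigvecs.real[:, order]

# Synthetic stand-in for three data classes in a 3-dimensional measurement space.
rng = np.random.default_rng(1)
centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
classes = [rng.normal(c, 0.5, size=(200, 3)) for c in centers]

eigvals, W = fisher_directions(classes, n_dirs=2)
projected = [c @ W for c in classes]      # y = W^T x for every sample, Eq. (12)
```

With k classes, Sb has rank at most k - 1, so at most k - 1 discriminant directions carry separation. The value of the Fisher criterion (Eq. (10)) along each returned direction equals the corresponding eigenvalue.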

[Figure: scatter plot in the Fisher space (Y1, Y2), with a zoomed-in inset, showing five classes G1 to G5: Tsup sensor fault class, Twr sensor fault class, Tws sensor fault class, Mw sensor fault class and normal operation class.]

Fig. 8. Separation of different classes in Tsup control loop.

[Figure: scatter plot in the Fisher space (Y1, Y2) showing three classes G1 to G3: Mfre sensor fault class, Mrtn sensor fault class and Msup sensor fault class.]

Fig. 9. Separation of different classes in Mfre control loop.

[Figure: MD versus time (12:00 PM to 4:00 PM) in two panels, (a) fault diagnosis in Tsup control loop, with candidates normal operation, Tsup, Twr, Tws and Mw, and (b) fault diagnosis in Mfre control loop, with candidates Mfre, Msup and Mrtn.]

Fig. 10. Fault diagnosis for Case 5.
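The minimum-MD decision of Eqs. (13) and (14) can be sketched as follows. The example is illustrative only: the class samples are synthetic points in a two-dimensional transformed space, the names `mahalanobis_sq` and `diagnose` are hypothetical, and each class's own sample covariance is used in Eq. (13), which is an assumption because the text does not specify which covariance matrix is meant.

```python
import numpy as np

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean), Eq. (13)."""
    d = x - mean
    return float(d @ np.linalg.solve(cov, d))

def diagnose(x, class_samples):
    """Pick the class with the least MD, Eq. (14)."""
    distances = {
        name: mahalanobis_sq(x, s.mean(axis=0), np.cov(s, rowvar=False))
        for name, s in class_samples.items()
    }
    return min(distances, key=distances.get), distances

# Synthetic stand-in for candidate classes in a transformed (Y1, Y2) space.
rng = np.random.default_rng(2)
class_samples = {
    "normal operation": rng.normal((0.0, 0.0), 0.1, size=(100, 2)),
    "Tsup sensor fault": rng.normal((1.0, 0.5), 0.1, size=(100, 2)),
    "Twr sensor fault": rng.normal((-1.0, 0.8), 0.1, size=(100, 2)),
}

# A new transformed sample lying close to the "Tsup sensor fault" cluster.
source, distances = diagnose(np.array([0.95, 0.55]), class_samples)
```

Applied at every time step after detection, this decision yields MD curves of the kind plotted in Fig. 10, where the candidate with the lowest curve is reported as the fault source.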

Case 6. Twr and Msup sensors biased with 7% and -20% at 12:30PM respectively;

A similar analysis of detection and diagnosis can be made for Case 6. Firstly, with multi-level detection based on PCA models I, A and B, the faults can be discovered and pre-diagnosed into both the Tsup and Mfre control loops because their SPE values exceed the thresholds (Fig. 11). In addition, the FDA-based diagnosing module can further isolate the fault sources using the transformed Fisher space. Since the MDs of the Twr and Msup sensors are the least in the two control loops (Fig. 12), these two sensors can be identified as the fault sources.

4.4. Drifting biases of sensors

Drifting biases of sensors widely exist in actual systems after long-term operation. The magnitudes of the drifting biases are usually very small at the beginning. With slow drifting, the magnitude may increase gradually until it is large enough to mislead the controller. It is therefore necessary to discuss the drifting bias issue. However, it is inconvenient to track the whole drifting process, since it may last several months or even a year. Consequently, the change rates of the drifting processes are exaggerated in this paper to make the simulation convenient.

Case 7. Tsup and Mfre sensors drifted with 0.01 °C/min and 0.0012 kg/s/min at 10AM respectively;

Case 8. Tsup and Msup sensors drifted with -0.01 °C/min and -0.004 kg/s/min at 11AM respectively;

Two drifting cases are tested to validate the FDD strategies. When the Tsup and Mfre sensors drift with 0.01 °C/min and 0.0012 kg/s/min, respectively, from 10AM, the system-level detection based on PCA model I indicates the abnormity after 12PM because the SPE values exceed the threshold (Fig. 13a). On the other hand, the local-level detection based on PCA models A and B confirms the occurrence of faults in both the Tsup and Mfre control loops (Fig. 13b, c). A similar analysis for Case 8 shows that the two drifting biases can be successfully detected and pre-diagnosed into the two local loops (Fig. 13).

The diagnosis results for Case 7 are shown in Fig. 14. With the Fisher transformation, the MDs of all the candidates are calculated and compared. Because the MDs of the Tsup and Mfre sensors are the least in the corresponding loops, they can be identified as the fault sources. A similar analysis can be made for Case 8.
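The fault types tested in Section 4 can be mimicked on a recorded signal with a small fault-injection helper. This is a hypothetical sketch, not the fault generator of the simulator in [31]. The name `inject_fault`, the constant 16 °C supply air temperature, the reading of a "7%"-style bias as a fraction of the current value, and the modelling of complete failure as a frozen constant output are all assumptions made for illustration.

```python
import numpy as np

def inject_fault(signal, t_min, kind, start_min, magnitude):
    """Return a copy of `signal` with a sensor fault applied from `start_min` on."""
    out = np.asarray(signal, dtype=float).copy()
    active = t_min >= start_min
    if kind == "fixed_bias":          # assumed: magnitude is a fraction of the reading
        out[active] *= 1.0 + magnitude
    elif kind == "drift":             # magnitude in units per minute
        out[active] += magnitude * (t_min[active] - start_min)
    elif kind == "complete_failure":  # assumed: sensor output frozen at `magnitude`
        out[active] = magnitude
    else:
        raise ValueError(f"unknown fault kind: {kind}")
    return out

t_min = np.arange(0, 421)             # minutes after 9:00 AM, up to 4:00 PM
t_sup = np.full(t_min.shape, 16.0)    # hypothetical fault-free Tsup reading (deg C)

# Drift of 0.01 deg C/min starting at 10AM (minute 60), as in Case 7.
drifted = inject_fault(t_sup, t_min, "drift", start_min=60, magnitude=0.01)
# Fixed bias of 8% starting at 12:30PM (minute 210), as in Case 5.
biased = inject_fault(t_sup, t_min, "fixed_bias", start_min=210, magnitude=0.08)

drift_at_noon = drifted[180] - t_sup[180]   # accumulated drift at 12PM
```

With the Case 7 rate, the accumulated Tsup drift at 12PM is 0.01 °C/min × 120 min = 1.2 °C, which gives a sense of the bias magnitude reached by the time the text reports system-level detection.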

Fig. 11. Fault detection for Case 6 (Twr and Msup sensors biased with 7% and -20% at 12:30PM respectively). SPE versus time, 9:00 AM to 4:00 PM: (a) fault detection based on PCA model I (threshold 3.1713); (b) fault detection based on PCA model A (threshold 1.1418); (c) fault detection based on PCA model B (threshold 2.8197).

Fig. 12. Fault diagnosis for Case 6. MD versus time, 12:00 PM to 4:00 PM, for the candidates: normal operation, Tsup, Twr, Tws, Mw, Msup, Mrtn and Mfre.
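The diagnosis step, in which the candidate with the least Mahalanobis distance in the Fisher space is taken as the fault source, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the four-feature synthetic data, the `signatures` dictionary of mean shifts, the helper names `fisher_directions` and `mahalanobis_sq`, the use of a per-class covariance in the projected space, and the averaging of MDs over a window are all hypothetical choices.

```python
import numpy as np

def fisher_directions(X, y, n_dirs):
    """Directions maximizing between-class over within-class scatter."""
    overall_mean = X.mean(axis=0)
    n_vars = X.shape[1]
    Sw = np.zeros((n_vars, n_vars))   # within-class scatter
    Sb = np.zeros((n_vars, n_vars))   # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - overall_mean)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # Whiten with the Cholesky factor of Sw so that the generalized problem
    # Sb w = lambda Sw w becomes an ordinary symmetric eigenproblem.
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw))
    _, V = np.linalg.eigh(L_inv @ Sb @ L_inv.T)   # eigenvalues ascending
    return L_inv.T @ V[:, ::-1][:, :n_dirs]

def mahalanobis_sq(Z, class_Z):
    """Squared Mahalanobis distance of each row of Z to the class sample."""
    mu = class_Z.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(class_Z, rowvar=False))
    diff = Z - mu
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

rng = np.random.default_rng(1)
# Hypothetical fault signatures: each class shifts one of four features.
signatures = {
    "normal": [0.0, 0.0, 0.0, 0.0],
    "Tsup":   [3.0, 0.0, 0.0, 0.0],
    "Twr":    [0.0, 3.0, 0.0, 0.0],
    "Tws":    [0.0, 0.0, 3.0, 0.0],
}

def sample(name, n):
    """Synthetic data for one class: its signature plus Gaussian noise."""
    return np.asarray(signatures[name]) + 0.3 * rng.normal(size=(n, 4))

X = np.vstack([sample(name, 200) for name in signatures])
y = np.repeat(list(signatures), 200)
W = fisher_directions(X, y, n_dirs=len(signatures) - 1)

# New window of faulty data whose source is treated as unknown.
Z_new = sample("Twr", 50) @ W
mean_md = {name: float(np.sqrt(mahalanobis_sq(Z_new, X[y == name] @ W)).mean())
           for name in signatures}
source = min(mean_md, key=mean_md.get)
print(source, mean_md)
```

With class means this far apart relative to the noise, the window drawn from the hypothetical "Twr" class lies closest to that class in the projected space, which mirrors the least-MD rule applied to the candidates in each control loop.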

Fig. 13. Fault detection for Cases 7 and 8. SPE versus time, 9:00 AM to 4:00 PM: (a) fault detection based on PCA model I (threshold 3.1713); (b) fault detection based on PCA model A (threshold 1.1418); (c) fault detection based on PCA model B (threshold 0.0780). Legend: drifting of Tsup (0.01 °C/min) and Mfre (0.0012 kg/s/min) occurring at 10AM; drifting of Tsup (-0.01 °C/min) and Msup (-0.004 kg/s/min) occurring at 11AM.

Fig. 14. Fault diagnosis for Case 7: (a) fault diagnosis in Tsup control loop, (b) fault diagnosis in Mfre control loop. MD versus time, 12:00 PM to 4:00 PM, for the candidates: normal, Tsup, Twr, Tws, Mw, Mfre, Mrtn and Msup.

5. Conclusions

A data-driven method, Fisher discriminant analysis incorporated with principal component analysis, was presented in this paper. Multi-level strategies based on PCA and FDA are developed to detect and diagnose multiple faults, including fixed bias, drifting bias and complete failure of sensors, as well as stuck air dampers and stuck water valves, occurring in air handling units.

The system-level PCA model I is used to judge whether any abnormity has occurred at the system level, while the local-level PCA models A and B can be used to confirm the occurrence of the faults. At the same time, the multi-level detection module can pre-diagnose the multiple faults into local control loops so as to improve the diagnosing process.

With the pre-diagnosis results, moreover, FDA is used to isolate the fault sources one by one in each corresponding control loop. After the Fisher transformation, the classes including normal and faulty operation are re-arrayed in the transformed data space, and as a result they can be separated by maximizing the scatter between classes while minimizing the scatter within classes. Comparing the Mahalanobis distances (MDs) of all the candidates, the fault source can be identified by selecting the least one.

Acknowledgement

This research was supported by China Postdoctoral Science Foundation (No. 20070410180).

References

[1] Hyvarnen J et al. IEA ANNEX 25, building optimization and fault diagnosis source book. Paris: International Energy Agency; 1995.
[2] Dexter L, Pakanen J. Demonstrating automated fault detection and diagnosis methods in real buildings, VTT Building Technology, Finland (ISBN 951-38-5726-3), ANNEX 34 Final Report; 2001, IEA.
[3] Piette MA, Kinney SK, Philip H. Analysis of an information monitoring and diagnostic system to improve building operation. Energy Build 2001;33(8):783–91.
[4] Comstock MC, Braun JE. Development of analysis tools for the evaluation of fault detection and diagnostics in chillers. Report #HL99-20. Purdue University, Ray W. Herrick Laboratories, West Lafayette, IN; 1999.
[5] Peitsman H, Bakker VE. Application of black-box models to HVAC systems for fault detection. ASHRAE Trans 1996;102(2):628–40.
[6] Rossi TM, Braun JE. A statistical rule-based fault detection and diagnostic method for vapor compression air conditioners. Int J Heat Ventil, Air Cond Refrig Res 1997;3(1):19–37.
[7] Yoshida H, Iwami T, Yuzawa H, Suzuki M. Typical faults of air-conditioning systems and fault detection by ARX model and extended kalman filter. ASHRAE Trans 1996;102(1):557–64.
[8] Lee WY, Park C, Kelly GE. Fault detection in an air-handling unit using residual and recursive parameter identification methods. ASHRAE Trans 1996;102(2):528–39.
[9] Ngo D, Dexter AL. A robust model-based approach to diagnosing faults in air-handling units. ASHRAE Trans 1999;105(1):1078–86.
[10] House JM, Vaezi-Nejad H, Whitcomb JM. An expert rules set for fault detection in air handling units/discussion. ASHRAE Trans 2001;107(1):858–71.
[11] Dexter AL, Ngo D. Fault diagnosis in HVAC systems: a multi-step fuzzy model-based approach. Int J HVAC R Res 2001;7(1):83–102.
[12] Lee WY, House JM, Shin DR. Fault diagnosis and temperature sensor recovery for an air-handling unit. ASHRAE Trans 1997;103(1):621–33.
[13] Wang SW, Chen YM. Fault-tolerant control for outdoor ventilation air flow rate in building based on neural network. Build Environ 2002;37(7):691–704.
[14] Stylianou M, Nikanour D. Performance monitoring, fault detection, and diagnosis of reciprocating chillers. ASHRAE Trans 1996;102(1):615–27.
[15] Wang SW, Wang JB. Robust sensor fault diagnosis and validation in HVAC systems. Trans Inst Measur Control 2002;24(3):231–62.
[16] Wang SW, Xiao F. AHU sensor fault diagnosis using principal component analysis method. Energy Build 2004;36:147–160.
[17] Wang SW, Qin JY. Sensor fault detection and validation of VAV terminals in air-conditioning systems. Energy Conv Manag 2005;46(15-16):2482–500.
[18] Wang SW, Cui JT. A robust fault detection and diagnosis strategy for centrifugal chillers. HVAC R Res 2006;12(3):407–28.
[19] Jin XQ, Du ZM. Fault tolerant control of outdoor air and AHU supply air temperature in VAV air conditioning systems using PCA method. Appl Therm Eng 2006;26(11-12):1226–37.
[20] Zhimin Du, Xinqiao Jin. Fault detection and diagnosis based on improved PCA with JAA method in VAV systems. Build Environ 2007;42(9):3221–32.
[21] Zhimin Du, Xinqiao Jin. Detection and diagnosis for sensor fault in HVAC systems. Energy Conv Manag 2007;48(3):693–702.
[22] Xiao F, Wang SW. A diagnostic tool for online sensor health monitoring in air-conditioning systems. Autom Construct 2006;15:489–503.
[23] Hart PE, Duda RO, Stork DG. Pattern classification. 2nd ed. New York: John Wiley & Sons; 2001.
[24] Dunia Ricardo, Joe Qin S. Joint diagnosis of process and sensor faults using principal component analysis. Control Eng Pract 1998;6:457–69.
[25] Misra Manish, Henry Yue H, Joe Qin S. Multivariate process monitoring and fault diagnosis by multi-scale PCA. Comput Chem Eng 2002;26:1281–93.
[26] Jackson JE, Mudholkar GS. Control procedures for residuals associated with principal components analysis. Technometrics 1979;21:341–9.
[27] Jolliffe IT. Principal component analysis. New York: Springer-Verlag; 1986.
[28] Edward J. User's guide to principal components. Wiley; 1991.
[29] Chiang LH, Russell EL, Braatz RD. Fault diagnosis and Fisher discriminant analysis, discriminant partial least squares, and principal component analysis. Chemometric Intell Lab Syst 2000;50:243–52.
[30] Chiang LH, Kotanchek ME, Kordon AK. Fault diagnosis based on Fisher discriminant analysis and support vector machines. Comput Chem Eng 2004;28:1389–401.
[31] Jin XQ. Study on simulation of VAV air-conditioning system and online optimal control. Shanghai: Shanghai Jiaotong University; 1999.