Intelligent Search of Correlated Alarms for GSM Networks with Model-based Constraints*
Qingguo Zheng Ke Xu Weifeng Lv Shilong Ma National Lab of Software Development Environment Department of Computer Science and Engineer Beijing University of Aeronautics and Astronautics, Beijing 100083 {zqg, kexu, lwf , slma}@nlsde.buaa.edu.cn
to control the process of data mining, make the Abstract process of data mining more focus, and reduce the number of rules being generated, many researchers In order to control the process of data mining and [9,10,11] have studied putting various constraints, focus on the things of interest to us, many kinds of including Boolean expressions, regular expressions, constraints have been added into the algorithms of and aggregation constraints etc., into the methods of data mining. However, discovering the correlated mining association rules. But the analysis of alarm alarms in the alarm database needs deep domain correlation will need deep domain constraints into the constraints. Because the correlated alarms greatly method of data mining, for the rules of alarm depend on the logical and physical architecture of correlation are closely related to the logical and networks. Thus we use the network model as the physical architecture of networks. constraints of algorithms, including Scope constraint, Model-Based Reasoning (MBR) was originally Inter-correlated constraint and Intra-correlated proposed in [12], the principles of which have been constraint, in our proposed algorithm called SMC widely applied in the analysis of alarm correlation (Search with Model-based Constraints). The [1,13,14,15,16,17]. MBR system consists of a experiments show that the SMC algorithm with network model and a function model. In the case of Inter-correlated or Intra-correlated constraint is about telecommunication networks, network model is two times faster than the algorithm with no composed of network topology and network constraints. elements. In this paper, we propose the intelligent search of Keywords correlated alarms with model-based constraints, Alarm correlation, model-based diagnosis, sequence called SMC (Search with Model-based Constraints), mining, fault management, network management which use network configuration model as constraints to discover alarm correlation rules from the alarm 1. Introduction database. We adopt three types of constraints, i.e. Scope constraint, Intra-correlated constraint and Alarm correlation [1] plays an important role in Inter-correlated constraint, according the network isolating the root cause of problems from a stream of model of GSM networks. The experiments in this alarms for networks. Modern telecommunication paper show that the algorithm of Inter-correlated networks generate a massive number of alarms, constraint is faster than the algorithm of which make the analysis of alarm correlation more Intra-correlated constraint. The algorithms of difficult than before. Many researchers look for a Inter-correlated constraint and Intra-correlated new method to solve the problem. constraint are about two times faster than the Mannila et al. [2] first studied the methods of algorithm with no constraints. mining frequent episodes in alarm sequences, which The rest of this paper is organized as follows. In has been applied into TASA system [3]. Subsequently, Section 2 we give the related works in the network many researchers [4,5] have studied adopting data management. In Section 3 we first briefly describe mining methods [2,6,7,8] to make analysis of alarm the architecture of GSM networks and then give the correlation. However, the methods of data mining for network configuration of GSM networks and define alarm correlation will generate too many rules, which the Scope, intra-correlated alarms and inter-correlated still need network administrator to distinguish the alarms. Section 4 gives the definitions of the useful rules from the results of data mining. In order algorithm of searching correlated alarms. In Section 5
* This research was supported by National 973 Project of China Grant No. G1999032709 and No.G1999032701
1 we propose the intelligent search of correlated alarms categories. After that they proposed a Mining with model-based constraints, called SMC (Searching algorithm called CAP, which achieves a maximized based on Model constraints). The mode-based degree of pruning for all categories of constraints. constraints include Scope constraint, Inter-correlated Garofalakis et al. [11] proposed the use of Regular constraint and Intra-correlated constraint. In Section Expressions as a flexible constraint specification tool 5 experiments show that the SMC algorithm with that enables user-controlled focus to be incorporated Inter-correlated or Intra-correlated constraint is about into the pattern mining process, and developed a two times faster than that with no constraints. Finally, family of novel algorithms called SPRIIT (sequential we make conclusions in Section 6. pattern mining with regular expression constraints) for mining frequent sequential patterns under 2. Related work constrains. These constraints above are too general to be fit for the analysis of alarm correlation in the In the past, the principles of model-based reasoning network management. So some deep constraints of [12] have been widely used in network management telecommunications networks are needed to discover [1,13,14,15,16,17]. Fröhlich and Nejdl [13] the correlated alarms. introduced a model-based solution to the problem of Some groups of network management also alarm correlation for cellular phone networks (GSM). studied the layered model of networks [18,19,20]. They proposed an abstract simulation to identify the Malheiros and Marcos Silva [18] presented a cause of the propagation of alarm messages through proposal of a general model for telecommunication the network by comparing the predicted behavior network based on distributing sets of sub-networks with the actual alarm pattern. It is very difficult to into layers, according to functional affinity criteria. build the relevant functional, causal and structural The proposed model uses an acyclic directed graph to semantics of model of GSM networks, because the represent a telecommunication network and is being system of GSM networks are very complex, used in network management applications, especially including the wireless networks and fixed networks, in alarm correlation. Appleby et al. [19] have and GSM networks evolve quickly, which now developed a mode-based event correlation engine, migrate into 2.5 Generation i.e. GPRS. So the SMC called Yemanja, for multi-layer fault diagnosis. algorithm uses the network model as constraints to Yemanja uses the entity-models to encapsulate each discover the correlated alarms from the database. device and conceptual layer. The entity-models Furthermore, the SMC algorithm adopts the concept contain a set of scenarios that embody its problem of GSM networks rather than the network behavior. Barros et al. [20] presented a diagnostic configuration as constraints and so avoids the approach based on a complete representation of the problem of frequent changes of GSM networks. network in its logical and physical levels, which have Sabin et al. [16] proposed a constraint-based been called the network Configuration Model. The model and a problem-solving approach to network Layered model is fit for the changes of networks, fault management. The approach is based on the which is an overview from the whole network. So we concept of constraints which represent relations use the concept network configuration model of the among network components. Subsequently, Sabin et GSM network as the constraints of SMC (Search with al. [17] applied an extension to the constraint Model-based constraints) algorithm. satisfaction paradigm, called composite constraint satisfaction, to facilitate modeling of complex 3. Problem definition systems and demonstrate the applicability of their approach on an example of a basic groupware service, 3.1 The GSM networks namely, distributed database replication. The PSTN/ISDN constraints of mode-based algorithms are only MSC obtained from the expert, which is a bottleneck to the constraints-based model methods. MSC MSC MSC Switched Network Many researchers of data mining also studied A Interface adding some constraints to the algorithms of data BSC BSC BSC mining [9,10,11]. Srikant et al. [9] first studied the Access problem of discovering association rules in the BTS BTS BTS Network present of constraints that are Boolean expressions over the presence of absence of items. Integrating Um Interface such constraints as a post-processing step can greatly reduce the execution time. Raymond et al. [10] Mobile introduced and analyzed two properties: Station anti-monotonicity and succinctness, and developed characterizations of various constraints into four Figure 1. GSM Network
2 GSM is a second-generation, digital, TDMA-based, 3.2 GSM Network Configuration model cellular stand with extensive coverage in Europe, The concept of GSM networks topology can be including seamless coverage across European considered as a tree. The first level of tree i.e. the root countries. Now GSM has been widely used in over 50 of tree is PLMN (Public land Mobile Network). The countries in the world. second level of tree is MSC (Mobile Service A GSM network is composed of several Switching Center) level, which corresponds to the functional entities, with defined functions and Switched network in Figure 1. The third level of tree interfaces [24]. Figure 1 shows the layout of a generic is BSC (Base Station Controller). The fourth level of GSM network. The GSM network can be divided into tree is BTS level. The adjacent levels are connected three broad parts: Mobile Station, Access Network by Data Link Channel, descried in Figure 2. and Switched Network.. We can describe the concept of GSM networks The Mobile Station is carried by the subscriber. topology illustrated in Figure 3 by 2-tuples: The Base Station Subsystem controls the radio
Reverse CC
Forward CC MSC MSC
Reverse VC BSS
MSC
Mobile Mobile Forward VC BSC BSC
Data Link BTS BTS (9.6 Kbps)
Figure 2. Mobile/BAS/MSC interconnection Figure 4. The structure of GSM
3 If the correlated alarms come from the same class then Qi=(( ek1), ti)=( < ek1, ti>)= Ek1. of network element, then the degree of correlation 2.) The length of alarm tuple is among alarms is high. The alarms that are tightly |Qi|=|( ek1, ek2, ……, ekm)|=m, which is the correlated may indicate the mutual effect of number of the alarm types contained in the components of network facility. If the correlated alarm tuple. alarms come from the different class of network Definition 4. An alarm queue and its length elements, then the degree of correlation among 1.) An alarm queue is defined as Sij=
Qi=( ( ek1, ek2, …… , ekm), ti ), i, m=1,2,3,…; ∆ t k1,k2,…,km=1,2,3,… , where ‘ek1, ek2,……, e , e ,…,e ⇒ e , e ,…, e i1 i2 ij ik ik+1 im ekm’ are the alarm types which concurrently [conf=q%, supp=p% ,W ] k occur at the time ti . An alarm tuple can be represented by an alarm event i.e. Q =(< e , t >,< After the alarm types ei1, ei2,…,eij occur, in the i k1 i interval of ∆t, the probability of alarm type sequence ek2, ti> ,……,
4 4. Intelligent Search Algorithm After the above analysis, we propose the with Model-based Constraints intelligent search of correlated alarms with model-based constraint, called SMC (Search with The different classes of alarm correlation often have Model-based Constraints),which consists of different support values in the alarm database. Some Algorithm 1, Algorithm 2, Algorithm 3 and may be high, while others may be low. In many cases Algorithm 4. we only have interest in one class of alarm The Algorithm 1 is the main part of the SMC correlation. If we don’t add the constraints to filter algorithm. Algorithm 1 is based on the framework of out these uninteresting classes of alarm correlation, the algorithms [6,7,8] and composed of Algorithm 2 the number of candidates in the algorithm will be and Algorithm 3. The Algorithm 2 is the very large. Thus the computing time of data mining Robust_search algorithm in [23], which search the will grow exponentially. But if we add the constraints times of an alarm type sequence occurring in the to filter out these uninteresting correlated alarms, we given alarm queue. Algorithm 3 generates the can greatly enhance the performance of mining frequent alarm type sequences F_ALARMm+1 from correlated alarms. the candidate alarm sequences Cm. As mentioned in Section 3.2, the correlated The constraints, i.e. Scope constraint, alarms in the GSM network are divided into two Inter-correlated constraint and Intra-correlated groups: inter-correlated alarms and intra-correlated constraint are in line 8 of Algorithm 1. We use these alarms. Because the inter-correlated alarms are constraints to prune the candidates that don’t satisfy loosely correlated, the support of inter-correlated the constraints mentioned above. Thus, we can alarms may be low. For the reason that the reduce the number of frequent alarm type sequences Intra-correlated alarms are tightly correlated, the and so reduce the computing time of the algorithm. values of support of Inter-correlated alarms may be By use of the constraints, we could control the high. If we have interest in inter-correlated alarms in process of mining alarms and focus on mining the the GSM networks, we can use the Inter-correlated sequences, which are interesting to us. constraint in the algorithm and don’t need to worry The Algorithm 4 generates the correlated rules about the computing time of data mining. It is the from the frequent alarms. The interesting measure of same thing for Intra-correlated constraint. correlation rules in the frequent pattern mining was Furthermore, we also use the Scope as the constraint first studied by Brin et al.[21]. Later K. M. Ahmed et of the algorithm to discover correlated alarms in the al. [22] improved the interesting measure that was specified part of GSM networks. presented by Brin et al. we use the interesting measure i.e. |P(XY)/P(X)-(Y)| of correlation rule The Constraints definition based on the X⇒Y in [22] as the one in Algorithm 4. configuration model of GSM 1.) Scope constraint: the area of the networks Algorithm 1
under the specified network element. Given the Input: alarm queue (Sij, Wk)
specified object_class i.e. sp_class, an alarm Output: t frequent alarm sequence set: F_ ALARMm sequence set Cm and an alarm sequence ∈ . seqm=
belongs to the different classes of network 5. For all α∈Cm , Search alarm queue Sij to find support(α, Wk); elements. Given an alarm sequence set Cm and /*Algorithm 2 */
an alarm sequence seqm=
ey.object_class) }. 7. Generate Candidate Cm+1 from F_ALARMm; /* Algorithm 3 */ 3.) Intra-correlated constraint: the alarms in the 8. m=m+1; correlation rules must belong to the same class 9. end.
of network element i.e. BTS or BSC or MSC. 10. for all m , output F_ALARMm; Given an alarm sequence set Cm and an alarm sequence seq =
5 Alarm_sequence_length=2 750